docs/diploma

changeset 360:056a353b9116

a lot of rework (queue-in daemon/no daemon); and much more rework
author meillo@marmaro.de
date Wed, 28 Jan 2009 15:01:35 +0100
parents 959eeedf6eaf
children f75efd59fefd
files thesis/tex/5-Improvements.tex
diffstat 1 files changed, 44 insertions(+), 62 deletions(-)
line diff
     1.1 --- a/thesis/tex/5-Improvements.tex	Wed Jan 28 15:00:43 2009 +0100
     1.2 +++ b/thesis/tex/5-Improvements.tex	Wed Jan 28 15:01:35 2009 +0100
     1.3 @@ -65,7 +65,7 @@
     1.4  
     1.5  These days \NAME{SMTP-AUTH}---defined in \RFC\,2554---is supported by most email clients. If encryption is used, then even insecure authentication methods like \NAME{PLAIN} and \NAME{LOGIN} become secure.
     1.6  
     1.7 -\subsubsection*{SASL}
     1.8 +\subsubsection*{Simple Authentication and Security Layer}
     1.9  
    1.10  \masqmail\ best uses an available \NAME{SASL} library. \name{Cyrus} \NAME{SASL} is used by \postfix\ and \sendmail. It is a complete framework that makes use of existing authentication concepts like \path{/etc/passwd} or \NAME{PAM}. As an advantage, it can be integrated with existing user databases. \name{gsasl} is an alternative. It comes as a library which helps with choosing a method and with generating the appropriate dialog data; the actual transmission of the data and the authentication against some database is left to the programmer. \name{gsasl} is used by \name{msmtp}, for example. It seems best to give both concepts a try and then decide which one to use.
    1.11  
    1.12 @@ -155,13 +155,12 @@
    1.13  
    1.14  \subsection{Design decisions}
    1.15  
    1.16 -This section describes and discusses architectural decision that were made for the new design. To functional requirements is refered to, as they were already identified in chapter \ref{chap:present-and-future}. %fixme: At some points function is of matter too, but it is mostly about architecture.
    1.17 +This section describes and discusses architectural decisions that were made for the new design. Functional requirements are mostly only referred to, as they were already discussed in chapter \ref{chap:present-and-future}.
    1.18  
    1.19  A number of major design ideas lead the development of the new architecture:
    1.20  \begin{enumerate}
    1.21 -	\item compartmentalization throughout
    1.22 -	\item free the internal system from the in and out channels
    1.23 -	\item provide interfaces to add arbitrary protocol handlers afterwards
    1.24 +	\item throughout compartmentalization
    1.25 +	\item free the internal system from the in and out channels; provide interfaces to add arbitrary protocol handlers afterwards
    1.26  	\item have a single point where all mail goes through for scanning
    1.27  	\item concentrate on the mail transfer job; use specialized external programs for other jobs
    1.28  	\item keep it simple, clear, and general
    1.29 @@ -172,14 +171,14 @@
    1.30  
    1.31  \subsubsection*{Incoming channels}
    1.32  
    1.33 -The functional requirements were already discussed as \RF\,1 on page \pageref{rf1}. At least two incoming channels were identified: the \path{sendmail} command for local mail submission and the \SMTP\ daemon for remote connections.
    1.34 +The functional requirements for incoming and outgoing channels were already discussed as \RF\,1 on page \pageref{rf1}. Two required incoming channels were identified: the \path{sendmail} command for local mail submission and the \SMTP\ daemon for remote connections.
    1.35  
    1.36  A bit different is the structure of \name{sendmail X} at that point: Locally submitted messages go to the \SMTP\ daemon, which is the only connection towards the mail queue. %fixme: is it a smtp dialog? or a back door?
    1.37  \person{Finch} proposes a similar approach \cite{finch-sendmail}. He wants the \texttt{sendmail} command to be a simple \SMTP\ client that contacts the \SMTP\ daemon of the \MTA\ just as remote connections do. The advantage here is one single module where all \SMTP\ dialog with submitters is done, and hence one single point to accept or refuse incoming mail. Additionally, the module which puts mail into the queue does not need to be \name{setuid} or \name{setgid} because it is only invoked from the \SMTP\ daemon. The \MTA's architecture would become simpler and common tasks would not be duplicated in modules that do similar jobs.
    1.38  
    1.39 -But merging the input channels in the \SMTP\ daemon makes the \MTA\ heavily dependent on \SMTP. To \qmail\ and \postfix\ new modules to support other ways of message reception may be added without change of other parts of the system. Also the \SMTP\ modules can be removed if it is not needed. And it is better to have more independent modules if each one is simpler then---it makes the modules more complicated if each one needs to implement an \SMTP\ client.
    1.40 +But merging the input channels in the \SMTP\ daemon makes the \MTA\ heavily dependent on \SMTP. In \qmail\ and \postfix, new protocol handlers may be added without changes to other parts of the system. Also, the \SMTP\ module can be removed if it is not needed. It is better to have more independent modules if each one is simpler as a result. The need to implement an \SMTP\ client in each one makes the modules more complicated.
    1.41  
    1.42 -With the increasing need for new protocols in mind, it seems better to have single modules for each incoming channel, although this leads to duplicated acceptance checks. Independent checks in different modules, however, have also the advantage to simply apply different policies. Thus it is possible to run two \SMTP\ modules that listen on different ports; one accessible from the Internet but requires authentication, the other only accessible from the local network but does not require authentication.
    1.43 +With the increasing need for new protocols in mind, it seems better to have single modules for each incoming channel, although this leads to duplicated acceptance checks. Independent checks in different modules, however, have the advantage of simply applying different policies. Thus it is possible to run two \SMTP\ modules that listen on different ports: one accessible from the Internet which requires authentication, the other one only accessible from the local network without authentication.
    1.44  
    1.45  The approach of simple independent modules, one for each incoming channel, should be taken.
    1.46  
    1.47 @@ -191,11 +190,11 @@
    1.48  
    1.49  Outgoing mail is commonly either sent using \SMTP, piped into local commands (for example \path{uucp}), or delivered locally by appending to a mailbox.
    1.50  
    1.51 -Outgoing channels are similar for \qmail, \postfix, and \name{sendmail X}: All of them have a module to send mail using \SMTP, and one for writing into a local mailbox. Local mail delivery is a job that requires root privilege to be able to switch to any user in order to write to his mailbox. Modular \MTA{}s do not need \name{setuid root}, but the local delivery process (or its parent) needs to run as root\footnote{root privilege is actually not a mandatory requirement, but any other approach has some disadvantages, so commonly root privilege is used.}.
    1.52 +Outgoing channels are similar for \qmail, \postfix, and \name{sendmail X}: All of them have a module to send mail using \SMTP\ and one for writing into a local mailbox. Local mail delivery is a job that should have root privilege to be able to switch to any user in order to write to his mailbox. Modular \MTA{}s do not require \name{setuid root}, but the local delivery process (or its parent) should run as root. Root privilege is not a mandatory requirement, but any other approach has some disadvantages; thus root privilege is commonly used.
    1.53  
    1.54 -Local mail delivery should not be done by the \MTA, but by an \NAME{MDA}. This decision was discussed in section \ref{sec:functional-requirements}. This means only an outgoing channel that pipes mail into a local command is required for local delivery.
    1.55 +Local mail delivery should not be done by the \MTA, but by an \NAME{MDA} instead. This decision was discussed in section \ref{sec:functional-requirements}. This means only an outgoing channel that pipes mail into a local command is required for local delivery.
    1.56  
    1.57 -Other outgoing channels, one for each supported protocol, may be designed like it was done in other \MTA{}s.
    1.58 +Other outgoing channels, one for each supported protocol, should be designed like it was done in other \MTA{}s.
    1.59  
    1.60  
    1.61  
    1.62 @@ -211,7 +210,7 @@
    1.63  
    1.64  All of the presented \MTA{}s use the file system to hold the queue; none uses a database to hold it. A database could improve the reliability of the queue through better persistence. This might be a choice for larger \MTA{}s but is not one for \masqmail, which should be kept small and simple. A running database system likely requires far more resources than \masqmail\ itself does. And as the queue's job is more about storing data than running data selection queries, a database does not gain enough to outweigh its costs.
    1.65  
    1.66 -Hence here the choice is having a directory with simple text files in it. This is straight forward, simple, clear, and general \dots\ and thus a good basis for reliability. It is additionally always of advantage if data is stored in the operation system's natural form, which in the case of \unix\ is plain text.
    1.67 +Hence the choice here is a directory with simple text files in it. This is straightforward, simple, clear, and general \dots\ and thus a good basis for reliability. It is additionally always an advantage if data is stored in the operating system's natural form, which in the case of \unix\ is plain text.
    1.68  
    1.69  Robustness for the queue is covered in the next section. %fixme: ist this sentence neccessary? Is it still correct.
    1.70  
    1.71 @@ -219,15 +218,15 @@
    1.72  
    1.73  \subsubsection*{Mail sanitizing}
    1.74  
    1.75 -Mail coming into the system may be may be malformed, lacking headers, or be an attempt to exploit the system. Care must be taken.
    1.76 +Mail coming into the system may be malformed, may lack headers, or may be an attempt to exploit the system. Care must be taken.
    1.77  
    1.78  In \postfix, this is done by the \name{cleanup} module, which invokes \name{rewrite}. The position in the message flow is after the message comes from one of the several incoming channels and before the message is stored into the \name{incoming} queue. \name{cleanup} does a complete check to make the mail header complete and valid.
    1.79  
    1.80 -\qmail\ has the principle of ``don't parse'' which propagates the avoidance of parsing as possible in the system. The reason is that parsing is a highly complex task which often makes code exploitable.
    1.80 +\qmail\ follows the principle of ``don't parse'', which advocates avoiding parsing as much as possible. The reason is that parsing is a highly complex task which often makes code exploitable.
    1.82  
    1.83 -Mail should be stored into the queue as it is in \masqmail's new design. A scanning module should then parse the message with high care. It seems best to use a \name{parser generator} for this work. The parsed data should then be modified if needed and written into a second queue. This approach has several advantages. First, the receiving parts of the system do not bother about content, they simply store it into the queue. Second, one single modules does the parsing and generates new messages that contain only valid data. Third, the sending parts of the system will only work on messages that consist of valid data. Of course it must be ensured that each message passes through the \name{scanning} module, but this is required for spam and malware scanning too.
    1.84 +In \masqmail's new design, mail should be stored into the queue without parsing. A scanning module should then parse the message with great care. It seems best to use a \name{parser generator} for this work. The parsed data should then be modified if needed and written into a second queue. This approach has several advantages. First, the receiving parts of the system do not bother about content; they simply store it into the queue. Second, one single module does the parsing and generates new messages that contain only valid data. Third, the sending parts of the system will only work on messages that consist of valid data. Of course it must be ensured that each message passes through the \name{scanning} module, but this is already required for spam and malware scanning.
    1.85  
    1.86 -The mail body will never get modified, except of removing and adding transfer protocol specific requirements like dot stuffing or special line ending characters.
    1.87 +The mail body will never get modified, except for removing and adding transfer-protocol-specific requirements like dot stuffing or special line ending characters. These translations are only done in receiving and sending modules.
    1.88  
    1.89  \person{Jon Postel}'s robustness principle ``Be liberal in what you accept, and conservative in what you send.'', which can be found in this wording in \RFC\,1122 and in different wordings in numerous \RFC{}s, is respected in the \name{scanning} module. It parses the given input in some liberal way and generates clean output. \person{Raymond}'s \name{Rule of Repair} ``Repair what you can -- but when you must fail, fail noisily and as soon as possible.'' can be applied too. But it is important to repair only obvious problems, because repairing functionality is likely a target of attacks.
    1.90  
    1.91 @@ -242,14 +241,14 @@
    1.92  
    1.93  Aliasing is often handled by expanding the alias and re-injecting the mail into the system. Unfortunately, the mail is then processed twice; additionally, the system has to handle more mail this way. If the new recipient address should be checked for acceptance and all processing done again, then re-injecting it is the best choice. But already accepted messages may get rejected in the second pass because of a replacement address from within the system. This does not seem to be desirable.
    1.94  
    1.95 -Doing the alias expansion in the scanning module appears to be the best solution. Unfortunately a second alias expansion must be made on delivery, because only at that point in time is clear which route is used for the message. This compromise is accepted.
    1.96 +Doing the alias expansion in the scanning module appears to be the best solution. Unfortunately, a second alias expansion must be made on delivery, because only then is it clear which route is used for the message. This compromise is accepted.
    1.97  
    1.98  
    1.99  
   1.100  \subsubsection*{Route management}
   1.101  
   1.102  The online state is only important for the sending modules of the system, thus it should be queried in the \name{queue-out} module which selects ready messages from the \name{outgoing} queue and transfers them to the appropriate sending module. Route-based aliasing, which was described in the last section, %fixme: is this still true?
   1.103 -should to be done in the same go.
   1.104 +should be done in the same go.
   1.105  
   1.106  
   1.107  
   1.108 @@ -271,7 +270,7 @@
   1.109  
   1.110  Authentication should be done within the receiving modules. Similarly, authentication for outgoing connections should be handled by the sending modules. The same applies to encryption: only receiving and sending modules should come in contact with it.
   1.111  
   1.112 -In order to avoid code duplicates, the actual implementation of both functions should be provided by a central source which gets invoked by the various modules.
   1.113 +In order to avoid code duplicates, the actual implementation of both functions should be provided by a central source which is used by the various modules.
   1.114  
   1.115  
   1.116  
   1.117 @@ -283,27 +282,27 @@
   1.118  The two approaches for spam handling were already presented to the reader in section \ref{sec:functional-requirements} as \RF\,8 and \RF\,9. Here they are described in more detail:
   1.119  
   1.120  \begin{enumerate}
   1.121 -\item Refusing spam during the \SMTP\ dialog. This is the way it was meant by the designers of the \SMTP\ protocol. They thought checking the sender and recipient mail addresses would be enough, but as they are forgeable it is not. More and more complex checks need to be done. Checking needs time, but \SMTP\ dialogs time out if it takes too long. Thus only limited time can be used, during the \SMTP\ dialog, for checking if a message seems to be spam. The advantage is that acceptance of bad messages can be simply refused---no responsibility for the message is taken and no further system load is added. See \RFC2505 (especially section 1.5) for detail.
   1.122 +\item Refusing spam during the \SMTP\ dialog. This is the way it was intended by the designers of the \SMTP\ protocol. They thought checking the sender and recipient mail addresses would be enough, but as they are forgeable it is not. More and more complex checks need to be done. Checking needs time, but \SMTP\ dialogs time out if it takes too long. Thus only limited time during the \SMTP\ dialog can be used for checking if a message seems to be spam. The advantage is that bad messages can simply get refused---no responsibility for the message is taken and no further system load is added. See \RFC\,2505 (especially section 1.5) for details.
   1.123  
   1.124 -\item Checking for spam after the mail was accepted and queued. Here more processing time can be invested, so more detailed checks can be done. But, as responsibility for messages was taken by accepting them, it is no choice to simply delete spam mail. Checks for spam do not lead to sure results, they just indicate the possibility the message is unwanted mail. \person{Eisentraut} indicates actions to take after a message is recognized as probably spam \cite[pages 18--20]{eisentraut05}. The only acceptable one, for mail the \MTA\ is responsible for, is adding further or rewriting existent header lines. Thus all further work on the message is the same as for non-spam messages.
   1.125 +\item Checking for spam after the mail was accepted and queued. Here more processing time can be invested, thus more detailed checks can be done. But, as responsibility for messages was taken by accepting them, simply deleting spam mail is not an option. Checks for spam do not lead to sure results; they just indicate the possibility that the message is unwanted mail. \person{Eisentraut} indicates actions to take after a message is recognized as probable spam \cite[pages 18--20]{eisentraut05}. The only acceptable one, for mail the \MTA\ is responsible for, is adding further header lines or rewriting existing ones. Thus all further work on the message is the same as for non-spam messages.
   1.126  \end{enumerate}
   1.127  
   1.128 -Modern \MTA{}s use both techniques in combination. Checks during the \SMTP\ dialog tend to be implemented in the \mta\ to make it fast; checks after the message was queued are often done using external programs (\name{spamassassin} is a well known one). \person{Eisentraut} sees the checks during the \SMTP\ dialog to be essential: ``Ganz ohne Analyse w\"ahrend der \SMTP-Phase kommt sowieso kein \MTA\ aus, und es ist eine Frage der Einsch\"atzung, wie weit man diese Phase belasten m\"ochte.'' \cite[page 25]{eisentraut05} (translated: ``No \MTA\ can go without analysis during the \SMTP\ phase anyway, but the amount of stress one likes to put on this phase is left to his discretion.'')
   1.129 +Modern \MTA{}s use both techniques in combination. Checks during the \SMTP\ dialog tend to be implemented in the \mta\ to make them fast; checks after the message was queued are often done using external programs (\name{spamassassin} is a well known one). \person{Eisentraut} sees the checks during the \SMTP\ dialog to be essential: ``Ganz ohne Analyse w\"ahrend der \SMTP-Phase kommt sowieso kein \MTA\ aus, und es ist eine Frage der Einsch\"atzung, wie weit man diese Phase belasten m\"ochte.'' \cite[page 25]{eisentraut05} (translated: ``No \MTA\ can go without analysis during the \SMTP\ phase anyway, but the amount of stress one likes to put on this phase is left to his discretion.'')
   1.130  
   1.131  Checking before a message is accepted, like \NAME{DNS} blacklists and \name{greylisting}, needs to be invoked from within the receiving modules. Like for authentication and encryption, the implementation of the functionality should be provided by a central source.
   1.132  
   1.133  All checking after the message was queued should be done by pushing the message through external scanners like \name{spamassassin}. The \name{scanning} module is the best place to handle this. Hence this module needs interfaces to external scanners.
   1.134  
   1.135  
   1.136 -Malware scanning is similar like the second type of spam scanning. The \name{amavis} framework is a popular mail scanning framework that includes all kinds of malware and also spam scanners; it communicates by using \SMTP.
   1.137 +Malware scanning is similar to spam scanning of queued messages. The \name{amavis} framework is a popular mail scanning framework that includes all kinds of malware and also spam scanners; it communicates by using \SMTP.
   1.138  
   1.139 -Providing \SMTP\ in and out channels from the \name{scanning} module to external scanner applications seems to be a desired goal. Using further instances of the already available \name{smtp} and \name{smtpd} modules therefore appears to be the best solution.
   1.140 +Providing \SMTP\ in and out channels from the \name{scanning} module to external scanner applications is a desired goal. Using further instances of the already available \name{smtp} and \name{smtpd} modules therefore appears to be the best solution.
   1.141  
   1.142  
   1.143  
   1.144  \subsubsection*{The scanning module}
   1.145  
   1.146 -A problem, which gets probably noticed by a attentive reader, is the lot of work that was put onto the \name{scanning} module. This is not what is desired. Thus splitting this module into a set of single modules appears to be necessary.
   1.147 +A problem which the attentive reader has probably noticed is the large amount of work that was put onto the \name{scanning} module. This is not what is desired. Thus splitting this module into a set of single modules appears to be necessary.
   1.148  
   1.149  The decision how to split shall not be discussed here. It is left up to the time of prototyping, because trying different approaches is good in such situations.
   1.150  
   1.151 @@ -322,7 +321,7 @@
   1.152  
   1.153  \subsection{The resulting architecture}
   1.154  
   1.155 -The result is a symmetric design, featuring the following parts: Any number of handlers for incoming connections to receive mail. A module that stores the received mail into the incoming queue. A central scanning module take mail from the incoming queue, processes it in various ways, and puts it afterwards into the outgoing queue. A module that takes it out of the outgoing queue and passes it to a matching transport module. A set of transport modules that transfers the message to the destination. In other words three main modules (queue-in, scanning, queue-out) are connected by two queues (incoming, outgoing). On each end are more modules to receive or send mail---one for each protocol. The \name{pool} is the place where the bodies of the queued messages are stored. Figure \ref{fig:masqmail-arch-new} depicts the new designed architecture.
   1.156 +The result is a symmetric design, featuring the following parts: Any number of handlers for incoming connections to receive mail. A module that stores the received mail into a first queue. A central scanning module that takes mail from the first queue, processes it in various ways, and afterwards puts it into a second queue. A module that takes it out of the second queue and passes it to a matching transport module. A set of transport modules that transfer the message to the destination. In other words, three main modules (\name{queue-in}, \name{scanning}, \name{queue-out}) are connected by two queues (\name{incoming}, \name{outgoing}). On each end is a set of modules to receive or send mail---one for each protocol. The \name{pool} is part of the queue; it is the place where the bodies of the queued messages are stored. Figure \ref{fig:masqmail-arch-new} depicts the newly designed architecture.
   1.157  
   1.158  \begin{figure}
   1.159  	\begin{center}
   1.160 @@ -335,62 +334,45 @@
   1.161  This architecture is heavily influenced by the ones of \qmail\ and \postfix. Both have different incoming channels that merge in the module that puts mail into the queue; central is the queue (or more of them); and one module takes mail from the queue and passes it to one of the outgoing channels. Mail processing is built into the architecture in a more explicit way than it was done in \qmail\ and \postfix. It is more similar to the \NAME{AR} module of \name{sendmail X}, which is the central point for spam checking.
   1.162  
   1.163  Special regard was put on addable support for further mail transfer protocols. This appears to be most similar to \qmail, which was designed to handle multiple protocols.
   1.164 -%fixme: do i need all this ``quesses''??
   1.165  
   1.166  
   1.167  \subsubsection*{The modules}
   1.168  
   1.169 -The new architecture consists of several modules. They are described in more detail now. First the three main modules, afterwards the modules for incoming and outgoing transfer.
   1.170 +The new architecture consists of several modules which are now described in more detail: first the three main modules, afterwards the modules for incoming and outgoing transfer.
   1.171  
   1.172  
   1.173 -The \name{queue-in} module creates new spool files in the \name{incoming} queue and in the message \name{pool} for incoming messages. It is a process in background, waiting for connections from one of the receiver modules. When one of them is receiving a new message, it connects to the \name{queue-in} module which creates a spool file in the \name{incoming} queue and a message body file in the \name{pool} and returns success. The receiver module then sends the envelope, the message header, and the message body. The first two get written into the spool file by \name{queue-in}, the latter is stored into the \name{pool}. If all went well another positive result is returned.
   1.174 -%fixme: daemon or no daemon?
   1.175 +The \name{queue-in} module creates new spool files for incoming messages in the \name{incoming} queue and stores their bodies into the \name{pool}. When one of the receiving modules has a new message, it invokes the \name{queue-in} module which creates a spool file in the \name{incoming} queue and a data file in the \name{pool} and returns success. The receiver module then sends the envelope, the message header, and the message body. The first two get written into the spool file by \name{queue-in}, the latter is stored into the \name{pool}. If all went well another positive result is returned.
   1.176  
   1.177  
   1.178 -The \name{scanning} module is the central part of the system. It reads spooled messages from the \name{incoming} queue, works on the data, and writes new spool files to the \name{outgoing} queue. Then the message is removed from the \name{incoming} queue. The main job of this module is the processing of the message. Headers are fixed and missing ones are added if necessary, aliasing is done, and external processing of any kind is triggered. The \name{scanning} module can run in background and look for new mail in regular intervals or signals may be sent to it by \name{queue-in}. Alternatively it can be called by \name{cron} to do single runs. The \name{scanning} module work on the spool files primary but may read the mail body from the \name{pool} if necessary.
   1.179 +The \name{scanning} module is the central part of the system. It reads spooled messages from the \name{incoming} queue, works on the data, and writes new spool files to the \name{outgoing} queue. Then the message is removed from the \name{incoming} queue. The main job of this module is the processing of the message. Headers are fixed and missing ones are added if necessary, aliasing is done, and external processing of any kind is triggered. The \name{scanning} module can run in the background and look for new mail at regular intervals, or signals may be sent to it by \name{queue-in}. Alternatively it can be called by \name{cron} to do single runs. The \name{scanning} module primarily processes the spool files but may read the mail body from the \name{pool} if necessary.
   1.180  
   1.181  
   1.182  The \name{queue-out} module takes messages from the \name{outgoing} queue, queries information about the online state which specifies the route to use, and passes the messages to the correct transport module. Successfully transferred messages are removed from the \name{outgoing} queue. This module handles the \masqmail\ specific task of the route management.
   1.183  
   1.184  
   1.185 -\name{Receiver modules} are the communication interface between external senders and the \name{queue-in} module. Each protocol needs a corresponding \name{receiver module} to be supported. Most popular are the \name{sendmail} module which is a command to be called from the local host and the \name{smtpd} module which listens on port 25. Other modules to support other protocols may be added as needed. Receiving modules that need to listen on ports should get invoked by \name{inetd} or a more secure replacement like \person{Bernstein}'s \name{ucspi-tcp}. This makes it possible to run them with least privilege.
   1.186 +\name{Receiver modules} are the communication interface between external senders and the \name{queue-in} module. Each protocol needs a corresponding \name{receiver module} to be supported. Most popular are the \name{sendmail} module, which is a command to be called from the local host, and the \name{smtpd} module, which usually listens on port 25. Other modules to support other protocols may be added as needed. Receiving modules that need to listen on ports should get invoked by \name{inetd} or a more secure replacement like \person{Bernstein}'s \name{ucspi-tcp}. This makes it possible to run them with least privilege.
   1.187  
   1.188 -\name{Transport modules}, on the opposite side of the system, are the modules to send outgoing mail. They are the interface between \name{queue-out} and remote hosts or local commands for further processing. The most popular ones are the \name{smtp} module which acts as the \SMTP\ client and the \name{pipe} module to interface gateways to other systems or networks, like fax or \NAME{UUCP}. A module for local delivery is not included, \masqmail\ passes this job to the \NAME{MDA} (see section \ref{sec:functional-requirements} for reasons). Thus a \name{mail delivery agent} (like \name{procmail}) is to be used with the \name{pipe} module.
   1.189 +\name{Transport modules}, on the opposite side of the system, are the modules that send outgoing mail. They are the interface between \name{queue-out} and remote hosts or local commands for further processing. The most popular ones are the \name{smtp} module, which acts as an \SMTP\ client, and the \name{pipe} module to interface gateways to other systems or networks like fax and \NAME{UUCP}. A module for local delivery is not included; \masqmail\ passes this job to an \NAME{MDA} like \name{procmail} (see section \ref{sec:functional-requirements} for reasons). The \NAME{MDA} gets invoked through the \name{pipe} module.
   1.190  
   1.191  
   1.192  
   1.193  
   1.194  \subsubsection*{The queue}
   1.195  
   1.196 -The queue is actually two queues and a data pool. The queues store the spool files---unprocessed in \name{incoming} and in complete and valid form in \name{outgoing}. The \name{pool} is the storage of data files, the message bodies of queued messages.
   1.197 +The queue is actually two queues and a data pool. The queues store the spool files---unprocessed in \name{incoming} and in complete and valid form in \name{outgoing}. The \name{pool} is the storage of data files, the message bodies of queued messages. The three parts are represented by three directories within the queue path on disk.
   1.198  
   1.199 -Three directories within the queue path contain the queue on disk. Two, named \name{incoming} and \name{outgoing}, for storing the spool files; one, called \name{pool}, to store the data files. The files being part of one message share the same unique name. A queued message is represented by a spool file in \name{incoming} or \name{outgoing} and a data file in the \name{pool}.
   1.200 +The representation of queued messages on disk is basically the same as in current \masqmail: one file for the envelope and message header information (the ``spool file'') and a second file for the message body (the ``data file''). The internal structure of the spool file can remain as it is in current \masqmail. A sample spool file follows. (The first part is the envelope, with comments in parentheses; the second part is the message header.)
   1.201  
   1.202 -The spool file owner's executable bit shows if the file is ready for further processing: The module that writes the file into the queue sets the bit as last action. Modules that read from the queue can process messages that have the bit set. This approach is derived from \postfix.
   1.203 -
   1.204 -The spool file's internal structure can remain the same as the one of current \masqmail.
   1.205 -
   1.206 -The spool file format is basically the same as the one in current \masqmail: one file for the envelope and message header information (it is called ``spool file'' here), a second file for the message body (called ``data file'').
   1.207 -
   1.208 -The data file is stored in a separate data pool. It is written by \name{queue-in}; \name{scanning} can read it if necessary; \name{queue-out} reads it to generate the outgoing message and deletes it after successful transfer. Data files do not change at all within the system. They are written in default local plain text format. Required translation is done in the receiver and transport modules.
   1.209 +\codeinput{input/sample-spool-file.txt}
   1.210  
   1.211  The spool file is written into the \name{incoming} queue. The \name{scanning} module reads it, processes it, and writes a new one into the \name{outgoing} queue; the file in \name{incoming} is then deleted. \name{queue-out} finally takes the spool file from \name{outgoing} and the data file from the \name{pool} to generate the resulting message.
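
The scanning step described above can be sketched as follows. This is a minimal illustration, not the planned implementation: the function name `scan_message`, the `process` callback, and the assumption that a message keeps the same file name in \name{incoming} and \name{outgoing} are hypothetical and chosen only for this example.

```python
import os


def scan_message(queue_dir, name, process):
    """Move one spool file from incoming to outgoing, processing it on the way.

    queue_dir is assumed to contain the directories "incoming" and "outgoing";
    process is a hypothetical callback that turns the raw spool text into its
    complete and valid form.
    """
    src = os.path.join(queue_dir, "incoming", name)
    dst = os.path.join(queue_dir, "outgoing", name)

    # Read the unprocessed spool file.
    with open(src) as f:
        raw = f.read()

    # Write the processed spool file first ...
    with open(dst, "w") as f:
        f.write(process(raw))

    # ... and delete the original only afterwards, so the message
    # exists in at least one of the two queues at any time.
    os.remove(src)
    return dst
```

The data file in the \name{pool} is not touched in this step, which matches the statement that data files do not change within the system.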
   1.212  
   1.213 -%This data flow is shown in figure \ref{fig:queue-data-flow}.
   1.214 -%
   1.215 -%\begin{figure}
   1.216 -%	\begin{center}
   1.217 -%		%\input{img/queue-data-flow.eps}
   1.218 -%	\end{center}
   1.219 -%	\caption{Data flow of messages in the queue}
   1.220 -%	\label{fig:queue-data-flow}
   1.221 -%\end{figure}
   1.222 +The spool file owner's executable bit shows whether a file is ready for further processing: The module that writes the file into the queue sets the bit as its last action. Modules that read from the queue may only process messages that have the bit set. This approach is derived from \postfix.
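
The ready marker can be illustrated with a short sketch. It assumes a \unix\ file system; the function names `write_unfinished`, `mark_ready`, and `is_ready` are hypothetical and not taken from \masqmail\ or \postfix.

```python
import os
import stat


def write_unfinished(path, content):
    """Create a spool file without the owner's executable bit (mode 0600)."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "w") as f:
        f.write(content)
        f.flush()
        os.fsync(f.fileno())  # make sure the content is on disk first


def mark_ready(path):
    """Set the owner's executable bit as the last action of the writer."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode | stat.S_IXUSR)


def is_ready(path):
    """Readers process a spool file only if the owner's executable bit is set."""
    return bool(os.stat(path).st_mode & stat.S_IXUSR)
```

A writer would call `write_unfinished` and then `mark_ready`; a reader that scans the queue skips every file for which `is_ready` returns false. Because the permission change is a single operation on the inode, a reader never sees a half-written file as ready.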
   1.223  
   1.224 -%The \name{incoming} queue stores envelope and the message header of messages received via one of the incoming channels. The data is in unprocessed form. The \name{outgoing} queue contains processed data. The header and envelope information is complete and in valid form. The \name{pool} is the storage of the message bodies of queued messages. This data is not changed within the \MTA, it is written on reception and read on dispatch.
   1.225 +The data file is stored in a separate data pool. It is written by \name{queue-in}; \name{scanning} can read it if necessary; \name{queue-out} reads it to generate the outgoing message and deletes it after successful transfer. Data files do not change at all within the system. They are written in the default local text format. Any required translation is done in the receiver and transport modules.
   1.226  
   1.227  
   1.228 -A sample spool file. With comments in parenthesis.
   1.229 -\codeinput{input/sample-spool-file.txt}
   1.230 +
   1.231  
   1.232  
   1.233  
   1.234 @@ -405,6 +387,14 @@
   1.235  
   1.236  What remains is only the communication between the receiver modules and \name{queue-in}, and between \name{queue-out} and the transport modules. Data is exchanged using \unix\ pipes and a simple protocol. Figure \ref{fig:ipc-protocol} shows a state diagram for the protocol. Solid lines indicate client actions, dashed lines indicate server responses.
   1.237  
   1.238 +\begin{figure}
   1.239 +	\begin{center}
   1.240 +		\includegraphics[scale=0.75]{img/ipc-protocol.eps}
   1.241 +	\end{center}
   1.242 +	\caption{State diagram of the \NAME{IPC} protocol}
   1.243 +	\label{fig:ipc-protocol}
   1.244 +\end{figure}
   1.245 +
   1.246  \paragraph{Timing}
   1.247  One dialog consists of exactly three phases: connection attempt, envelope and header transfer, and transfer of the message body. The order is always the same. The three phases are all initiated by the client process; after each phase the server process sends a success or error reply. Timeouts for each phase need to be implemented.
   1.248  
   1.249 @@ -413,14 +403,6 @@
   1.250  
   1.251  The client indicates the end of each data transfer with a special terminator sequence. The appearance of this terminator sequence tells the server process that the data transfer is complete and makes the server send a reply. By sending a success reply, the server process takes responsibility for the data. A failure reply immediately stops the dialog and resets both client and server to the state before the connection attempt.
   1.252  
   1.253 -\begin{figure}
   1.254 -	\begin{center}
   1.255 -		\includegraphics[scale=0.75]{img/ipc-protocol.eps}
   1.256 -	\end{center}
   1.257 -	\caption{State diagram of the \NAME{IPC} protocol}
   1.258 -	\label{fig:ipc-protocol}
   1.259 -\end{figure}
   1.260 -
   1.261  \paragraph{Syntax}
   1.262  Data transfer is done by sending plain text data. \name{Line Feed}---the native line separator on \unix---is used as line separator. The terminator sequence used to indicate the end of the data transfer is the \NAME{ASCII} \name{null} character (`\texttt{\textbackslash0}'). Replies are one-digit numbers, with `\texttt{0}' meaning success and any other number (`\texttt{1}'--`\texttt{9}') indicating failure.
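
The server side of one dialog might look like the following sketch. Several points are assumptions made only for illustration: the connection attempt carries no payload and is always accepted, the failure digit `1` is an arbitrary pick from `1` to `9`, the only failure condition is a stream that ends before the terminator, and timeouts are left out. The names `read_chunk` and `serve_dialog` are hypothetical.

```python
import io

TERMINATOR = b"\0"  # ASCII null ends each data transfer
SUCCESS = b"0"
FAILURE = b"1"      # any digit 1-9 means failure; 1 is an arbitrary choice


def read_chunk(stream):
    """Read bytes up to the terminator; return None if the stream ends first."""
    data = bytearray()
    while True:
        byte = stream.read(1)
        if not byte:
            return None  # end of stream before the terminator
        if byte == TERMINATOR:
            return bytes(data)
        data += byte


def serve_dialog(inp, out):
    """Handle one dialog; return (spool_data, body_data) or None on failure."""
    out.write(SUCCESS)  # phase 1: accept the connection attempt
    chunks = []
    for _ in range(2):  # phase 2: envelope and header, phase 3: message body
        chunk = read_chunk(inp)
        if chunk is None:
            out.write(FAILURE)  # a failure reply stops the dialog
            return None
        out.write(SUCCESS)      # the server now takes responsibility
        chunks.append(chunk)
    return tuple(chunks)
```

With in-memory streams, a complete dialog can be tried out like this:

```python
inp = io.BytesIO(b"From: a\nTo: b\n\0Hello\n\0")
out = io.BytesIO()
result = serve_dialog(inp, out)
```

Here `result` holds the two transferred parts and `out` contains one reply digit per phase.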
   1.263