docs/diploma

changeset 333:5f416c27e932

rework in the resulting architecture
author meillo@marmaro.de
date Sat, 24 Jan 2009 14:26:59 +0100
parents 4d705f7a956a
children 99e368f07e9a
files thesis/input/sample-spool-file.txt thesis/tex/5-Improvements.tex
diffstat 2 files changed, 44 insertions(+), 48 deletions(-)
line diff
     1.1 --- /dev/null	Thu Jan 01 00:00:00 1970 +0000
     1.2 +++ b/thesis/input/sample-spool-file.txt	Sat Jan 24 14:26:59 2009 +0100
     1.3 @@ -0,0 +1,15 @@
     1.4 +1LGtYh-0ut-00                (backup copy of the file name)
     1.5 +MF:<meillo@dream>            (envelope: sender)
     1.6 +RT: <user@example.org>       (envelope: recipient)
     1.7 +PR:local                     (meta info: protocol)
     1.8 +ID:meillo                    (meta info: id/user/ip)
     1.9 +DS: 18                       (meta info: size)
    1.10 +TR: 1230462707               (meta info: timestamp)
    1.11 +                             (following: headers)
    1.12 +HD:Received: from meillo by dream with local (masqmail 0.2.21) id
    1.13 + 1LGtYh-0ut-00 for <user@example.org>; Sun, 28 Dec 2008 12:11:47 +0100
    1.14 +HD:To: user@example.org
    1.15 +HD:Subject: test mail
    1.16 +HD:From: <meillo@dream>
    1.17 +HD:Date: Sun, 28 Dec 2008 12:11:47 +0100
    1.18 +HD:Message-ID: <1LGtYh-0ut-00@dream>
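
The sample spool file added above can be read with a small parser. What follows is only a sketch under stated assumptions, not masqmail's actual reader: it assumes the parenthesised comments are annotations that are absent from a real spool file, that the first line holds the message id, that header continuation lines start with whitespace, and that only the tags visible in the sample occur. The names `SpoolFile` and `parse_spool` are hypothetical.

```python
from dataclasses import dataclass, field

# The sample from the changeset, with the explanatory comments stripped.
SAMPLE = """\
1LGtYh-0ut-00
MF:<meillo@dream>
RT: <user@example.org>
PR:local
ID:meillo
DS: 18
TR: 1230462707

HD:Received: from meillo by dream with local (masqmail 0.2.21) id
 1LGtYh-0ut-00 for <user@example.org>; Sun, 28 Dec 2008 12:11:47 +0100
HD:To: user@example.org
HD:Subject: test mail
HD:From: <meillo@dream>
HD:Date: Sun, 28 Dec 2008 12:11:47 +0100
HD:Message-ID: <1LGtYh-0ut-00@dream>
"""


@dataclass
class SpoolFile:
    """Envelope, meta information, and headers of one queued message."""
    uid: str = ""
    sender: str = ""
    recipients: list = field(default_factory=list)
    meta: dict = field(default_factory=dict)
    headers: list = field(default_factory=list)


def parse_spool(text):
    """Parse spool file text into a SpoolFile (hypothetical helper)."""
    lines = text.split("\n")
    spool = SpoolFile(uid=lines[0].strip())
    for line in lines[1:]:
        if not line.strip():
            continue  # blank separator line or trailing newline
        if line[0] in " \t":
            # Continuation of the previous header line.
            if spool.headers:
                spool.headers[-1] += "\n" + line
            continue
        tag, _, value = line.partition(":")
        if tag == "MF":
            spool.sender = value.strip()
        elif tag == "RT":
            spool.recipients.append(value.strip())
        elif tag == "HD":
            spool.headers.append(value)
        else:
            # PR, ID, DS, TR and any unknown tag are kept as meta info.
            spool.meta[tag] = value.strip()
    return spool
```

Calling `parse_spool(SAMPLE)` yields one sender, one recipient, four meta entries, and six headers, the first of which spans two lines.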
     2.1 --- a/thesis/tex/5-Improvements.tex	Sat Jan 24 12:32:20 2009 +0100
     2.2 +++ b/thesis/tex/5-Improvements.tex	Sat Jan 24 14:26:59 2009 +0100
     2.3 @@ -328,7 +328,7 @@
     2.4  
     2.5  \subsection{The resulting architecture}
     2.6  
     2.7 -The result is a symmetric design, featuring the following parts: Any number of handlers for incoming connections to receive mail and pass it to the module that stores it into the incoming queue. A central scanning module take mail from the incoming queue, processes it in various ways and puts it afterwards into the outgoing queue. Another module takes it out there and passes it to a matching transport module that transfers it to the destination. In other words, three main modules (queue-in, scanning, queue-out) are connected by the two queues (incoming, outgoing); on each end are more modules to receive and send mail---for each protocol one. Figure \ref{fig:masqmail-arch-new} depicts the new designed architecture.
     2.8 +The result is a symmetric design, featuring the following parts: Any number of handlers for incoming connections to receive mail. A module that stores the received mail into the incoming queue. A central scanning module that takes mail from the incoming queue, processes it in various ways, and afterwards puts it into the outgoing queue. A module that takes it out of the outgoing queue and passes it to a matching transport module. A set of transport modules that transfer the message to the destination. In other words, three main modules (queue-in, scanning, queue-out) are connected by two queues (incoming, outgoing). On each end are more modules to receive or send mail---one for each protocol. The \name{pool} is the place where the bodies of the queued messages are stored. Figure \ref{fig:masqmail-arch-new} depicts the newly designed architecture.
     2.9  
    2.10  \begin{figure}
    2.11  	\begin{center}
    2.12 @@ -338,7 +338,7 @@
    2.13  	\label{fig:masqmail-arch-new}
    2.14  \end{figure}
    2.15  
    2.16 -This architecture is heavily influenced by the ones of \qmail\ and \postfix. Both have different incoming channels that merge in the module that puts mail into the queue; central is the queue (or more of them); and one module takes mail from the queue and passes it to one of the outgoing channels. Mail processing, in any way, is build in in a more explicit way than done in the other two. It is more similar to the \NAME{AR} module of \name{sendmail X}, which is the central point for spam checking.
    2.17 +This architecture is heavily influenced by the ones of \qmail\ and \postfix. Both have different incoming channels that merge in the module that puts mail into the queue; central is the queue (or more of them); and one module takes mail from the queue and passes it to one of the outgoing channels. Mail processing is built into the architecture in a more explicit way than it was done in \qmail\ and \postfix. It is more similar to the \NAME{AR} module of \name{sendmail X}, which is the central point for spam checking.
    2.18  
    2.19  Special attention was paid to making it easy to add support for further mail transfer protocols. This appears to be most similar to \qmail, which was designed to handle multiple protocols.
    2.20  %fixme: do i need all this ``quesses''??
    2.21 @@ -346,29 +346,29 @@
    2.22  
    2.23  \subsubsection*{Modules and queues}
    2.24  
    2.25 -The new architecture consists of several modules and two queues. They are defined in more detail now, and the jobs, identified above, are assigned to them. First the three main modules, then the queues, and afterwards the modules for incoming and outgoing transfer.
    2.26 +The new architecture consists of several modules and two queues plus a data pool. They are described in more detail now. First the three main modules, then the queues, and afterwards the modules for incoming and outgoing transfer.
    2.27  
    2.28  
    2.29 -The \name{queue-in} module creates new spool files in the \name{incoming} queue for incoming messages. It is a process running in background, waiting for connections from one of the receiver modules. When one of them requests for a new spool file, the \name{queue-in} module opens one and returns a positive result. The receiver module then sends the envelope and message, which is written into the spool file by \name{queue-in}. If all went well, another positive result is returned.
    2.30 +The \name{queue-in} module creates new spool files in the \name{incoming} queue and message body files in the \name{pool} for incoming messages. It is a process running in the background, waiting for connections from one of the receiver modules. When one of them is receiving a new message, it connects to the \name{queue-in} module, which creates a spool file in the \name{incoming} queue and a message body file in the \name{pool} and returns success. The receiver module then sends the envelope, the message header, and the message body. The first two are written into the spool file by \name{queue-in}; the latter is stored in the \name{pool}. If all went well, another positive result is returned.
    2.31  %fixme: should be no daemon
    2.32  
    2.33  
    2.34 -The \name{scanning} module is the central part of the system. It takes spooled messages from the \name{incoming} queue, works on them, and writes them to the \name{outgoing} queue afterwards (the message is then removed from the \name{incoming} queue, of course). The main job is the processing done on the message. Headers are fixed and missing ones are added if necessary, aliasing is done, and external processing of any kind is triggered. The \name{scanning} module can run in background and look for new mail in regular intervals or signals may be sent to it by \name{queue-in}. Alternatively it can be called by \name{cron}, for example, to do single runs.
    2.35 +The \name{scanning} module is the central part of the system. It reads spooled messages from the \name{incoming} queue, works on the data, and writes new spool files to the \name{outgoing} queue. Then the message is removed from the \name{incoming} queue. The main job of this module is the processing of the message. Headers are fixed and missing ones are added if necessary, aliasing is done, and external processing of any kind is triggered. The \name{scanning} module can run in the background and look for new mail at regular intervals, or signals may be sent to it by \name{queue-in}. Alternatively, it can be called by \name{cron} to do single runs. The \name{scanning} module works primarily on the spool files but may read the mail body from the \name{pool} if necessary.
    2.36  
    2.37  
    2.38 -The \name{queue-out} module takes messages from the \name{outgoing} queue, queries information about the online state which specifies the route to use, creates envelopes for each recipient, and passes the messages to the correct transport module. Successfully transferred messages are removed from the \name{outgoing} queue. This module includes some tasks specific to \masqmail.
    2.39 +The \name{queue-out} module takes messages from the \name{outgoing} queue, queries information about the online state, which specifies the route to use, and passes the messages to the correct transport module. Successfully transferred messages are removed from the \name{outgoing} queue. This module handles the \masqmail-specific task of route management.
    2.40  
    2.41  
    2.42 -The \name{incoming} queue stores messages received via one of the incoming channels. The messages are in unprocessed form; only envelope data is prepended.
    2.43 +The \name{incoming} queue stores the envelope and the message header of messages received via one of the incoming channels. The data is in unprocessed form.
    2.44  
    2.45 +The \name{outgoing} queue contains processed data. The header and envelope information is complete and in valid form.
    2.46  
    2.47 -The \name{outgoing} queue contains processed messages. The header and envelope information is complete and in valid form.
    2.48 +The \name{pool} is the storage for the message bodies of queued messages. This data is not changed within the \MTA; it is written on reception and read on dispatch.
    2.49  
    2.50 -\name{Receiver modules} are the communication interface between outside senders and the \name{queue-in} module. Each protocol needs a corresponding \name{receiver module} to be supported. Most popular are the \name{sendmail} module (which is a command to be called from the local host) and the \name{smtpd} module (which listens on port 25). Other modules to support other protocols may be added as needed.
    2.51 -%fixme: get invoked by inetd, or better ucspi-tcp (by bernstein) which can limit max number of concurrent connections. and includes tcp-wrappers functionality.
    2.52  
    2.53 +\name{Receiver modules} are the communication interface between external senders and the \name{queue-in} module. Each protocol needs a corresponding \name{receiver module} to be supported. Most popular are the \name{sendmail} module, which is a command to be called from the local host, and the \name{smtpd} module, which listens on port 25. Other modules to support other protocols may be added as needed. Receiver modules that need to listen on ports should be invoked by \name{inetd} or a more secure replacement like \person{Bernstein}'s \name{ucspi-tcp}. This makes it possible to run them with least privilege.
    2.54  
    2.55 -\name{Transport modules}, on the opposite side of the system, are the modules to send outgoing mail; they are the interface between \name{queue-out} and remote hosts or local commands for further processing. The most popular ones are the \name{smtp} module (which acts as the \SMTP\ client) and the \name{pipe} module (to interface gateways to other systems or networks, like fax or uucp). A module for local delivery is not included, \masqmail\ passes this job to the \NAME{MDA} (see section \ref{sec:functional-requirements} for reasons). Thus a \name{mail delivery agent} (like \name{procmail}) is to be used with the \name{pipe} module.
    2.56 +\name{Transport modules}, on the opposite side of the system, are the modules to send outgoing mail. They are the interface between \name{queue-out} and remote hosts or local commands for further processing. The most popular ones are the \name{smtp} module, which acts as the \SMTP\ client, and the \name{pipe} module, which interfaces gateways to other systems or networks, like fax or \NAME{UUCP}. A module for local delivery is not included; \masqmail\ passes this job to the \NAME{MDA} (see section \ref{sec:functional-requirements} for reasons). Thus a \name{mail delivery agent} (like \name{procmail}) is to be used with the \name{pipe} module.
    2.57  
    2.58  
    2.59  
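
The "move" performed by the \name{scanning} module, as described in this hunk, is a create followed by a remove: a new spool file is written to \name{outgoing}, then the original is removed from \name{incoming}, and the \name{pool} is left untouched. A minimal sketch of one such run follows. It assumes that each queue and the pool is a directory below a common queue path and that both files of a message share one name; `scan_once` and its `process` callback are hypothetical names, and real processing (header fixing, aliasing, external filters) is reduced to a single function call.

```python
import os


def scan_once(queue_path, process):
    """Run one scanning pass over the incoming queue (illustrative only).

    `process` takes the raw bytes of a spool file and returns the bytes
    of the new spool file. Returns the names of the messages handled.
    """
    incoming = os.path.join(queue_path, "incoming")
    outgoing = os.path.join(queue_path, "outgoing")
    moved = []
    for name in sorted(os.listdir(incoming)):
        src = os.path.join(incoming, name)
        with open(src, "rb") as f:
            raw = f.read()
        # Create the processed spool file first ...
        with open(os.path.join(outgoing, name), "wb") as f:
            f.write(process(raw))
        # ... and only then remove the unprocessed one.
        os.remove(src)
        moved.append(name)
    return moved
```

Writing before removing means a crash between the two steps leaves the message in both queues rather than in neither, so the failure mode is a repeated run instead of lost mail.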
    2.60 @@ -397,55 +397,36 @@
    2.61  The client indicates the end of each data transfer with a special terminator sequence. The appearance of this terminator sequence tells the server process that the data transfer is complete and makes the server send a reply. The server process takes responsibility for the data by sending a success reply. A failure reply immediately stops the dialog and resets both client and server to the state before the connection attempt.
    2.62  
    2.63  \paragraph{Syntax}
    2.64 -Data transfer is done by sending plain text data. \name{Line Feed}---the native line separator on \unix---is used as line separator. The terminator sequence used to indicate the end of the data transfer is the \NAME{ASCII} \name{null} character (``\texttt{\textbackslash0}''). Replies are one-digit numbers with \texttt{0} meaning success and any other number (\texttt{1}--\texttt{9}) indicate failure.
    2.65 +Data transfer is done by sending plain text data. \name{Line Feed}---the native line separator on \unix---is used as line separator. The terminator sequence used to indicate the end of the data transfer is the \NAME{ASCII} \name{null} character (`\texttt{\textbackslash0}'). Replies are one-digit numbers, with `\texttt{0}' meaning success and any other number (`\texttt{1}'--`\texttt{9}') indicating failure.
    2.66  
    2.67  
    2.68  
    2.69 -\subsubsection*{Spool file format}
    2.70 +\subsubsection*{The queue}
    2.71  
    2.72 -The spool file format is basically the same as the one in current \masqmail: one file for the message body, the other for envelope and header information. The data file is stored in a separate data pool. It is written by \name{queue-in}, \name{scanning} can read it if necessary, \name{queue-out} reads it to generate the outgoing message, and deletes it after successful transfer. The header file (including the envelope) is written into the \name{incoming} queue. The \name{scanning} modules reads it, processes it, and writes a modified copy into the \name{outgoing} queue; the file in \name{incoming} is deleted then. \name{queue-out} finally takes the header file from \name{outgoing} to generate the resulting message. This data flow is shown in figure \ref{fig:queue-data-flow}.
    2.73 +The queue consists of three directories within the queue path: two, named \name{incoming} and \name{outgoing}, for storing the spool files, and one, called \name{pool}, for storing the data files. The files that belong to one message share the same unique name. The spool file's internal structure can remain the same as the one of current \masqmail. A queued message is represented by a spool file in \name{incoming} or \name{outgoing} and a data file in the \name{pool}.
    2.74  
    2.75 -\begin{figure}
    2.76 -	\begin{center}
    2.77 -		%\input{img/queue-data-flow.eps}
    2.78 -	\end{center}
    2.79 -	\caption{Data flow of messages in the queue}
    2.80 -	\label{fig:queue-data-flow}
    2.81 -\end{figure}
    2.82 +The spool file owner's executable bit shows whether the file is ready for further processing: the module that writes the file into the queue sets the bit as its last action. Modules that read from the queue can process messages that have the bit set. This approach is derived from \postfix.
    2.83  
    2.84 -The queue consists of three directories within the queue path. Two, named \name{incoming} and \name{outgoing}, for storing the header files; one, called \name{pool}, to store the message bodies. The files being part of one message share the same unique name. The header files internal structure can be the same as the one of current \masqmail.
    2.85  
    2.86 -Messages in queues are a header file in \name{incoming} or \name{outgoing} and a data file in \name{pool}. The header file owner's executable bit indicates if the file is ready for further processing: the module that writes the file into the queue sets the bit as last action. Modules that read from the queue can process messages with the bit set.
    2.87 +The spool file format is basically the same as the one in current \masqmail: one file for the envelope and message header information (it is called ``spool file'' here), a second file for the message body (called ``data file'').
    2.88  
    2.89 -No spool files are modified after they are written to disk. Modifications to header files can be made by the \name{scanning} module in the ``move'' from \name{incoming} to \name{outgoing}---it is a create and remove, actually. Further rewriting can happen in \name{queue-out}, as well without altering the file.
    2.90 +The data file is stored in a separate data pool. It is written by \name{queue-in}; \name{scanning} can read it if necessary; \name{queue-out} reads it to generate the outgoing message and deletes it after successful transfer. Data files do not change at all within the system. They are written in default local plain text format. Required translation is done in the receiver and transport modules.
    2.91  
    2.92 -Data files do not change at all within the system. They are written in default local plain text format. Required translation is done in the receiver and transport modules.
    2.93 -%fixme: why plain text and not db? -> simplicity
    2.94 +The spool file is written into the \name{incoming} queue. The \name{scanning} module reads it, processes it, and writes a new one into the \name{outgoing} queue; the file in \name{incoming} is then deleted. \name{queue-out} finally takes the spool file from \name{outgoing} and the data file from the \name{pool} to generate the resulting message.
    2.95  
    2.96 -Mark spooled mail messages when processing of the writing module is finished: Either by setting the executable bit (like \postfix\ does), or by changing the owner (an approach for multiple masqmail users).
    2.97 +%This data flow is shown in figure \ref{fig:queue-data-flow}.
    2.98 +%
    2.99 +%\begin{figure}
   2.100 +%	\begin{center}
   2.101 +%		%\input{img/queue-data-flow.eps}
   2.102 +%	\end{center}
   2.103 +%	\caption{Data flow of messages in the queue}
   2.104 +%	\label{fig:queue-data-flow}
   2.105 +%\end{figure}
   2.106  
   2.107  
   2.108 -A sample header file. With comments in parenthesis.
   2.109 -
   2.110 -\begin{quote}\footnotesize
   2.111 -\begin{verbatim}
   2.112 -1LGtYh-0ut-00                (backup copy of the file name)
   2.113 -MF:<meillo@dream>            (envelope: sender)
   2.114 -RT: <user@example.org>       (envelope: recipient)
   2.115 -PR:local                     (meta info: protocol)
   2.116 -ID:meillo                    (meta info: id/user/ip)
   2.117 -DS: 18                       (meta info: size)
   2.118 -TR: 1230462707               (meta info: timestamp)
   2.119 -                             (following: headers)
   2.120 -HD:Received: from meillo by dream with local (masqmail 0.2.21) id
   2.121 - 1LGtYh-0ut-00 for <user@example.org>; Sun, 28 Dec 2008 12:11:47 +0100
   2.122 -HD:To: user@example.org
   2.123 -HD:Subject: test mail
   2.124 -HD:From: <meillo@dream>
   2.125 -HD:Date: Sun, 28 Dec 2008 12:11:47 +0100
   2.126 -HD:Message-ID: <1LGtYh-0ut-00@dream>
   2.127 -\end{verbatim}
   2.128 -\end{quote}
   2.129 +A sample spool file, with comments in parentheses:
   2.130 +\codeinput{input/sample-spool-file.txt}
   2.131  
   2.132  
   2.133
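
The readiness marking introduced in this changeset (the writer sets the owner's executable bit as its last action, and readers only consider files with the bit set) can be sketched as follows. This is an illustration on a POSIX file system, not code from masqmail or Postfix. The function names are hypothetical, and the initial file mode `0o600` and the `fsync` call are assumptions that the thesis text does not specify.

```python
import os
import stat


def write_spool_file(path, data):
    """Write a spool file and mark it ready as the very last step."""
    # Create the file without any executable bit, so that a reader
    # which sees it half-written will ignore it.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
    with os.fdopen(fd, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
    # Last action: set the owner's executable bit.
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)


def is_ready(path):
    """Return True if the owner's executable bit is set."""
    return bool(os.stat(path).st_mode & stat.S_IXUSR)


def ready_messages(queue_dir):
    """List the names of all spool files that are ready for processing."""
    return sorted(
        name for name in os.listdir(queue_dir)
        if is_ready(os.path.join(queue_dir, name))
    )
```

A reading module would iterate over `ready_messages(...)` and skip everything else, so no locking between writer and reader is needed for this hand-over.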