Mercurial > docs > diploma

--- a/thesis/tex/4-MasqmailsFuture.tex	Fri Dec 19 23:35:38 2008 +0100
+++ b/thesis/tex/4-MasqmailsFuture.tex	Sat Dec 20 20:37:09 2008 +0100
@@ -37,8 +37,9 @@


+\section{\masqmail\ next generation}

-\section{Requirements}
+\subsection{Requirements}

 Following is a list of current and future requirements to make \masqmail\ ready for the future.

@@ -72,7 +73,7 @@


-\section{Discussion on architecture}
+\subsection{Discussion on architecture}

 A program's architecture is probably the most influencing design decision, and has the greatest impact on the program's future capabilities. %fixme: search quote ... check if good

@@ -117,86 +118,127 @@


-\subsection{Modules needed}
+\subsection{Jobs of an MTA}

 This section tries to identify the needed modules for a modern \MTA. They are later the pieces of which the new architecture is built of.

-
-\subsubsection*{The simple view}
+The basic job of a \mta\ is to tranport mail from a sender to a recipient. This is the definition of such a program and this is how \person{Dent}\cite[page 19]{dent04} and \person{Hafiz} \cite[pages 3-5]{hafiz05} generally see its design.

-The basic job of a \mta\ is to tranport mail from a sender to a recipient. This is the definition of such a program and this is how \person{Dent}\cite[page 19]{dent04} and \person{Hafiz} \cite[pages 3-5]{hafiz05} start on the design.
-
-An \MTA\ therefor needs at least a mail receiving facility and a mail sending facility. But both, and probably all \MTA\ developers (excluded the only forwarders), see the need for a mail queue. A mail queue removed the need to deliver at once. They also provide fail-safe storage of mails until they are delivered.
-
+An \MTA\ therefor needs at least a mail receiving facility and a mail sending facility. Additionally probably all \MTA\ developers (excluded the only forwarders), see the need for a mail queue. A mail queue removes the need to deliver at once a message is received. They also provide fail-safe storage of mails until they are delivered.


 \subsubsection*{Incoming channels}

-The second addition \person{Hafiz} made is the split of incoming and outgoing channels into local and remote. The question is, if this is nessesary. It is the way, it was done for a long time, but is this extra complexity needed?
-
-The common situation is incoming mail on port 25 using \SMTP\ and via the \texttt{sendmail} command. Outgoing mail is either sent using \SMTP, piped into local commands (for example \texttt{uucp}), or delivered locally by appending to a mailbox.
+\sendmail-compatible \mta{}s must support at least two incoming channels: mail submitted using the \sendmail\ command, and mail received via the \SMTP\ daemon. It is therefor common to split the incoming channel into local and remote. This is done by \qmail\ and \postfix. The same way is \person{Hafiz}'s view.

-The \MTA's architecture would be simpler if some of these channels could be merged. The reason is, if various modules do similar jobs, common things might need to be duplicated. On the other side is it better to have more independent modules if each one is simpler then.
+In contrast is \name{sendmail X}: Its locally submitted messages go to the \SMTP\ daemon, which is the only connection towards the mail queue. %fixme: is it a smtp dialog? or a second door?
+\person{fanf} proposes a similar approach. He wants the \texttt{sendmail} command to be a simple \SMTP\ client that contacts the \SMTP\ daemon of the \MTA\ like it is done by connections from remote. The advantage here is one single module where all \SMTP\ dialog with submitters is done. Hence one single point to accept or refuse incoming mail. Additionally does the module to put mail into the queue not need to be \name{setuid} or \name{setgid} because it is only invoked from the \SMTP\ daemon. The \MTA's architecture would become simpler and common tasks are not duplicated in modules that do similar jobs.

-\qmail\ uses \name{qmail-inject} (local message in) and \name{qmail-smtpd} (remote message in), which both handle messages over to \name{qmail-queue} that puts it into the mail queue. \postfix's approach is similar. \name{sendmail X} %fixme: what about meta1 here?
-used only \NAME{SMTPS}, which is for receiving mail from remote, to communicate with the queue manager \NAME{QMGR}. Mail from local goes over \NAME{SMTPS}.
+But merging the input channels in the \SMTP\ daemon makes the \MTA\ heavily dependent on \SMTP\ being the main mail transfer protocol. To \qmail\ and \postfix\ new modules to support other ways of message receival may be added without change of other parts of the system. Also is it better to have more independent modules if each one is simpler then.

-The \name{sendmail X} approach seems to be the simpler one, but does heavily rely on \SMTP\ being the main mail transfer protocol. To \qmail\ and \postfix\ new modules may be added to support other ways of message receival, without any change of other parts of the system.
+With the increasing need for new protocols in mind, it seems better to have single modules for each incoming channel, although this leads to duplicated acceptance checks.


 \subsubsection*{Outgoing channels}

+Outgoing mail is commonly either sent using \SMTP, piped into local commands (for example \texttt{uucp}), or delivered locally by appending to a mailbox.
+
 Outgoing channels are similar for \qmail, \postfix, and \name{sendmail X}: All of them have a module to send mail using \SMTP, and one for writing into a local mailbox. Local mail delivery is a job that requires root priveledge to be able to switch to any user in order to write to his mailbox. Modular \MTA{}s do not need \name{setuid root}, but the local delivery process (or its parent) needs to run as root.

-As mail delivery to local users, is \emph{not} included in the basic job of \MTA{}s, why should they care about it? In order to keep the system simple and to have programs do one job well, the local delivery job should be handed over to \NAME{MDA}s. \name{Mail delivery agents} are the tools that are specialized for local delivery. They know about the various mailbox formats and are aware of the problems of concurrent write access and thelike. Hence handling the message and the responsiblity for it over to a mail delivery agent, like \name{procmail} or \name{maildrop}, seems to be the right way to go.
+As mail delivery to local users, is \emph{not} included in the basic job of an \MTA{}, why should it care about it? In order to keep the system simple and to have programs that do one job well, the local delivery job should be handed over to a specialist: the \name{mail delivery agent}. \NAME{MDA}s know about the various mailbox formats and are aware of the problems of concurrent write access and thelike. Hence handling the message and the responsiblity over to a \NAME{MDA}, like \name{procmail} or \name{maildrop}, seems to be the right way to go.
+
+This means an outgoing connection that pipes mail into local commands is required. Other outgoing channels, one for each supportet protocol, may be designed like it was done in other \MTA{}s.
+
+
+
+\subsubsection*{Mail queue}

-This means outgoing connections, piping mails into local commands needs to be implemented.
+Mail queues are probably used in all \mta{}s, excluding the simple forwarders. A mail queue is a essential requirement for \masqmail, as it is to be used for non-permanent online connections. This means, mail must be queued until a online connection is available to send the message.
+
+The mail queue and the module to manage it are the central part of the whole system. This demands especially for robustness and reliability, as a failure here can lead to loosing mail. An \MTA\ takes over responsibility for mail in accepting it, hence loosing mail messages is absolutely to avoid. This covers any kind of crash situation too. The worst thing acceptable to happen is a mail to be sent twice.
+
+\sendmail, \exim, \qmail, \name{sendmail X}, and \masqmail\ feature one single mail queue. \postfix\ has three of them: \name{incoming}, \name{active}, and \name{deferred}. (The \name{maildrop} queue is excluded, as it is only used for the \texttt{sendmail} command.)
+
+\MTA\ setups that include content scanning tend to require two separate queues. To use \sendmail\ in such setups requires two independent instances, with two separate queues, running. \exim\ can handle it with special \name{router} and \name{transport} rules, but the data flow gets complicated. Hence an idea is to use two queues, \name{incoming} and \name{active} in \postfix's terminology, with the content scanning within the move from \name{incoming} to \name{active}.


 \subsubsection*{Sanitize mail}
-generate valid headers: add, rewrite
-... better before inserting into the queue
+
+Mail coming into the system often lacks important header lines. At least the required ones must be added from the \MTA. A good example is the \texttt{Message-Id:} header.
+
+In \postfix, this is done by the \name{cleanup} module, which invokes \name{rewrite}. The position in the message flow is after coming from one of the several incoming channels and before the message is stored into the \name{incoming} queue. Modules that handle incoming channels may also add headers, for example the \texttt{From:} and \texttt{Date:} headers. \name{cleanup}, however, does a complete check to make the mail header complete and valid.
+
+Apart from deciding where to sanitize the mail header, is the question where to generate the envelope. The envelope specifies the actual recipient of the mail, no matter what the \texttt{To:}, \texttt{Cc:}, and \texttt{Bcc:} headers tell. Multiple reciptients lead to multiple different envelopes, containing all the same mail message.
+

-(determine the method to send at that position?)
+
+\subsubsection*{Choose route to use}

+One key feature of \masqmail\ is its ability to send mail out in different ways. The decision is based on the current online state and whether a route may be used for a message or not. The online state can be retrieved in tree ways, explained in \ref{sec:fixme}. A route to send is found by checking every available route for being able to transfer the current message, until one matches.
+
+This functionality should be implemented in the module that is responsible to invoke one of the outgoing channel modules (for example the one for \SMTP\ or the pipe module).
+
+\masqmail\ can rewrite the envelope's from address and the \texttt{From:} header, dependent on the outgoing route to use. This rewrite must be done \emph{after} it is clear which route a mail will take, of course, so this may be not the module where other header editing is done.


 \subsubsection*{Aliasing}

-where to expand aliases?
-
-
-
-\subsubsection*{Mail queue}
+Where should aliases get expanded? They appear in different kind. Important are the ones available in the \path{aliases} file. Aliases can be:
+\begin{itemize}
+	\item a different local user (e.g.\ ``\texttt{bob: alice}'')
+	\item a remote user (e.g.\ ``\texttt{bob: john@example.com}'')
+	\item a list of users (e.g.\ ``\texttt{bob: alice, john@example.com}'')
+	\item a command (e.g.\ ``\texttt{bob: |foo}'')
+\end{itemize}
+Aliases can be cascaded like in the following example:
+\begin{verbatim}
+team:   alice, bob
+bob:    bob@example.com
+\end{verbatim}

-Mail queues are probably used in all \mta{}s, excluding the simple forwarders. A mail queue is a essential requirement for \masqmail, as it is to be used for non-permanent online connections.
-
+Addresses expanding to lists of users lead to more envelopes. Aliases changing the reciptients domain part may require a different route to use.

-
-
+Aliasing is often handled in expanding the alias and reinjecting the mail into the system. Unfortunately, the mail is processed twice using this approach; also does the system need to handle more mail this way. Is it needed to check the new recipient address for acceptance, or should it be accepted generally? (The aliase actually came from inside the system.) A second point for access control seems to be no choice.


 \subsubsection*{Authentication}

-either by
-- network/ip address
-	easiest: restricting by static IP addresses (Access control via hosts.allow/hosts.deny)
-or
-- some kind of auth (for dynamic remote hosts)
-	adds complexity
-	- SASL
-	- POP/IMAP: pop-before-smtp, DRAC, WHOSON
-	- TLS (certificates)
+One thing to avoid is being an \name{open relay}. Open relays allow to relay mail from everywhere to everywhere. This is a major source of spam. The solution is restricting relay\footnote{Relaying is passing mail, that is not from and not for the own system, through it.} access.
+
+Several ways to restrict access are available. The most simple one is restrictiction by the \NAME{IP} address. No extra complexity is added this way, but static \NAME{IP} addresses are mandatory. This kind of restriction may be enabled using the operating system's \path{hosts.allow} and \path{hosts.deny} files. To allow only connections to port 25 from localhost or the local network \texttt{192.168.100.0/24} insert the line ``\texttt{25: ALL}'' into \path{hosts.deny} and ``\texttt{25: 127.0.0.1, 192.168.100.}'' into \path{hosts.allow}.

+If static access restriction is not possible, for example if mail from locations with changing \NAME{IP} addresses wants to be accepted, some kind of authentication mechanism is required. Three common kinds exist:
+\begin{enumerate}
+\item \SMTP-after-\NAME{POP}: uses authenication on the \NAME{POP} protocol to permit incoming \SMTP\ connections for a limited time afterwards.
+\item \SMTP authentication: is an extension to \SMTP. Authentication can be requested before mail is accepted.
+\item Certificates: confirm the identity of someone.
+\end{enumerate}
+The first mechanism requires a \NAME{POP} (or \NAME{IMAP}) server running on the same host (or a trusted one), to enable the \SMTP\ server to use the login dates on the \NAME{POP} server. This is a common practice used by mail service providers, but is not adequate for the environments \masqmail\ is designed for.
+
+Certificate based authentication, like provided by \NAME{TLS}, suffers from the overhead of certificate management. But \NAME{TLS} provides encryption too, so is useful anyway.
+
+\SMTP\ authentication (also refered to as \NAME{SMTP-AUTH}) suppoert is easiest received by using a \name{Simple Authentication and Security Layer} implementation. \person{Dent} sees in \NAME{SASL} the best solution for authenticating dynamic users:
 \begin{quote}
-None of these add-ons is an ideal solution. They require additional code compiled into your existing daemons that may then require special write accesss to system files. They also require additional work for busy system administrators. If you cannot use any of the nonauthenticating alternatives mentioned earlier, or your business requirements demand that all of thyour users' mail pass through your system no matter where they are on the Internet, SASL is probably the solution that offers the most reliable and scalable method to authenticate users.
+%None of these add-ons is an ideal solution. They require additional code compiled into your existing daemons that may then require special write accesss to system files. They also require additional work for busy system administrators. If you cannot use any of the nonauthenticating alternatives mentioned earlier, or your business requirements demand that all of your users' mail pass through your system no matter where they are on the Internet, SASL is probably the solution that offers the most reliable and scalable method to authenticate users.
+None of these [authentication methods] is an ideal solution. They require additional code compiled into your existing daemons that may then require special write accesss to system files. They also require additional work for busy system administrators. If you cannot use any of the nonauthenticating alternatives mentioned earlier, or your business requirements demand that all of your users' mail pass through your system no matter where they are on the Internet, \NAME{SASL} is probably the solution that offers the most reliable and scalable method to authenticate users.
 \cite[page 44]{dent04}
 \end{quote}

+%either by
+%- network/ip address
+%	easiest: restricting by static IP addresses (Access control via hosts.allow/hosts.deny)
+%or
+%- some kind of auth (for dynamic remote hosts)
+%	adds complexity
+%	- SASL
+%	- POP/IMAP: pop-before-smtp, DRAC, WHOSON
+%	- TLS (certificates)
+
+

 \subsubsection*{Encryption}