docs/diploma

diff thesis/tex/4-MasqmailsFuture.tex @ 177:7781ad0811f7

new sections, moved content, and more
author meillo@marmaro.de
date Fri, 26 Dec 2008 22:38:07 +0100
parents c51f1be54224
children b426a663d5f0
line diff
     1.1 --- a/thesis/tex/4-MasqmailsFuture.tex	Fri Dec 26 16:37:44 2008 +0100
     1.2 +++ b/thesis/tex/4-MasqmailsFuture.tex	Fri Dec 26 22:38:07 2008 +0100
     1.3 @@ -20,7 +20,6 @@
     1.4  In addition to the \mta\ job, \masqmail\ also offers mail retrieval services by acting as a \NAME{POP3} client. It can fetch mail from different remote locations, depending on the active online route.
     1.5  
     1.6  
     1.7 -
     1.8  \subsubsection*{The code}
     1.9  
    1.10  \masqmail\ is written in the C programming language. The program, as of version 0.2.21, consists of 34 source code and eight header files, containing about 9,000 lines of code\footnote{Measured with \name{sloccount} by David A.\ Wheeler.}. Additionally, it includes a \name{base64} implementation (about 300 lines) and \name{md5} code (about 150 lines). For systems that do not provide \name{libident}, this library is distributed as well (circa 600 lines); an available shared library however has higher precedence in linking.
    1.11 @@ -37,34 +36,34 @@
    1.12  
    1.13  
    1.14  
    1.15 -\section{\masqmail\ next generation}
    1.16 +\section{Requirements}
    1.17  
    1.18 -\subsection{Requirements}
    1.19 +This section identifies the requirements for future versions of \masqmail. Most of them will apply to modern \MTA{}s in general.
    1.20 +
    1.21 +\subsection{General requirements}
    1.22  
    1.23  Following is a list of current and future requirements to make \masqmail\ ready for the future.
    1.24  
    1.25  
    1.26 -\subsubsection*{Large message handling}
    1.27 -Trends in the market for electronic communication go towards consolidated communication, hence email will be used more to transfer voice and video messages. This leads to larger messages. The store-and-forward transport of email is not well suited for large data. Thus new protocols, like \NAME{QMTP} (described in section %\ref{FIXME}
    1.28 -), may become popular.
    1.29 +\subsubsection*{Security}
    1.30 +\MTA{}s are critical points for computer security, as they are accessible from external networks. They must be secured with high effort. Properties like a high privilege level, workload influenced from outside, work on unsafe data, and the demand for reliability increase the security needed. Insecure and unreliable \mta{}s are of no value. \masqmail\ needs to be secure enough for its target field of operation.
    1.31 +
    1.32 +
    1.33 +\subsubsection*{Reliability}
    1.34 +<< crash only software >>
    1.35 +
    1.36 +<< dont lose mail >>
    1.37 +
    1.38 +
    1.39 +\subsubsection*{Extendability}
    1.40 +Modern needs, like large messages, demand more efficient mail transport through the net. A final solution to defeat the spam problem is needed as well. New mail transport protocols seem to be the only good solution for both problems. They can also improve reliability, authentication, and verification issues. \masqmail\ should be able to support new mail transfer protocols as they appear and are used.
    1.41 +%fixme: like old sendmail, but not too much like it
    1.42  
    1.43  
    1.44  \subsubsection*{Resource-friendly software}
    1.45  The merge of communication hardware and the move of email services from providers to homes demand smaller and more resource-friendly software. The amount of mail per system will be lower, even if much more mail is sent overall. More important will be the energy consumption and heat emission. These topics increased in relevance during the past years and they are expected to become more central. \masqmail\ is not a program to be used on large servers, but to be used on small devices. Thus focusing on energy and heat, not on performance, is the direction to go.
    1.46  
    1.47  
    1.48 -\subsubsection*{New mail transfer protocols}
    1.49 -Large messages demand more efficient transport through the net. A final solution to defeat the spam problem is needed as well. New mail transport protocols may be the only good solution for both problems. They can also improve reliability, authentication, and verification issues. \masqmail\ should be able to support new protocols as they appear and are used.
    1.50 -
    1.51 -
    1.52 -\subsubsection*{Spam handling}
    1.53 -Spam is a major threat. According to the \NAME{SWOT} analysis, the goal is to reduce it to a bearable level. Spam fighting is a war where the good guys tend to lose. Putting too much effort there will result in little gain. Real success will only be possible with new, better protocols and by abandoning the weak legacy technologies. Hence \masqmail\ should be able to provide state-of-the-art spam protection, but not more.
    1.54 -
    1.55 -
    1.56 -\subsubsection*{Security}
    1.57 -\MTA{}s are critical points for computer security, as they are accessible from external networks. They must be secured with high effort. Properties like a high privilege level, workload influenced from outside, work on unsafe data, and the demand for reliability increase the security needed. Insecure and unreliable \mta{}s are of no value. \masqmail\ needs to be secure enough for its target field of operation.
    1.58 -
    1.59 -
    1.60  \subsubsection*{Easy configuration}
    1.61  Having \mta{}s on many home servers and clients requires easy and standardized configuration. The common setups should be configurable with single actions by the user. Complex configuration should be possible, but the focus must be on the most common form of configuration: choosing one of several standard setups.
    1.62  
    1.63 @@ -118,6 +117,24 @@
    1.64  
    1.65  
    1.66  
    1.67 +
    1.68 +
    1.69 +http://fanf.livejournal.com/50917.html %how not to design an mta - the sendmail command
    1.70 +http://fanf.livejournal.com/51349.html %how not to design an mta - partitioning for security
    1.71 +http://fanf.livejournal.com/61132.html %how not to design an mta - local delivery
    1.72 +http://fanf.livejournal.com/64941.html %how not to design an mta - spool file format
    1.73 +http://fanf.livejournal.com/65203.html %how not to design an mta - spool file logistics
    1.74 +http://fanf.livejournal.com/65911.html %how not to design an mta -   more about log-structured MTA queues
    1.75 +http://fanf.livejournal.com/67297.html %how not to design an mta -   more log-structured MTA queues
    1.76 +http://fanf.livejournal.com/70432.html %how not to design an mta - address verification
    1.77 +http://fanf.livejournal.com/72258.html %how not to design an mta - content scanning
    1.78 +
    1.79 +
    1.80 +
    1.81 +
    1.82 +
    1.83 +
    1.84 +
    1.85  \subsection{Jobs of an MTA}
    1.86  
    1.87  This section tries to identify the modules needed for a modern \MTA. They are later the pieces of which the new architecture is built.
    1.88 @@ -286,11 +303,15 @@
    1.89  
    1.90  \subsubsection*{Spam prevention}
    1.91  
    1.92 +---
    1.93 +Spam is a major threat nowadays and the goal is to reduce it to a bearable level (see section \ref{sec:swot-analysis}). Spam fighting is a war where the good guys tend to lose. Putting too much effort there will result in little gain. Real success will only be possible with new, better protocols and by abandoning the weak legacy technologies. Hence \masqmail\ should be able to provide state-of-the-art spam protection, but not more.
    1.94 +---
    1.95 +
    1.96  Spam is a major threat to email, as described in section \ref{sec:swot-analysis}. The two main problems are forgeable sender addresses and the low cost of sending hundreds of thousands of messages. Hence, spam senders can operate in disguise and have minimal cost.
    1.97  
    1.98  As spam is not just a nuisance for end users, but also for the infrastructure (the \mta{}s), because it increases the amount of mail messages, \MTA{}s need to protect themselves. Two approaches are used.
    1.99  
   1.100 -The first is refusing spam during the \SMTP\ dialog. This is the way it was intended by the designers of the \SMTP\ protocol. They thought checking the sender and recipient mail addresses would be enough, but as they are forgeable it is not. More and more complex checks need to be done. Checking needs time, but \SMTP\ dialogs time out if it takes too long. Thus only limited time can be used, during the \SMTP\ dialog, for checking if a message seems to be spam. The advantage is that acceptance of bad messages can be simply refused: no responsibility for the message is taken and no further system load is added.
   1.101 +The first is refusing spam during the \SMTP\ dialog. This is the way it was intended by the designers of the \SMTP\ protocol. They thought checking the sender and recipient mail addresses would be enough, but as they are forgeable it is not. More and more complex checks need to be done. Checking needs time, but \SMTP\ dialogs time out if it takes too long. Thus only limited time can be used, during the \SMTP\ dialog, for checking if a message seems to be spam. The advantage is that acceptance of bad messages can be simply refused: no responsibility for the message is taken and no further system load is added. See \RFC2505 (especially section 1.5) for details.
   1.102  
   1.103  The second is checking for spam after the mail was accepted and queued. Here more processing time can be invested, so more detailed checks can be done. But, as responsibility for messages was taken by accepting them, simply deleting spam mail is not an option. Checks for spam do not lead to sure results; they just indicate the possibility that the message is unwanted mail. \person{Eisentraut} indicates actions to take after a message is recognized as probably spam \cite[pages 18--20]{eisentraut05}. The only acceptable one, for mail the \MTA\ is responsible for, is adding further or rewriting existing header lines. Thus all further work on the message is the same as for non-spam messages.
   1.104  
   1.105 @@ -305,23 +326,26 @@
   1.106  
   1.107  Related to spam is malicious content (short: \name{malware}) like viruses, worms, and trojan horses. They, in contrast to spam, do not affect the \MTA\ itself, as they are in the mail body. The same situation in the real world is post offices opening letters to check if they contain something that could harm the recipient. This is not a mail transport concern. Although it is not the right program to do the job, the \MTA\ (the one which is responsible for the recipient) is in a good position to do this work.
   1.108  
   1.109 -In any case, malware checking should be done by external programs that may be invoked by the \mta. But mail delivery and processing agents, like \name{procmail}, appear to be better suited locations to invoke content scanners.
   1.110 +In any case, malware checking should be done by external programs that may be invoked by the \mta. But mail delivery and processing agents, like \name{procmail}, seem to be better suited locations to invoke content scanners.
   1.111  
   1.112 +A popular email filter framework is \name{amavis}, which integrates various spam and virus scanners. The common setup includes a receiving \MTA\ which sends the mail to \name{amavis} using \SMTP; \name{amavis} processes the mail and then sends it to a second \MTA\ that does the outgoing transfer. \postfix\ and \exim\ can be configured so that one instance can work as both the \MTA\ for incoming and the \MTA\ for outgoing transfer. A setup with \sendmail\ needs two separate instances running. It must be guaranteed that all mail flows through the scanner.
   1.113  
   1.114 +A future \masqmail\ would do well to have a single point, through which all traffic flows, that is able to invoke external programs to do mail processing of any kind.
   1.115  
   1.116 -AMaViS (amavisd-new): email filter framework to integrate spam and virus scanner
   1.117 -\begin{verbatim}
   1.118 -internet -->25 MTA -->10024 amavis -->10025 MTA --> recipient
   1.119 -                |                            |
   1.120 -                +----------------------------+
   1.121 -\end{verbatim}
   1.122  
   1.123 -postfix and exim can have both mta services in the same instance, sendmail needs two instances running.
   1.124 -
   1.125 -MailScanner:
   1.126 -incoming queue --> MailScanner --> outgoing queue
   1.127 -
   1.128 -postfix: with one instance possible, exim and sendmail need two instances running
   1.129 +%AMaViS (amavisd-new): email filter framework to integrate spam and virus scanner
   1.130 +%\begin{verbatim}
   1.131 +%internet -->25 MTA -->10024 amavis -->10025 MTA --> recipient
   1.132 +                %|                            |
   1.133 +                %+----------------------------+
   1.134 +%\end{verbatim}
   1.135 +%
   1.136 +%postfix and exim can have both mta services in the same instance, sendmail needs two instances running.
   1.137 +%
   1.138 +%MailScanner:
   1.139 +%incoming queue --> MailScanner --> outgoing queue
   1.140 +%
   1.141 +%postfix: with one instance possible, exim and sendmail need two instances running
   1.142  
   1.143  
   1.144  %message body <-> envelope, header
   1.145 @@ -348,43 +372,59 @@
   1.146  
   1.147  \subsubsection*{Archiving}
   1.148  
   1.149 +Mail archiving and auditability become more important as electronic mail becomes more important. The ability to archive verbatim copies of every mail coming into and going out of the system, with the relation between them, appears to be a goal worth achieving.
   1.150  
   1.151 -\texttt{always\_bcc} feature of postfix
   1.152 +\postfix, for example, has an \texttt{always\_bcc} feature to send a copy of every mail to a definable recipient. At least this functionality should be provided, although a more complete approach is preferable.
   1.153  
   1.154  
   1.155  
   1.156 -\section{A new architecture}
   1.157 +\section{Merging the parts}
   1.158  
   1.159 +The last sections identified the jobs that need to be done by a modern \MTA; problems and preferred choices were mentioned too. Now the various jobs are assigned to modules, from which an architecture is created. It is inspired by existing ones and driven by the identified jobs and requirements.
   1.160  
   1.161 -(ssl)
   1.162 --> msg-in (local or remote protocol handlers)
   1.163 --> spam-filter (and more)
   1.164 --> queue
   1.165 --> msg-out (local-delivery by MDA, or remote-protocol-handlers)
   1.166 -(ssl)
   1.167 +The major design ideas were:
   1.168 +\begin{itemize}
   1.169 +	\item free the internal system from in and out channels
   1.170 +	\item arbitrary protocol handlers have to be addable afterwards
   1.171 +	\item a single facility for scanning (all mail goes through it)
   1.172 +	\item concentrate on mail transfer
   1.173 +\end{itemize}
   1.174  
   1.175 +The result is a symmetric design, featuring any number of handlers for incoming connections to receive mail and pass it to the module that stores it into the incoming queue. A central scanning module takes mail from the incoming queue, processes it in various ways, and afterwards puts it into the outgoing queue. Another module takes it out of there and passes it to a matching transport module that transfers it to the destination. In other words, three main modules (queue-in, scanning, queue-out) are connected by the two queues (incoming, outgoing); on each end are more modules to receive and send mail, one for each protocol. Figure \ref{fig:masqmail-arch-new} depicts the newly designed architecture.
   1.176  
   1.177 +\begin{figure}
   1.178 +	\begin{center}
   1.179 +		\input{input/masqmail-arch-new.tex}
   1.180 +	\end{center}
   1.181 +	\caption{A newly designed architecture for \masqmail}
   1.182 +	\label{fig:masqmail-arch-new}
   1.183 +\end{figure}
   1.184  
   1.185  
   1.186  
   1.187 -http://fanf.livejournal.com/50917.html %how not to design an mta - the sendmail command
   1.188 -http://fanf.livejournal.com/51349.html %how not to design an mta - partitioning for security
   1.189 -http://fanf.livejournal.com/61132.html %how not to design an mta - local delivery
   1.190 -http://fanf.livejournal.com/64941.html %how not to design an mta - spool file format
   1.191 -http://fanf.livejournal.com/65203.html %how not to design an mta - spool file logistics
   1.192 -http://fanf.livejournal.com/65911.html %how not to design an mta -   more about log-structured MTA queues
   1.193 -http://fanf.livejournal.com/67297.html %how not to design an mta -   more log-structured MTA queues
   1.194 -http://fanf.livejournal.com/70432.html %how not to design an mta - address verification
   1.195 -http://fanf.livejournal.com/72258.html %how not to design an mta - content scanning
   1.196 +\subsection{Modules and queues}
   1.197  
   1.198 +\subsubsection*{queue-in}
   1.199 +\subsubsection*{scanning}
   1.200 +\subsubsection*{queue-out}
   1.201  
   1.202 +\subsubsection*{incoming queue}
   1.203 +\subsubsection*{outgoing queue}
   1.204  
   1.205 +\subsubsection*{receiver modules}
   1.206 +\subsubsection*{transport modules}
   1.207  
   1.208  
   1.209  
   1.210 +\subsection{Intermodule communication}
   1.211  
   1.212  
   1.213  
   1.214 +\subsection{Spool file format}
   1.215 +
   1.216 +
   1.217 +\subsection{Rights and permission}
   1.218 +
   1.219  
   1.220  
   1.221  
   1.222 @@ -401,9 +441,6 @@
   1.223  This section discusses what shapes \masqmail\ could take and in which directions the development could go.
   1.224  
   1.225  
   1.226 -
   1.227 -
   1.228 -
   1.229  \subsubsection*{\masqmail\ in five years}
   1.230  
   1.231  Now, what could \masqmail\ be like in, say, five years?
   1.232 @@ -418,17 +455,9 @@
   1.233  
   1.234  ---
   1.235  
   1.236 -<< plans to get masqmail more popular again (if that is the goal) >>
   1.237  
   1.238 -<< More users >>
   1.239  
   1.240 -
   1.241 -
   1.242 -
   1.243 -
   1.244 -
   1.245 -
   1.246 -\section{Work to do}
   1.247 +\subsubsection*{Work to do}
   1.248  
   1.249  << short term goals --- long term goals >>
   1.250
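
Editorial sketch (not part of the changeset): the symmetric design described in the "Merging the parts" hunk above can be illustrated with a small in-memory model. This is only a sketch under stated assumptions: Python is used for brevity although masqmail is written in C, the deques stand in for on-disk spool queues, and every name (Message, Pipeline, tag_suspicious, the X-Spam-Flag header and its trigger phrase) is hypothetical and does not come from the masqmail code base.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Callable, Deque, Dict, List


@dataclass
class Message:
    """Hypothetical stand-in for a spooled mail."""
    sender: str
    recipient: str
    body: str
    protocol: str = "smtp"  # protocol to use for the outgoing transfer
    headers: List[str] = field(default_factory=list)


Scanner = Callable[[Message], None]    # may only add or rewrite headers
Transport = Callable[[Message], None]  # transfers mail to its destination


class Pipeline:
    """queue-in -> incoming queue -> scanning -> outgoing queue -> queue-out."""

    def __init__(self) -> None:
        self.incoming: Deque[Message] = deque()
        self.outgoing: Deque[Message] = deque()
        self.scanners: List[Scanner] = []
        self.transports: Dict[str, Transport] = {}  # one per protocol

    def queue_in(self, msg: Message) -> None:
        # Entry point for every receiver module, whatever its protocol.
        self.incoming.append(msg)

    def scan(self) -> None:
        # Single scanning facility: all mail passes through here.
        while self.incoming:
            msg = self.incoming.popleft()
            for scanner in self.scanners:
                scanner(msg)
            self.outgoing.append(msg)

    def queue_out(self) -> None:
        # Hand each message to the transport module matching its protocol.
        for _ in range(len(self.outgoing)):
            msg = self.outgoing.popleft()
            transport = self.transports.get(msg.protocol)
            if transport is None:
                self.outgoing.append(msg)  # no handler yet: keep the mail
            else:
                transport(msg)


def tag_suspicious(msg: Message) -> None:
    # Illustrative scanner: accepted mail is tagged with a header, never deleted.
    if "buy now" in msg.body.lower():
        msg.headers.append("X-Spam-Flag: YES")
```

A new protocol is added by registering another entry in `transports` (or by another receiver calling `queue_in`); the three core modules and the two queues stay unchanged, which mirrors the "arbitrary protocol handlers have to be addable afterwards" item in the hunk.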