comparison thesis/tex/4-MasqmailsFuture.tex @ 339:f9f925c5e2d1

added labels and few work at some places
author meillo@marmaro.de
date Mon, 26 Jan 2009 13:36:18 +0100
parents 99e368f07e9a
children f26d63dbb22b
338:5a4b3e22a684 339:f9f925c5e2d1
1 \chapter{\masqmail's present and future} 1 \chapter{\masqmail's present and future}
2 \label{chap:present-and-future}
2 3
3 This chapter identifies requirements for \masqmail\ which are compared against the current code to see what is already fulfilled and what is missing. Then the outstanding work is ordered by relevance and a list of tasks to do is created. The end of this chapter is the evaluation of the best development strategy to get the work done in order to achieve the requirements. 4 This chapter identifies requirements for \masqmail\ which are compared against the current code to see what is already fulfilled and what is missing. Then the outstanding work is ordered by relevance and a list of tasks to do is created. The end of this chapter is the evaluation of the best development strategy to get the work done in order to achieve the requirements.
4 5
5 6
6 \section{The goal} 7 \section{The goal}
36 %fixme: add ref 37 %fixme: add ref
37 The requirements are named ``\NAME{RF}'' for ``requirement, functional''. 38 The requirements are named ``\NAME{RF}'' for ``requirement, functional''.
38 39
39 40
40 \paragraph{\RF1: Incoming and outgoing channels} 41 \paragraph{\RF1: Incoming and outgoing channels}
42 \label{rf1}
41 \sendmail-compatible \mta{}s must support at least two incoming channels: mail submitted using the \sendmail\ command, and mail received on a \NAME{TCP} port. Thus it is common to split the incoming channels into local and remote. This is done by \qmail\ and \postfix. The same way is \person{Hafiz}'s view \cite{hafiz05}. 43 \sendmail-compatible \mta{}s must support at least two incoming channels: mail submitted using the \sendmail\ command, and mail received on a \NAME{TCP} port. Thus it is common to split the incoming channels into local and remote. This is done by \qmail\ and \postfix. The same way is \person{Hafiz}'s view \cite{hafiz05}.
42 44
43 \SMTP\ is the primary mail transport protocol today, but with the increasing need for new protocols (see section \ref{sec:what-will-be-important}) in mind, support for more than just \SMTP\ is good to have. New protocols will show up, maybe multiple protocols need to be supported then. This leads to multiple remote channels, one for each supported protocol as it was done in other \MTA{}s. Best would be interfaces to add further protocols as modules. 45 \SMTP\ is the primary mail transport protocol today, but with the increasing need for new protocols (see section \ref{sec:what-will-be-important}) in mind, support for more than just \SMTP\ is good to have. New protocols will show up, maybe multiple protocols need to be supported then. This leads to multiple remote channels, one for each supported protocol as it was done in other \MTA{}s. Best would be interfaces to add further protocols as modules.
44 46
45 47
64 66
65 67
66 68
67 69
68 \paragraph{\RF2: Mail queuing} 70 \paragraph{\RF2: Mail queuing}
71 \label{rf2}
69 Mail queuing removes the need to deliver instantly as a message is received. The queue provides fail-safe storage of mails until they are delivered. Mail queues are probably used in all \mta{}s, even in some simple forwarders. The mail queue is essential for \masqmail, as \masqmail\ is used for non-permanent online connections. This means mail must be queued until an online connection is available to send the message. This may be after a reboot. Hence the mail queue must provide persistence. 72 Mail queuing removes the need to deliver instantly as a message is received. The queue provides fail-safe storage of mails until they are delivered. Mail queues are probably used in all \mta{}s, even in some simple forwarders. The mail queue is essential for \masqmail, as \masqmail\ is used for non-permanent online connections. This means mail must be queued until an online connection is available to send the message. This may be after a reboot. Hence the mail queue must provide persistence.
70 73
71 The mail queue and the module(s) to manage it are the central part of the whole system. This demands robustness and reliability in particular, as a failure here can lead to losing mail. An \MTA\ takes over responsibility for mail by accepting it, hence losing mail messages must be avoided at all costs. This covers any kind of crash situation too. The worst acceptable outcome is that an already sent mail is sent again. 74 The mail queue and the module(s) to manage it are the central part of the whole system. This demands robustness and reliability in particular, as a failure here can lead to losing mail. An \MTA\ takes over responsibility for mail by accepting it, hence losing mail messages must be avoided at all costs. This covers any kind of crash situation too. The worst acceptable outcome is that an already sent mail is sent again.
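The persistence and crash-safety demands above can be illustrated with a minimal Python sketch. It is not \masqmail's actual spool code or spool format; the function names \texttt{enqueue} and \texttt{deliver\_and\_dequeue} and the one-file-per-message layout are assumptions made for illustration only. The sketch writes a message under a temporary name, forces it to disk, and renames it into place, and it removes the queued file only after delivery has succeeded, so that a crash can at worst cause a resend.

```python
import os
import tempfile


def enqueue(spool_dir, msg_id, data):
    """Store one message durably; return the path of the queued file."""
    # Write to a temporary file in the same directory first, so that a crash
    # never leaves a half-written file under the final name.
    fd, tmp_path = tempfile.mkstemp(dir=spool_dir, prefix="tmp-")
    try:
        with os.fdopen(fd, "wb") as spool_file:
            spool_file.write(data)
            spool_file.flush()
            os.fsync(spool_file.fileno())  # force the data to disk
        final_path = os.path.join(spool_dir, msg_id)
        os.replace(tmp_path, final_path)  # atomic rename on POSIX systems
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    return final_path


def deliver_and_dequeue(path, send):
    """Remove a message from the queue only after `send` has succeeded."""
    with open(path, "rb") as spool_file:
        data = spool_file.read()
    send(data)       # may raise; the file then stays in the queue
    os.unlink(path)  # a crash before this line causes a resend, not a loss
```

A production queue would additionally sync the containing directory and lock messages against concurrent delivery runs; both are left out here to keep the sketch short.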
72 75
73 76
74 77
75 78
76 \paragraph{\RF3: Header sanitizing} 79 \paragraph{\RF3: Header sanitizing}
80 \label{rf3}
77 Mail coming into the system often lacks important header lines. At least the required ones must be added by the \MTA. One example is the \texttt{Date:} header, another is the \texttt{Message-ID:} header, which is not required but recommended. Apart from adding missing headers, rewriting headers is important too. Changing the locally known domain part of email addresses to globally known ones is an example. \masqmail\ needs to be able to rewrite the domain part dependent on the route used to send the message, to prevent messages from being classified as spam. 81 Mail coming into the system often lacks important header lines. At least the required ones must be added by the \MTA. One example is the \texttt{Date:} header, another is the \texttt{Message-ID:} header, which is not required but recommended. Apart from adding missing headers, rewriting headers is important too. Changing the locally known domain part of email addresses to globally known ones is an example. \masqmail\ needs to be able to rewrite the domain part dependent on the route used to send the message, to prevent messages from being classified as spam.
78 82
79 Generating the envelope is a related job. The envelope specifies the actual recipient of the mail, no matter what the \texttt{To:}, \texttt{Cc:}, and \texttt{Bcc:} headers contain. Multiple recipients lead to multiple envelopes, all containing the same mail message. 83 Generating the envelope is a related job. The envelope specifies the actual recipient of the mail, no matter what the \texttt{To:}, \texttt{Cc:}, and \texttt{Bcc:} headers contain. Multiple recipients lead to multiple envelopes, all containing the same mail message.
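As a rough illustration of header sanitizing and envelope generation, the following Python sketch uses the standard \texttt{email} package. The helper names \texttt{sanitize\_headers} and \texttt{make\_envelopes} and the tuple layout of an envelope are hypothetical; \masqmail\ itself is written in C and may organise this differently.

```python
from email.message import EmailMessage
from email.utils import formatdate, make_msgid


def sanitize_headers(msg, domain):
    """Add the Date: and Message-ID: headers if the submitter omitted them."""
    if "Date" not in msg:
        msg["Date"] = formatdate(localtime=True)
    if "Message-ID" not in msg:
        # `domain` is the globally known domain to use in the message id.
        msg["Message-ID"] = make_msgid(domain=domain)
    return msg


def make_envelopes(sender, recipients, msg):
    """Create one (sender, recipient, message) envelope per recipient.

    The recipients are passed in explicitly: the envelope is independent of
    whatever the To:, Cc:, and Bcc: headers contain.
    """
    return [(sender, rcpt, msg) for rcpt in recipients]
```

Every envelope refers to the same message object, which mirrors the statement that all envelopes carry the same mail message.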
80 84
81 85
82 86
83 87
84 \paragraph{\RF4: Aliasing} 88 \paragraph{\RF4: Aliasing}
89 \label{rf4}
85 Email addresses can have aliases, thus they need to be expanded. Aliases can be of different kinds: another local user, a remote user, a list containing local and remote users, or a command. Most important are the aliases in the \path{aliases} file, usually located at \path{/etc/aliases}. Addresses expanding to lists of users lead to more envelopes. Aliases changing the recipient's domain part may require a different route to be used. 90 Email addresses can have aliases, thus they need to be expanded. Aliases can be of different kinds: another local user, a remote user, a list containing local and remote users, or a command. Most important are the aliases in the \path{aliases} file, usually located at \path{/etc/aliases}. Addresses expanding to lists of users lead to more envelopes. Aliases changing the recipient's domain part may require a different route to be used.
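Alias expansion can be sketched as a recursive lookup in a table read from an \path{aliases}-style file. The Python sketch below assumes the simple \texttt{name: target, target} line format and ignores quoting and continuation lines; \texttt{parse\_aliases} and \texttt{expand} are hypothetical names, not \masqmail\ functions.

```python
def parse_aliases(text):
    """Parse 'name: target1, target2' lines into a lookup table."""
    table = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, _, targets = line.partition(":")
        table[name.strip().lower()] = [
            t.strip() for t in targets.split(",") if t.strip()
        ]
    return table


def expand(address, table, path=frozenset()):
    """Expand an address to its final recipients, without duplicates."""
    key = address.lower()
    # Stop at addresses that are not aliases, and at aliases that are already
    # being expanded on this path (self-references and loops).
    if key in path or key not in table:
        return [address]
    path = path | {key}
    recipients = []
    for target in table[key]:
        for rcpt in expand(target, table, path):
            if rcpt not in recipients:
                recipients.append(rcpt)
    return recipients
```

Each address in the returned list would then get its own envelope; a result containing a remote address is the case in which a different route may become necessary.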
86 91
87 92
88 93
89 94
90 \paragraph{\RF5: Route management} 95 \paragraph{\RF5: Route management}
96 \label{rf5}
91 One key feature of \masqmail\ is its ability to send mail out over different routes. The online state defines the active route to be used. A specific route may not be suited for all messages, thus these messages are held back until a suitable route is active. For more information on this concept see section \ref{sec:masqmail-routes}. 97 One key feature of \masqmail\ is its ability to send mail out over different routes. The online state defines the active route to be used. A specific route may not be suited for all messages, thus these messages are held back until a suitable route is active. For more information on this concept see section \ref{sec:masqmail-routes}.
92 98
93 99
94 100
95 101
96 \paragraph{\RF6: Authentication} 102 \paragraph{\RF6: Authentication}
103 \label{rf6}
97 \label{requirement-authentication} 104 \label{requirement-authentication}
98 One thing to avoid is being an \name{open relay}. Open relays allow anyone to relay mail from everywhere to everywhere. This is a source of spam. The solution is restricting relay\footnote{Relaying is passing mail, that is not from and not for the own system, through it.} access. It may also be desirable to refuse all connections to the \MTA\ except ones from a specific set of hosts. 105 One thing to avoid is being an \name{open relay}. Open relays allow anyone to relay mail from everywhere to everywhere. This is a source of spam. The solution is restricting relay\footnote{Relaying is passing mail, that is not from and not for the own system, through it.} access. It may also be desirable to refuse all connections to the \MTA\ except ones from a specific set of hosts.
99 106
100 Several ways to restrict access are available. The simplest one is restriction by the \NAME{IP} address. No extra complexity is added this way but the \NAME{IP} addresses need to be static or within known ranges. This approach is often used to allow relaying for local nets. The access check can be done by the \MTA\ or by a guard (e.g.\ \NAME{TCP} \name{Wrappers}) in front of it. The main advantage here is the minimal setup and maintenance work needed. It is important to implement this kind of access restriction. 107 Several ways to restrict access are available. The simplest one is restriction by the \NAME{IP} address. No extra complexity is added this way but the \NAME{IP} addresses need to be static or within known ranges. This approach is often used to allow relaying for local nets. The access check can be done by the \MTA\ or by a guard (e.g.\ \NAME{TCP} \name{Wrappers}) in front of it. The main advantage here is the minimal setup and maintenance work needed. It is important to implement this kind of access restriction.
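Restriction by \NAME{IP} address amounts to testing the client address against a list of known ranges. The following Python sketch shows the idea with the standard \texttt{ipaddress} module; the networks in \texttt{RELAY\_NETS} and the function name \texttt{may\_relay} are example assumptions, not \masqmail\ configuration.

```python
import ipaddress

# Example ranges only: loopback and one private network.
RELAY_NETS = [
    ipaddress.ip_network("127.0.0.0/8"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def may_relay(client_ip, nets=RELAY_NETS):
    """Return True if the connecting client may relay through this host."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in nets)
```

With these example ranges, a client at 192.168.1.20 would be allowed to relay, while a client at 203.0.113.5 would only be allowed to deliver mail for local recipients.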
101 108
116 Static authentication is simpler and requires less administration work but it has limitations---dynamic authentication should be used if static authentication reaches a limit. At least one of the secret-based mechanisms should be supported. 123 Static authentication is simpler and requires less administration work but it has limitations---dynamic authentication should be used if static authentication reaches a limit. At least one of the secret-based mechanisms should be supported.
117 124
118 125
119 126
120 \paragraph{\RF7: Encryption} 127 \paragraph{\RF7: Encryption}
128 \label{rf7}
121 \label{requirement-encryption} 129 \label{requirement-encryption}
122 Electronic mail is vulnerable to sniffing attacks, because in generic \SMTP\ all data transfer is unencrypted. Not only the message's body, header, and envelope are unencrypted, but also authentication dialogs that transfer plain text passwords (e.g.\ \NAME{PLAIN} and \NAME{LOGIN}). Hence encryption is important throughout. 130 Electronic mail is vulnerable to sniffing attacks, because in generic \SMTP\ all data transfer is unencrypted. Not only the message's body, header, and envelope are unencrypted, but also authentication dialogs that transfer plain text passwords (e.g.\ \NAME{PLAIN} and \NAME{LOGIN}). Hence encryption is important throughout.
123 131
124 The common way to encrypt \SMTP\ dialogs is using \name{Transport Layer Security} (short: \TLS, the successor of \NAME{SSL}). \TLS\ encrypts the datagrams of the \name{transport layer}. This means it works below the application protocols and can be used with any of them \citeweb{wikipedia:tls}. 132 The common way to encrypt \SMTP\ dialogs is using \name{Transport Layer Security} (short: \TLS, the successor of \NAME{SSL}). \TLS\ encrypts the datagrams of the \name{transport layer}. This means it works below the application protocols and can be used with any of them \citeweb{wikipedia:tls}.
125 133
138 \NAME{STARTTLS}---first defined in \RFC2487, which \RFC3207 obsoletes---is the recommended way to secure \SMTP. The connection then goes over port 25 (or the submission port 587), but gets encrypted once the \NAME{STARTTLS} keyword is issued. Email depends on compatibility---only encryption methods that client and server support can be used. Hence it is best to follow the recommendations of the \RFC\ documents. This means \NAME{STARTTLS} encryption should be supported for incoming and for outgoing connections. 146 \NAME{STARTTLS}---first defined in \RFC2487, which \RFC3207 obsoletes---is the recommended way to secure \SMTP. The connection then goes over port 25 (or the submission port 587), but gets encrypted once the \NAME{STARTTLS} keyword is issued. Email depends on compatibility---only encryption methods that client and server support can be used. Hence it is best to follow the recommendations of the \RFC\ documents. This means \NAME{STARTTLS} encryption should be supported for incoming and for outgoing connections.
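For the outgoing side, the \NAME{STARTTLS} upgrade can be sketched with Python's standard \texttt{smtplib}. The function name \texttt{upgrade\_to\_tls} and its \texttt{require} parameter are assumptions for illustration; the sketch shows only the order of steps (greet, check for the extension, upgrade, greet again) and says nothing about how \masqmail\ would implement it in C.

```python
import smtplib
import ssl


def upgrade_to_tls(conn, require=True):
    """Switch an open SMTP session to TLS if the server offers STARTTLS."""
    conn.ehlo()  # learn which extensions the server supports
    if not conn.has_extn("starttls"):
        if require:
            raise smtplib.SMTPNotSupportedError(
                "server does not offer STARTTLS"
            )
        return False  # continue unencrypted only if explicitly allowed
    conn.starttls(context=ssl.create_default_context())
    conn.ehlo()  # capabilities must be queried again after the upgrade
    return True
```

A caller would open the session first, for instance with \texttt{smtplib.SMTP("mail.example.org", 587)} (a placeholder host), and send authentication data only after the function has returned \texttt{True}.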
139 147
140 148
141 149
142 \paragraph{\RF8: Spam handling} 150 \paragraph{\RF8: Spam handling}
151 \label{rf8}
143 Spam is a major threat nowadays, but it is a war that is hard to win. The goal is to provide state-of-the-art spam protection, but not more (see section \ref{sec:swot-analysis}). 152 Spam is a major threat nowadays, but it is a war that is hard to win. The goal is to provide state-of-the-art spam protection, but not more (see section \ref{sec:swot-analysis}).
144 153
145 By increasing the amount of mail messages, spam is a nuisance not just for end users but also for the infrastructure---the \mta{}s---which therefore need to protect themselves. 154 By increasing the amount of mail messages, spam is a nuisance not just for end users but also for the infrastructure---the \mta{}s---which therefore need to protect themselves.
146 155
147 Filtering spam can be done by either refusing spam during the \SMTP\ dialog or by checking for spam after the mail was accepted and queued. Both ways have advantages and disadvantages, so modern \MTA{}s use them in combination. 156 Filtering spam can be done by either refusing spam during the \SMTP\ dialog or by checking for spam after the mail was accepted and queued. Both ways have advantages and disadvantages, so modern \MTA{}s use them in combination.
155 164
156 165
157 166
158 167
159 \paragraph{\RF9: Malware handling} 168 \paragraph{\RF9: Malware handling}
169 \label{rf9}
160 Related to spam is malicious content (short: \name{malware}) like viruses, worms, and Trojan horses. In contrast to spam, they do not affect the \MTA\ itself, as they are in the mail's body. \MTA{}s searching for malware is comparable to real-world post offices opening letters to check if they contain something that could harm the recipient. This is not a mail transport job. But many people see the \MTA\ that is responsible for the recipient as being in a good position to do this work, so it is often done there. 170 Related to spam is malicious content (short: \name{malware}) like viruses, worms, and Trojan horses. In contrast to spam, they do not affect the \MTA\ itself, as they are in the mail's body. \MTA{}s searching for malware is comparable to real-world post offices opening letters to check if they contain something that could harm the recipient. This is not a mail transport job. But many people see the \MTA\ that is responsible for the recipient as being in a good position to do this work, so it is often done there.
161 171
162 In any case, malware checking should be performed by external programs that may be invoked by the \mta. But \NAME{MDA}s are better points to invoke content scanners. 172 In any case, malware checking should be performed by external programs that may be invoked by the \mta. But \NAME{MDA}s are better points to invoke content scanners.
163 173
164 A popular email filter framework is \name{amavis} which integrates various spam and malware scanners. The common setup includes a receiving \MTA\ which sends the mail to \name{amavis} using \SMTP; \name{amavis} processes the mail and then sends it to a second \MTA\ that does the outgoing transfer. Interfaces to such scanners are nice to have, though. (This setup with two \MTA\ instances is discussed in more detail in section \ref{sec:current-code-security}). 174 A popular email filter framework is \name{amavis} which integrates various spam and malware scanners. The common setup includes a receiving \MTA\ which sends the mail to \name{amavis} using \SMTP; \name{amavis} processes the mail and then sends it to a second \MTA\ that does the outgoing transfer. Interfaces to such scanners are nice to have, though. (This setup with two \MTA\ instances is discussed in more detail in section \ref{sec:current-code-security}).
165 175
166 176
167 177
168 \paragraph{\RF10: Archiving} 178 \paragraph{\RF10: Archiving}
169 Mail archiving and auditability become more important as email establishes itself as a technology for serious business communication. It is also a must for companies in many countries. 179 \label{rf10}
170 180 Mail archiving and auditability become more important as email establishes itself as a technology for serious business communication. It is also a must for companies in many countries. In the United States, the \name{Sarbanes-Oxley Act} \cite{sox} covers this topic. But a dedicated archiving solution is advisable if archiving is of high importance.
171 << \textbf{SOX} >> %fixme: cite SOX
172 181
173 The ability to archive verbatim copies of every mail coming into and every mail going out of the system, with relation between them, appears to be a goal to achieve. 182 The ability to archive verbatim copies of every mail coming into and every mail going out of the system, with relation between them, appears to be a goal to achieve.
174 183
175 \postfix, for example, has an \texttt{always\_bcc} feature, to send a copy of every outgoing mail to a definable recipient. At least this functionality should be provided, although a more complete approach, like the one \qmail\ provides, is preferable. \qmail\ is able to save copies of all sent and received messages and additionally complete \SMTP\ dialogs \cite[page~12]{sill02}. 184 \postfix, for example, has an \texttt{always\_bcc} feature, to send a copy of every outgoing mail to a definable recipient. At least this functionality should be provided, although a more complete approach, like the one \qmail\ provides, is preferable. \qmail\ is able to save copies of all sent and received messages and additionally complete \SMTP\ dialogs \cite[page~12]{sill02}.
176 185
246 \masqmail's current architecture is monolithic like \sendmail's and \exim's. But more than the other two, it is one block of interwoven code. \exim\ has highly structured code with many internal interfaces; a good example is the one for authentication ``modules''. %fixme: add ref 255 \masqmail's current architecture is monolithic like \sendmail's and \exim's. But more than the other two, it is one block of interwoven code. \exim\ has highly structured code with many internal interfaces; a good example is the one for authentication ``modules''. %fixme: add ref
247 \sendmail\ provides now, with its \name{milter} interface, standardized connection channels to external modules. 256 \sendmail\ provides now, with its \name{milter} interface, standardized connection channels to external modules.
248 \masqmail\ has none of them; it is what \sendmail\ was in the beginning: a single large block. 257 \masqmail\ has none of them; it is what \sendmail\ was in the beginning: a single large block.
249 258
250 Figure \ref{fig:masqmail-arch} is a call graph generated from \masqmail's source code, excluding logging functions. It gives an impression of how interwoven the internals are. No compartments exist. 259 Figure \ref{fig:masqmail-arch} is a call graph generated from \masqmail's source code, excluding logging functions. It gives an impression of how interwoven the internals are. No compartments exist.
251 %fixme: what is included, what not?
252 260
253 \begin{figure} 261 \begin{figure}
254 \begin{center} 262 \begin{center}
255 \vspace*{2ex} 263 \vspace*{2ex}
256 %\includegraphics[scale=0.75]{img/callgraph.eps} 264 %\includegraphics[scale=0.75]{img/callgraph.eps}
257 \includegraphics[scale=0.75]{img/masqmail-3-omitlog5.eps} 265 %\includegraphics[scale=0.75]{img/masqmail-3-omitlog5.eps}
266 \includegraphics[scale=0.75]{img/bb.eps}
258 \end{center} 267 \end{center}
259 \caption{Internal structure of \masqmail, shown by a call graph. (Logging functions are excluded.)} 268 \caption{Internal structure of \masqmail, shown by a call graph. (Logging functions are ignored; test and \NAME{POP3} code is excluded.)}
260 %fixme: what else is excluded
261 \label{fig:masqmail-arch} 269 \label{fig:masqmail-arch}
262 \end{figure} 270 \end{figure}
263 271
264 \sendmail\ improved its old architecture by adding the milter interface, to include further functionality by invoking external programs. \exim\ was designed, and is carefully maintained, with a modular-like code structure in mind. \qmail\ started from scratch with a ``security-first'' approach, \postfix\ improved on it, and \name{sendmail X}/\name{MeTA1} tries to adopt the best of \qmail\ and \postfix\ to completely replace the old \sendmail\ architecture. \person{Hafiz} describes this evolution of \mta\ architecture very well \cite{hafiz05}. 272 \sendmail\ improved its old architecture by adding the milter interface, to include further functionality by invoking external programs. \exim\ was designed, and is carefully maintained, with a modular-like code structure in mind. \qmail\ started from scratch with a ``security-first'' approach, \postfix\ improved on it, and \name{sendmail X}/\name{MeTA1} tries to adopt the best of \qmail\ and \postfix\ to completely replace the old \sendmail\ architecture. \person{Hafiz} describes this evolution of \mta\ architecture very well \cite{hafiz05}.
265 273
295 303
296 304
297 \paragraph{\RF1: In/out channels} 305 \paragraph{\RF1: In/out channels}
298 The incoming and outgoing channels that \masqmail\ already has (depicted in figure \ref{fig:masqmail-channels} on page \pageref{fig:masqmail-channels}) are the ones required for an \MTA\ at the moment. Support for other protocols does not seem to be necessary yet, although new protocols and mailing concepts are likely to appear (see section \ref{sec:email-trends}). Today, other protocols are not needed, so \masqmail\ is regarded as fulfilling \RF1. But as \masqmail\ has no support for adding further protocols, delaying the work to support them until they are widely used appears to be the best strategy anyway. 306 The incoming and outgoing channels that \masqmail\ already has (depicted in figure \ref{fig:masqmail-channels} on page \pageref{fig:masqmail-channels}) are the ones required for an \MTA\ at the moment. Support for other protocols does not seem to be necessary yet, although new protocols and mailing concepts are likely to appear (see section \ref{sec:email-trends}). Today, other protocols are not needed, so \masqmail\ is regarded as fulfilling \RF1. But as \masqmail\ has no support for adding further protocols, delaying the work to support them until they are widely used appears to be the best strategy anyway.
299 307
300 << smtp submission >> %fixme 308 %fixme: << smtp submission >> %fixme
301 309
302 \paragraph{\RF2: Queuing} 310 \paragraph{\RF2: Queuing}
303 One single mail queue is used in \masqmail; it satisfies all current requirements. 311 One single mail queue is used in \masqmail; it satisfies all current requirements.
304
305 << persistence: DB >> %fixme
306 312
307 \paragraph{\RF3: Header sanitizing} 313 \paragraph{\RF3: Header sanitizing}
308 The envelope and mail headers are generated when the mail is put into the queue. The requirements are fulfilled. 314 The envelope and mail headers are generated when the mail is put into the queue. The requirements are fulfilled.
309 315
310 \paragraph{\RF4: Aliasing} 316 \paragraph{\RF4: Aliasing}
356 The maintainability of \masqmail\ is equivalent to other software of similar kind. Missing modularity, and therefore more complexity, makes the maintainer's work harder. Conditional compilation might be good for security, but \name{ifdef}s scattered throughout the source code are a pain for maintainability. In summary, \masqmail's maintainability is bearable, as in average Free Software projects. 362 The maintainability of \masqmail\ is equivalent to other software of similar kind. Missing modularity, and therefore more complexity, makes the maintainer's work harder. Conditional compilation might be good for security, but \name{ifdef}s scattered throughout the source code are a pain for maintainability. In summary, \masqmail's maintainability is bearable, as in average Free Software projects.
357 363
358 364
359 365
360 \paragraph{\RG6: Testability} 366 \paragraph{\RG6: Testability}
361 The testability suffers from missing modularity. Testing program parts is hard to do. Nevertheless, it is done by compiling parts of the source to special test programs. %fixme: what are the names? what do they test? 367 The testability suffers from missing modularity. Testing program parts is hard to do. Nevertheless, it is done by compiling parts of the source to two special test programs: One tests reading input from a socket, the other tests constructing messages and sending them directly. Neither is designed for automated testing of source parts; they are rather meant to help the programmer during development.
362 368
363 This kind of testing is only clean-room testing, so .... %fixme 369 Two additional scripts exist to send a set of mails to different kinds of recipients. They can be used for automated testing, but both test only the complete system's function.
364 % XXX 370
371 %fixme: think about clean-room testing
365 372
366 \paragraph{\RG7: Performance} 373 \paragraph{\RG7: Performance}
367 The performance---efficiency---of \masqmail\ is good enough for its target field of operation, where this is a minor goal. 374 The performance---efficiency---of \masqmail\ is good enough for its target field of operation, where this is a minor goal.
368 375
369 \paragraph{\RG8: Availability} 376 \paragraph{\RG8: Availability}