docs/diploma

annotate thesis/tex/4-MasqmailsFuture.tex @ 163:5681a18270b5

new content about architecture; some restructuring
author meillo@marmaro.de
date Thu, 18 Dec 2008 13:39:23 +0100
parents 18b7b517e2dd
children a7fd6d974d3c
rev   line source
meillo@109 1 \chapter{\masqmail's present and future}
meillo@93 2
meillo@137 3 \section{Existing code base}
meillo@142 4 The version regarded here is 0.2.21 of \masqmail. It is the last version released by Oliver \person{Kurth}, and the basis for my thesis.
meillo@142 5
meillo@93 6
meillo@137 7 \subsubsection*{Features}
meillo@93 8
meillo@142 9 \masqmail\ accepts mail on the command line and via \SMTP. Mail queueing and alias expansion are supported. \masqmail\ is able to deliver mail to local mailboxes (in \name{mbox} or \name{maildir} format) or pass it to a \name{mail delivery agent} (like \name{procmail}). Mail destined for remote locations is sent using \SMTP\ or can be piped to commands, which act as gateways to \NAME{UUCP} or \NAME{FAX} for example.
meillo@93 10
meillo@142 11 Outgoing \SMTP\ connections feature \SMTP-\NAME{AUTH} and \SMTP-after-\NAME{POP} authentication, but incoming connections do not. Using wrappers for outgoing connections is supported. This offers two-way communication through a wrapper application like \name{openssl}.
meillo@137 12 %todo: what about SSL/TLS encryption?
meillo@93 13
meillo@142 14 \masqmail\ focuses on non-permanent online connections, thus a concept of online routes is used. One may configure any number of routes to send mail. Each route can have criteria, like matching \texttt{From:} or \texttt{To:} headers, to determine if mail is allowed to be sent using it. Mail to destinations outside the local net gets queued until \masqmail\ is informed about the existence of an online connection.
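
The route concept can be illustrated with a small sketch. It is not taken from the \masqmail\ source: the structure \texttt{struct route}, its fields, and the function \texttt{select\_route()} are hypothetical names, and the use of shell-style patterns via the POSIX function \texttt{fnmatch()} is an assumption made for brevity. The sketch only shows how criteria on sender and recipient could decide which route, if any, may carry a message.

```c
#include <fnmatch.h>
#include <stddef.h>

/* Hypothetical route description; masqmail's real structures differ. */
struct route {
    const char *name;
    const char *allowed_from; /* shell-style pattern, NULL means "any sender" */
    const char *allowed_to;   /* shell-style pattern, NULL means "any recipient" */
};

/* A route allows a message if every criterion that is set matches. */
static int route_allows(const struct route *r, const char *from, const char *to)
{
    if (r->allowed_from != NULL && fnmatch(r->allowed_from, from, 0) != 0)
        return 0;
    if (r->allowed_to != NULL && fnmatch(r->allowed_to, to, 0) != 0)
        return 0;
    return 1;
}

/* Return the first route that allows the message, or NULL if none does.
 * In the NULL case the message would have to stay in the queue. */
const struct route *select_route(const struct route *routes, size_t count,
                                 const char *from, const char *to)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (route_allows(&routes[i], from, to))
            return &routes[i];
    }
    return NULL;
}
```

A message for which \texttt{select\_route()} returns \texttt{NULL} would remain queued until a suitable route becomes available.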
meillo@142 15
meillo@137 16 The \masqmail\ executable can be called under various names for sendmail-compatibility reasons. This is organized by symbolic links with different names pointing to the \masqmail\ executable. The \sendmail\ names are \path{/usr/lib/sendmail} and \path{/usr/sbin/sendmail} because many programs expect the \mta\ to be located there. Furthermore, \sendmail\ supports being called under a different name instead of being supplied with command line arguments. The best known of these shortcuts is \path{mailq}, which is equivalent to calling it with the argument \verb+-bp+. \masqmail\ recognizes the names \path{mailq}, \path{smtpd}, \path{mailrm}, \path{runq}, \path{rmail}, and \path{in.smtpd}. The first two are inspired by \sendmail. The name \path{newaliases} is not implemented because \masqmail\ does not generate binary representations of the alias file.\footnote{A shell script named \path{newaliases}, which invokes \texttt{masqmail -bi}, can provide the command to satisfy other software needing it.} For full sendmail-compatibility, \path{hoststat} and \path{purgestat} are missing as well.
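
How a program can select its behavior from the name it was called under is sketched below. This is an illustration, not \masqmail's actual code: the enumeration \texttt{enum mta\_mode}, its members, and the function \texttt{mode\_from\_name()} are invented for this example, and the meaning attached to each name is an assumption.

```c
#include <string.h>

/* Hypothetical operation modes; names and meanings are illustrative only. */
enum mta_mode {
    MODE_DEFAULT,     /* behave as told by the command line arguments */
    MODE_LIST_QUEUE,  /* called as mailq */
    MODE_SMTP_DAEMON, /* called as smtpd */
    MODE_SMTP_STDIN,  /* called as in.smtpd */
    MODE_REMOVE_MAIL, /* called as mailrm */
    MODE_RUN_QUEUE,   /* called as runq */
    MODE_RMAIL        /* called as rmail */
};

/* Return the last path component of the invocation name. */
static const char *base_name(const char *path)
{
    const char *slash = strrchr(path, '/');

    return slash != NULL ? slash + 1 : path;
}

/* Map argv[0] to an operation mode; unknown names give MODE_DEFAULT. */
enum mta_mode mode_from_name(const char *argv0)
{
    static const struct {
        const char *name;
        enum mta_mode mode;
    } table[] = {
        { "mailq", MODE_LIST_QUEUE },
        { "smtpd", MODE_SMTP_DAEMON },
        { "in.smtpd", MODE_SMTP_STDIN },
        { "mailrm", MODE_REMOVE_MAIL },
        { "runq", MODE_RUN_QUEUE },
        { "rmail", MODE_RMAIL }
    };
    const char *name = base_name(argv0);
    size_t i;

    for (i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (strcmp(name, table[i].name) == 0)
            return table[i].mode;
    }
    return MODE_DEFAULT;
}
```

A \texttt{main()} function would call \texttt{mode\_from\_name(argv[0])} first and only then look at the remaining arguments. Since symbolic links leave \texttt{argv[0]} as the name of the link, one executable can serve all names.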
meillo@109 17 %masqmail: mailq, mailrm, runq, rmail, smtpd/in.smtpd
meillo@109 18 %sendmail: hoststat, mailq, newaliases, purgestat, smtpd
meillo@109 19
meillo@137 20 In addition to the \mta\ job, \masqmail\ also offers mail retrieval services by acting as a \NAME{POP3} client. It can fetch mail from different remote locations, depending on the active online route.
meillo@109 21
meillo@137 22
meillo@137 23
meillo@137 24 \subsubsection*{The code}
meillo@137 25
meillo@137 26 \masqmail\ is written in the C programming language. The program, as of version 0.2.21, consists of 34 source code files and eight header files, containing about 9,000 lines of code\footnote{Measured with \name{sloccount} by David A.\ Wheeler.}. Additionally, it includes a \name{base64} implementation (about 300 lines) and \name{md5} code (about 150 lines). For systems that do not provide \name{libident}, this library is distributed as well (circa 600 lines); however, an available shared library takes precedence when linking.
meillo@137 27
meillo@137 28 The only mandatory dependency is \name{glib}, a cross-platform software utility library which originated in the \NAME{GTK+} project. It provides safer replacements for many standard library functions. It also offers handy data containers, easy-to-use implementations of data structures, and much more.
meillo@137 29
meillo@109 30
meillo@109 31 With \masqmail\ comes the small tool \path{mservdetect}; it helps setting up a configuration that uses the \name{mserver} system to detect the online state. Two other binaries get compiled for testing purposes: \path{readtest} and \path{smtpsend}. All three programs use \masqmail\ source code; they only add a file with a \verb+main()+ function each.
meillo@109 32
meillo@93 33
meillo@137 34 \masqmail\ does not provide an interface to plug in modules with additional functionality. There exists no add-on or module system. The code is only separated by function into the various source files. Some functional parts can be included or excluded by defining symbols. Adding maildir support at compile time means giving the option \verb+--enable-maildir+ to the \path{configure} call. This prevents the corresponding code from being removed by the preprocessor. Unfortunately the \verb+#ifdef+s are scattered throughout the source, leading to a rough code base.
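
The following sketch shows the general technique of compile-time feature selection. It is not an excerpt from \masqmail: the symbol \texttt{ENABLE\_MAILDIR} is assumed to be what \verb+--enable-maildir+ defines, and the function \texttt{format\_supported()} is hypothetical. Without the symbol, the preprocessor removes the guarded lines before the compiler sees them.

```c
#include <stdbool.h>
#include <string.h>

/* Report whether this build can deliver to the given mailbox format.
 * ENABLE_MAILDIR is assumed to be defined by the configure script
 * when --enable-maildir was given; the symbol name is an assumption. */
bool format_supported(const char *format)
{
    if (strcmp(format, "mbox") == 0)
        return true;
#ifdef ENABLE_MAILDIR
    /* This branch only exists in builds with maildir support. */
    if (strcmp(format, "maildir") == 0)
        return true;
#endif
    return false;
}
```

Keeping such guards in a few well-known places, instead of spreading them over many files, is one way to limit the readability problem described above.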
meillo@93 35
meillo@132 36
meillo@132 37
meillo@132 38
meillo@132 39
meillo@146 40
meillo@146 41 \section{Requirements}
meillo@146 42
meillo@146 43 The following is a list of current and future requirements that \masqmail\ has to meet to be ready for the future.
meillo@146 44
meillo@146 45
meillo@146 46 \subsubsection*{Large message handling}
meillo@155 47 Trends in the market for electronic communication go towards consolidated communication, hence email will be used more to transfer voice and video messages. This leads to larger messages. The store-and-forward transport of email is not well suited for large data. Thus new protocols, like \NAME{QMTP} (described in section %\ref{FIXME}
meillo@155 48 ), may become popular.
meillo@146 49
meillo@146 50
meillo@146 51 \subsubsection*{Resource-friendly software}
meillo@149 52 The merging of communication hardware and the move of email services from providers to homes demand smaller and more resource-friendly software. The amount of mail handled by each \mta\ will be lower, even if much more mail is sent overall. More important will be the energy consumption and heat emission. These topics increased in relevance during the past years and they are expected to become more central. \masqmail\ is not a program to be used on large servers, but to be used on small devices. Thus focusing on energy and heat, not on performance, is the direction to go.
meillo@146 53
meillo@146 54
meillo@146 55 \subsubsection*{New mail transfer protocols}
meillo@149 56 Large messages demand more efficient transport through the net. A final solution to the spam problem is needed as well. New mail transport protocols may be the only good solutions for both problems. They can also improve reliability, authentication, and verification. \masqmail\ should be able to support new protocols as they appear and are used.
meillo@146 57
meillo@146 58
meillo@149 59 \subsubsection*{Spam handling}
meillo@149 60 Spam is a major threat. According to the \NAME{SWOT} analysis, the goal is to reduce it to a bearable level. Spam fighting is a war in which the good guys tend to lose. Putting too much effort into it will result in little gain. Real success will only be possible with new, better protocols and by abandoning the weak legacy technologies. Hence \masqmail\ should be able to provide state-of-the-art spam protection, but not more.
meillo@146 61
meillo@146 62
meillo@161 63 \subsubsection*{Security}
meillo@161 64 \MTA{}s are critical points for computer security, as they are accessible from external networks. They must be secured with high effort. Properties like a high privilege level, work load influenced from outside, work on unsafe data, and the demand for reliability increase the security needed. Insecure and unreliable \mta{}s are of no value. \masqmail\ needs to be secure enough for its target field of operation.
meillo@161 65
meillo@161 66
meillo@146 67 \subsubsection*{Easy configuration}
meillo@149 68 Having \mta{}s on many home servers and clients requires easy and standardized configuration. The common setups should be configurable with single actions by the user. Complex configuration should be possible, but the focus must be on the most common form of configuration: choosing one of several standard setups.
meillo@146 69
meillo@146 70
meillo@146 71
meillo@146 72
meillo@146 73
meillo@146 74
meillo@161 75 \section{Discussion on architecture}
meillo@146 76
meillo@163 77 A program's architecture is probably the most influential design decision, and has the greatest impact on the program's future capabilities. %fixme: search quote ... check if good
meillo@132 78
meillo@161 79 \masqmail's current architecture is monolithic like \sendmail's and \exim's. But more than the other two, it is one block of interwoven code. \sendmail\ now provides, with its \name{milter} interface, standardized connection channels to external modules. \exim\ has highly structured code with many internal interfaces, like the one for supported authentication ``modules''. \masqmail\ has none of them; it is what \sendmail\ was in the beginning: a single large block.
meillo@161 80
meillo@161 81 Figure \ref{fig:masqmail-arch} is an attempt to depict \masqmail's internal structure.
meillo@161 82
meillo@161 83 \begin{figure}
meillo@161 84 \begin{center}
meillo@161 85 \input{input/masqmail-arch.tex}
meillo@161 86 \end{center}
meillo@161 87 \caption{Internal architecture of \masqmail}
meillo@161 88 \label{fig:masqmail-arch}
meillo@161 89 \end{figure}
meillo@161 90
meillo@163 91 \sendmail\ improved its old architecture, for example by adding the milter interface. \exim\ was designed and is carefully maintained with a modular-like code structure in mind. \qmail\ started from scratch with a ``security-first'' approach, \postfix\ improved on it, and \name{sendmail X}/\name{MeTA1} tries to adopt the best of \qmail\ and \postfix, to completely replace the old \sendmail\ architecture. \person{Hafiz} \cite{hafiz05} describes this evolution of \mta\ architecture very well.
meillo@161 92
meillo@163 93 Every one of the popular \MTA{}s is more modular, or became more modular over time, than \masqmail\ is. Modern requirements like spam protection and future requirements like the use of new mail transport protocols demand modular designs for keeping the software simple. Simplicity is a key property for security.
meillo@161 94
meillo@163 95 \person{Hafiz} agrees:
meillo@163 96 \begin{quote}
meillo@163 97 The goal of making software secure can be better achieved by making the design simple and easier to understand and verify. \cite[page 64]{hafiz05}
meillo@163 98 \end{quote}
meillo@163 99 He attributes the security of \qmail\ to its \name{compartmentalization}, which goes hand in hand with modularity:
meillo@163 100 \begin{quote}
meillo@163 101 A perfect example is the contrast between the feature envy early \sendmail\ architecture implemented as one process and the simple, modular architecture of \qmail. The security of \qmail\ comes from its compartmentalized simple processes that perform one task only and are therefor testable for security. \cite[page 64]{hafiz05}
meillo@163 102 \end{quote}
meillo@161 103
meillo@163 104 Modularity is needed for supporting modern \MTA\ requirements, providing a clear interface to add further functionality without increasing the overall complexity much. Modularity is also an enabler for security. Security comes from good design, as \person{Graff} and \person{van Wyk} explain:
meillo@163 105 \begin{quote}
meillo@163 106 Good design is the sword and shield of the security-conscious developer. Sound design defends your application from subversion or misuse, protecting your network and the information on it from internal and external attacks alike. It also provides a safe foundation for future extensions and maintenance of the software.
meillo@163 107 %
meillo@163 108 %Bad design makes life easier for attackers and harder for the good guys, especially if it contributes to a false sends of security while obscuring pertinent failings.
meillo@163 109 \cite[page 55]{graff03}
meillo@163 110 \end{quote}
meillo@161 111
meillo@163 112 \person{Hafiz} adds: ``The major idea is that security cannot be retrofitted into an architecture.''\cite[page 64]{hafiz05}
meillo@161 113
meillo@163 114 All this leads to one logical step: a rewrite of \masqmail\ using a modern, modular architecture, to get a modern \MTA\ that satisfies today's needs.
meillo@161 115
meillo@161 116
meillo@161 117
meillo@161 118
meillo@163 119 \subsection{Modules needed}
meillo@161 120
meillo@163 121 This section tries to identify the modules needed for a modern \MTA. They later become the pieces from which the new architecture is built.
meillo@163 122
meillo@163 123
meillo@163 124 \subsubsection*{The simplest MTA}
meillo@163 125 This view of the problem is taken from \person{Hafiz} \cite[pages 3-5]{hafiz05}.
meillo@163 126
meillo@163 127 The basic job of an \mta\ is to transport mail from a sender to a recipient. The simplest \MTA\ therefore needs at least a mail receiving facility and a mail sending facility. This basic \MTA, which follows the definition of an \MTA, is much too abstract. Hence a next step is needed to add some important features; the result is an operational \MTA.
meillo@163 128
meillo@163 129
meillo@163 130
meillo@163 131 \subsubsection*{Mail queue}
meillo@163 132
meillo@163 133 \person{Hafiz} adds a mail queue to make it possible to not deliver at once.
meillo@163 134
meillo@163 135 Mail queues are probably used in all \mta{}s, excluding the simple forwarders. A mail queue is an essential requirement for \masqmail, as it is to be used for non-permanent online connections.
meillo@163 136
meillo@163 137
meillo@163 138 \subsubsection*{Incoming channels}
meillo@163 139
meillo@163 140 The second addition \person{Hafiz} made is the split of incoming and outgoing channels into local and remote. The question is whether this is necessary. It is the way it was done for a long time, but is this extra complexity needed?
meillo@163 141
meillo@163 142 The common situation is incoming mail on port 25 using \SMTP\ and via the \texttt{sendmail} command. Outgoing mail is either sent using \SMTP, piped into local commands (for example \texttt{uucp}), or delivered locally by appending to a mailbox.
meillo@163 143
meillo@163 144 The \MTA's architecture would be simpler if some of these channels could be merged. The reason is that if various modules do similar jobs, common parts might need to be duplicated. On the other hand, it is better to have more independent modules if each one is then simpler.
meillo@163 145
meillo@163 146 \qmail\ uses \name{qmail-inject} (local message in) and \name{qmail-smtpd} (remote message in), which both hand messages over to \name{qmail-queue}, which puts them into the mail queue. \postfix's approach is similar. \name{sendmail X} %fixme: what about meta1 here?
meillo@163 147 uses only \NAME{SMTPS}, which is for receiving mail from remote hosts, to communicate with the queue manager \NAME{QMGR}. Local mail goes through \NAME{SMTPS} as well.
meillo@163 148
meillo@163 149 The \name{sendmail X} approach seems to be the simpler one, but it relies heavily on \SMTP\ being the main mail transfer protocol. New modules may be added to \qmail\ and \postfix\ to support other ways of receiving messages, without changing any other part of the system.
meillo@163 150
meillo@163 151
meillo@163 152 \subsubsection*{Outgoing channels}
meillo@163 153
meillo@163 154 Outgoing channels are similar for \qmail, \postfix, and \name{sendmail X}: All of them have a module to send mail using \SMTP, and one for writing into a local mailbox. Local mail delivery is a job that requires root privileges to be able to switch to any user in order to write to that user's mailbox. Modular \MTA{}s do not need \name{setuid root}, but the local delivery process (or its parent) needs to run as root.
meillo@163 155
meillo@163 156 As mail delivery to local users is \emph{not} included in the basic job of \MTA{}s, why should they care about it? In order to keep the system simple and to have programs do one job well, the local delivery job should be handed over to \NAME{MDA}s. \name{Mail delivery agents} are the tools that are specialized for local delivery. They know about the various mailbox formats and are aware of the problems of concurrent write access and the like. Hence handing the message and the responsibility for it over to a mail delivery agent, like \name{procmail} or \name{maildrop}, seems to be the right way to go.
meillo@163 157
meillo@163 158 This means that an outgoing channel which pipes mail into local commands needs to be implemented.
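
A minimal sketch of such a pipe channel follows. It assumes a POSIX system and uses \texttt{popen()} for brevity; the function name \texttt{pipe\_to\_command()} is hypothetical. A real \MTA\ would rather use \texttt{fork()} and \texttt{exec()}, to avoid the shell and to be able to drop privileges before starting the command. Signal handling, for instance for a command that exits without reading its input, is left out here.

```c
#define _POSIX_C_SOURCE 200809L

#include <stdio.h>
#include <sys/wait.h>

/* Feed a complete message to the standard input of a local command,
 * for example a mail delivery agent. Returns the exit status of the
 * command, or -1 if it could not be run or did not exit normally.
 * The function name is made up for this sketch. */
int pipe_to_command(const char *command, const char *message)
{
    FILE *mda = popen(command, "w");
    int status;

    if (mda == NULL)
        return -1;
    if (fputs(message, mda) == EOF) {
        pclose(mda);
        return -1;
    }
    status = pclose(mda); /* waits for the command to finish */
    if (status == -1 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

The exit status matters: only if the command reports success may the \MTA\ consider the message delivered and remove it from the queue.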
meillo@163 159
meillo@163 160
meillo@163 161 \subsubsection*{Mail queue (again)}
meillo@163 162
meillo@163 163
meillo@163 164
meillo@163 165
meillo@163 166 \subsubsection*{Authentication}
meillo@163 167
meillo@163 168 easiest: restricting by static IP addresses (Access control via hosts.allow/hosts.deny)
meillo@163 169 if dynamic remote hosts need access: some auth is needed
meillo@163 170 - SASL
meillo@163 171 - POP/IMAP: pop-before-smtp, DRAC, WHOSON
meillo@163 172 - TLS (certificates)
meillo@163 173
meillo@163 174 ``None of these add-ons is an ideal solution. They require additional code compiled into your existing daemons that may then require special write access to system files. They also require additional work for busy system administrators. If you cannot use any of the nonauthenticating alternatives mentioned earlier, or your business requirements demand that all of your users' mail pass through your system no matter where they are on the Internet, SASL is probably the solution that offers the most reliable and scalable method to authenticate users.'' (Dent: Postfix, page 44, ch04)
meillo@163 175
meillo@163 176
meillo@163 177 \subsubsection*{Encryption}
meillo@163 178
meillo@163 179
meillo@163 180 \subsubsection*{Spam prevention}
meillo@163 181
meillo@163 182
meillo@163 183 where to filter what
meillo@163 184
meillo@163 185
meillo@163 186 postfix: after-queue-content-filter (smtp communication)
meillo@163 187 exim: content-scan-feature
meillo@163 188 sendmail: milter (tcp or unix sockets)
meillo@163 189
meillo@163 190 checks while smtp dialog (pre-queue): in MTA implemented (need to be fast)
meillo@163 191 checks when mail is accepted and queued: external (amavis, spamassassin)
meillo@163 192
meillo@163 193
meillo@163 194 AMaViS (amavisd-new): email filter framework to integrate spam and virus scanner
meillo@163 195 internet -->25 MTA -->10024 amavis -->10025 MTA --> recipient
meillo@163 196                 |                            |
meillo@163 197                 +----------------------------+
meillo@163 198 mail scanner:
meillo@163 199 incoming queue --> mail scanner --> outgoing queue
meillo@163 200
meillo@163 201 mimedefang: uses milter interface with sendmail
meillo@163 202
meillo@163 203
meillo@163 204 \subsubsection*{Virus checking}
meillo@163 205
meillo@163 206 The same applies to malicious content (\name{malware}) like viruses, worms, and trojan horses. They are related to spam, but affect the \MTA\ less, as they are in the mail body.
meillo@163 207
meillo@163 208 message body <-> envelope, header
meillo@163 209
meillo@163 210
meillo@163 211 anti-virus: clamav
meillo@163 212
meillo@163 213
meillo@163 214
meillo@163 215
meillo@163 216
meillo@163 217 \subsubsection*{Archiving}
meillo@163 218
meillo@163 219
meillo@163 220
meillo@163 221
meillo@163 222
meillo@163 223 \section{A new architecture}
meillo@161 224
meillo@161 225
meillo@161 226 (ssl)
meillo@161 227 -> msg-in (local or remote protocol handlers)
meillo@161 228 -> spam-filter (and more)
meillo@161 229 -> queue
meillo@161 230 -> msg-out (local-delivery by MDA, or remote-protocol-handlers)
meillo@161 231 (ssl)
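
The chain of stages listed above can be sketched as a sequence of functions that each either pass the message on or stop processing. This is only an illustration of the idea: the types \texttt{struct message} and \texttt{stage\_fn}, all function names, and the trivial checks inside the stages are assumptions. In a compartmentalized design like \qmail's, the stages would be separate processes rather than function calls within one process.

```c
#include <stddef.h>

/* Hypothetical in-memory message; a real MTA works on spool files. */
struct message {
    const char *sender;
    const char *recipient;
    const char *body;
    int queued; /* set to 1 once the queue stage has accepted it */
};

/* A stage returns 0 to pass the message on, non-zero to stop. */
typedef int (*stage_fn)(struct message *msg);

/* msg-in: accept only messages with a complete envelope. */
static int stage_msg_in(struct message *msg)
{
    return (msg->sender != NULL && msg->recipient != NULL) ? 0 : -1;
}

/* spam-filter: placeholder check that merely rejects empty bodies. */
static int stage_filter(struct message *msg)
{
    return (msg->body != NULL && msg->body[0] != '\0') ? 0 : -1;
}

/* queue: mark the message as stored; msg-out would pick it up later. */
static int stage_queue(struct message *msg)
{
    msg->queued = 1;
    return 0;
}

/* Run the stages in order. Returns 0 if all stages passed, otherwise
 * the 1-based number of the stage that stopped the message. */
int run_stages(const stage_fn *stages, size_t count, struct message *msg)
{
    size_t i;

    for (i = 0; i < count; i++) {
        if (stages[i](msg) != 0)
            return (int)i + 1;
    }
    return 0;
}
```

Because each stage only sees the message and the common return convention, a further stage, for example virus checking, could be inserted into the array without touching the others.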
meillo@161 232
meillo@161 233
meillo@161 234
meillo@161 235
meillo@161 236
meillo@161 237 http://fanf.livejournal.com/50917.html %how not to design an mta - the sendmail command
meillo@161 238 http://fanf.livejournal.com/51349.html %how not to design an mta - partitioning for security
meillo@161 239 http://fanf.livejournal.com/61132.html %how not to design an mta - local delivery
meillo@161 240 http://fanf.livejournal.com/64941.html %how not to design an mta - spool file format
meillo@161 241 http://fanf.livejournal.com/65203.html %how not to design an mta - spool file logistics
meillo@161 242 http://fanf.livejournal.com/65911.html %how not to design an mta - more about log-structured MTA queues
meillo@161 243 http://fanf.livejournal.com/67297.html %how not to design an mta - more log-structured MTA queues
meillo@161 244 http://fanf.livejournal.com/70432.html %how not to design an mta - address verification
meillo@161 245 http://fanf.livejournal.com/72258.html %how not to design an mta - content scanning
meillo@161 246
meillo@161 247
meillo@161 248
meillo@132 249
meillo@132 250
meillo@137 251
meillo@137 252
meillo@149 253
meillo@149 254
meillo@149 255
meillo@149 256
meillo@149 257
meillo@149 258
meillo@149 259
meillo@149 260
meillo@93 261
meillo@93 262
meillo@99 263
meillo@93 264
meillo@93 265
meillo@161 266 \section{Directions to go}
meillo@161 267
meillo@161 268 This section discusses which shapes \masqmail\ could take, that is, which directions the development could go in.
meillo@161 269
meillo@93 270
meillo@146 271
meillo@146 272
meillo@146 273
meillo@146 274 \subsubsection*{\masqmail\ in five years}
meillo@146 275
meillo@146 276 What could \masqmail\ look like in, say, five years?
meillo@146 277
meillo@163 278 ---
meillo@163 279
meillo@163 280 A design from scratch?
meillo@163 281 << what would be needed (effort) >>
meillo@163 282 But how does the effort of such a complete rewrite compare to what is gained afterwards?
meillo@163 283
meillo@163 284 << would one create it at all? >>
meillo@163 285
meillo@163 286 ---
meillo@163 287
meillo@146 288 << plans to get masqmail more popular again (if that is the goal) >>
meillo@146 289
meillo@146 290 << More users >>
meillo@146 291
meillo@146 292
meillo@146 293
meillo@146 294
meillo@163 295
meillo@163 296
meillo@163 297
meillo@93 298 \section{Work to do}
meillo@93 299
meillo@146 300 << short term goals --- long term goals >>
meillo@146 301
meillo@163 302 Do it like sendmail: First do the most needed work on the old design to keep it usable. Then design a new version from scratch, for the future.
meillo@163 303
meillo@140 304 << which parts to take out and do within the thesis >>
meillo@93 305