docs/diploma

annotate thesis/tex/3-MailTransferAgents.tex @ 312:a62fe460b8de

work in MTA comparison
author meillo@marmaro.de
date Tue, 20 Jan 2009 21:55:22 +0100
parents 273f2d174315
children a3fba017ef01
rev   line source
meillo@89 1 \chapter{Mail transfer agents}
meillo@254 2 \label{chap:mail-transfer-agents}
meillo@89 3
meillo@217 4 After the market for electronic mail was analyzed and upcoming trends were identified in the last chapter, this chapter takes a look at \mta{}s, the intelligent nodes and thus the most important parts of the email infrastructure. The \MTA{}s will be grouped by similarities first. Then the four most popular \freesw\ \mta{}s will be presented in a short overview with the most important facts. At the end of this chapter these programs will be compared.
meillo@89 5
meillo@117 6
meillo@89 7
meillo@89 8
meillo@120 9 \section{Types of MTAs}
meillo@217 10 ``Mail transfer agent'' is a term covering a variety of programs. One thing is common to them: they transfer email from senders to recipients.
meillo@89 11
meillo@248 12 This is how \person{Bryan Costales} defines a \mta:
meillo@117 13 \begin{quote}
meillo@217 14 A mail transfer agent (\MTA) is a highly specialized program that delivers mail and transports it between machines, like the post office.
meillo@218 15 \hfill\cite{costales97}
meillo@117 16 \end{quote}
meillo@217 17 \name{The Free Dictionary} is a bit more concrete on the term:
meillo@117 18 \begin{quote}
meillo@217 19 Message Transfer Agent - (\MTA, Mail Transfer Agent): Any program responsible for delivering e-mail messages. Upon receiving a message from a Mail User Agent or another \MTA, [...] it [...] delivers it to any local addressees and/or forwards it to other remote \MTA{}s (routing) for delivery to remote recipients.
meillo@218 20 \hfill\citeweb{website:thefreedictionary}
meillo@117 21 \end{quote}
meillo@89 22
meillo@259 23 \person{Dent} and \person{Hafiz} agree \cite[page 19]{dent04} \cite[pages 3-5]{hafiz05}.
meillo@259 24
meillo@259 25 Common to all \MTA{}s is the transport of mail; this is the actual job. Besides this similarity, \MTA{}s can be very different. Some of them include \NAME{POP3} and/or \NAME{IMAP} servers. Some can fetch mail through these protocols. Others have every feature one can think of. And some may do nothing but transport email.
meillo@89 26
meillo@117 27 The following classification sorts \mta{}s into groups of similar programs, based on what is visible from the outside.
meillo@117 28
meillo@117 29
meillo@120 30 \subsubsection*{Relay-only MTAs}
meillo@89 31 \label{subsec:relay-only}
meillo@217 32 Also called \name{forwarders}. This is the simplest kind of \MTA. It transfers mail only to defined \name{smart hosts}\footnote{\name{Smart host}s are \MTA{}s that receive email and route it to the actual destination.}. \name{Relay-only} \MTA{}s do not receive mail from outside the system, and they do not deliver locally. All they do is transfer mail to a specified smart host for further relay.
meillo@89 33
meillo@89 34 Most \MTA{}s can be configured to act as such a \name{forwarder}. But this is usually an additional functionality.
meillo@89 35
meillo@217 36 One uses this kind of \MTA\ to enable a system to send mail without much configuration. In a local network, the clients are usually set up with relay-only \MTA{}s, while one mail server acts as the \name{smart host}. The ``dumb'' clients send mail to this \name{mail server}, which does all further work.
meillo@89 37
meillo@217 38 Example programs in that group are: \name{nullmailer}, \name{ssmtp} and \name{esmtp}.
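
To illustrate how a full \MTA\ can be reduced to a forwarder, the following is a minimal sketch of a \postfix\ \path{main.cf} for a relay-only client. The host name \texttt{mail.example.org} is a placeholder for the smart host, and a real setup may need further settings.

\begin{verbatim}
# Sketch of a relay-only ("null client") Postfix setup.
# mail.example.org is a placeholder for the smart host.

# Send all outgoing mail to the smart host;
# the brackets suppress the MX lookup.
relayhost = [mail.example.org]

# Do not accept mail from the network.
inet_interfaces = loopback-only

# Do not deliver any mail locally.
mydestination =
\end{verbatim}

The three settings correspond to the three properties named above: transfer only to a smart host, no reception from outside, and no local delivery.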
meillo@89 39
meillo@89 40
meillo@117 41 \subsubsection*{Groupware}
meillo@217 42 Normally the term ``groupware'' does not mean one single program, but a suite of programs. They build a framework which is then populated with various modules that provide the actual functionality. Modules for mail transfer, file storage, calendars, resource management, instant messaging, and more, are commonly available.
meillo@89 43
meillo@217 44 These program suites are used if the main task is to provide integrated communication facilities and teamwork support for a group of people. Mail transfer is only one part of the problem to solve. The most common scenario is a company that runs \name{groupware} to provide adequate services for its teams to work efficiently. But one may also use \name{groupware} on a home server for family members.
meillo@89 45
meillo@217 46 Examples of groupware are: \name{Lotus Notes}, \name{Microsoft Exchange}, \name{OpenGroupware.org}, and \name{eGroupWare}.
meillo@89 47
meillo@89 48
meillo@120 49 \subsubsection*{``Real'' MTAs}
meillo@217 50 There is a third type of \mta{}s in between the minimalistic \name{relay-only} \MTA{}s and the feature-loaded \name{groupware}. Those programs may be named ``real \MTA{}s'' or ``proper \MTA{}s'', though there is no common name. They are what is usually meant by the term ``\mta'': programs that transfer mail between hosts.
meillo@89 51
meillo@224 52 Common to them is their focus on transferring email, while being able to act as \name{smart host}s. Their variety ranges from ones mostly restricted to mail transfer (e.g.\ \qmail) to others having interfaces for adding further mail processing modules (e.g.\ \postfix). This group covers everything in between the other two groups.
meillo@89 53
meillo@265 54 ``Real \MTA{}s'' include \sendmail, \exim, \qmail, and \postfix.
meillo@89 55
meillo@89 56
meillo@117 57 \subsubsection*{Other segmenting}
meillo@124 58 \name{Mail transfer agents} can also be split in other ways.
meillo@308 59
meillo@308 60 Due to \sendmail's significance in the early times of email, compatibility interfaces for \sendmail\ are important for \unix\ \MTA{}s. The reason is that many mail applications simply expect the \sendmail\ \MTA\ to be installed on the system. Not being \emph{sendmail-compatible} may not matter in some fields of application, but it makes a program ineligible to serve as a general-purpose \MTA\ on \unix\ systems. Hence being sendmail-compatible is a major property of a \mta. %todo: how many MTAs are sendmail-compatible?
meillo@124 61 \MTA{}s that neither have a \emph{sendmail-compatible} interface nor offer one as a compatibility add-on will not be covered here. One example of such a program is \name{Apache James}. %FIXME: check if correct
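
A minimal sketch of what such a mail application does, assuming the conventional path \path{/usr/sbin/sendmail} and placeholder addresses: the message is written to the standard input of the \sendmail\ command, and the option \texttt{-t} tells it to take the recipients from the message headers.

\begin{verbatim}
# Submit a message through the sendmail interface.
# Path and addresses are assumptions for illustration.
printf 'To: user@example.org\nSubject: test\n\nHello\n' \
    | /usr/sbin/sendmail -t
\end{verbatim}

Any sendmail-compatible \MTA\ installed under that path can accept this submission, so the application does not need to know which \MTA\ is actually present.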
meillo@89 62
meillo@217 63 Another separation can be done between \freesw\ \MTA{}s and proprietary ones. Many of the \MTA{}s for \unix\ systems are \freesw. Only these are regarded in the following sections, because comparing \freesw\ with proprietary or commercial software is not what typical users of programs like \masqmail\ do. %fixme: what are typical users?
meillo@217 64 Comparison with non-free programs may be a point for large \freesw\ projects, trying to step into the business world. Small projects, mostly used by individuals at home, %fixme: is this the right target field? see chap02
meillo@217 65 need to be compared against other projects of similar shape. This document is written from \masqmail's point of view (an \MTA\ for \unix\ systems on home servers and workstations), so non-free software is out of scope.
meillo@89 66
meillo@89 67
meillo@89 68
meillo@89 69
meillo@265 70
meillo@265 71
meillo@265 72 \subsubsection*{\masqmail's position}
meillo@265 73
meillo@265 74 Now, where does \masqmail\ fit in? It is neither groupware nor a simple forwarder, thus it belongs to the ``real \MTA{}s''. Additionally, it is Free Software and is intended to be sendmail-compatible. This makes it similar to \sendmail, \exim, \qmail, and \postfix. \masqmail\ is intended to be a replacement for those \MTA{}s.
meillo@265 75
meillo@265 76 But it was not designed to be used as a general replacement for them (see section \ref{sec:masqmail-target-field}). In fact, \masqmail\ is a replacement only \emph{in some situations}. This primarily excludes operation in untrusted environments.
meillo@265 77
meillo@265 78
meillo@265 79
meillo@265 80
meillo@265 81
meillo@265 82
meillo@265 83
meillo@265 84
meillo@265 85
meillo@265 86
meillo@120 87 \section{Popular MTAs}
meillo@89 88
meillo@308 89 This section introduces a selection of popular \MTA{}s; they are the most likely substitutes for \masqmail. All are sendmail-compatible ``smart'' \freesw\ \MTA{}s that focus on mail transfer, as is \masqmail.
meillo@89 90
meillo@217 91 The programs chosen to be compared, with each other and with \masqmail, are: \sendmail, \exim, \qmail, and \postfix. They are the most important representatives of the regarded group.
meillo@117 92
meillo@145 93
meillo@145 94 \subsection{Market share analysis}
meillo@145 95
meillo@217 96 \MTA\ statistics are rare and differ from each other, and good data is hard to collect. Hence only few statistics are available.
meillo@217 97
meillo@248 98 Table \ref{tab:mta-market-share} shows the most used \MTA{}s as determined by three different statistics. The first was done by \person{Daniel~J.\ Bernstein} (the author of \qmail) in 2001 \cite{bernstein01}. The second is by \person{Simpson} and \person{Bekman} in 2007 and was published on \name{O'ReillyNet} \cite{simpson07}. The third is from \name{MailRadar.com}, with unknown date\footnote{The footer of the website shows ``Copyright 2007'', but this more likely refers to the whole website.} \citeweb{mailradar:mta-stats}.
meillo@117 99
meillo@130 100 \begin{table}
meillo@130 101 \begin{center}
meillo@271 102 \input{tbl/mta-market-share.tbl}
meillo@130 103 \end{center}
meillo@130 104 \caption{Market share of \MTA{}s}
meillo@130 105 \label{tab:mta-market-share}
meillo@130 106 \end{table}
meillo@89 107
meillo@217 108 All surveys show high market shares for the four \MTA{}s \sendmail, \exim, \qmail, and \postfix. Only the \name{Microsoft} mail server software and \name{IMail} have comparably large shares. Other \freesw\ \mta{}s (\name{smail}, \name{zmailer}, \name{MMDF}, \name{courier-mta}) are less important and seldom used.
meillo@130 109
meillo@217 110 The three surveys are based on different data. \person{Bernstein} took 1\,000\,000 randomly chosen \NAME{IP} addresses, containing 39\,206 valid hosts; 958 of them accepted \NAME{SMTP} connections. The \person{Simpson} and \person{Bekman} survey used only domains owned by companies; in total 400\,000 hosts. \name{MailRadar} scanned 2\,818\,895 servers, leading to 59\,209 accepted connections.
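
As a rough plausibility check, the response rates can be derived from the numbers given above. This is plain arithmetic on the quoted figures and is not part of the surveys themselves:
\[
\frac{958}{39\,206} \approx 2.4\,\% \quad \mbox{(\person{Bernstein})}, \qquad
\frac{59\,209}{2\,818\,895} \approx 2.1\,\% \quad \mbox{(\name{MailRadar})}.
\]
In both cases only about one in forty to fifty contacted hosts accepted an \NAME{SMTP} connection, so the market shares rest on samples that are much smaller than the number of scanned hosts suggests.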
meillo@130 111
meillo@225 112 All surveys show \sendmail\ to be the most popular \MTA. \postfix, \qmail, and \exim\ are among the best seven in each. \exim\ has slightly smaller shares than the other two. The four together hold more than half of the market according to the \person{Bernstein} and \name{MailRadar} statistics. \person{Simpson} and \person{Bekman} put their share somewhere between a third and a half. This uncertainty comes from the large number of unidentifiable \MTA{}s.
meillo@143 113
meillo@225 114 The 22 percent of \name{mail security layers} in the \name{O'Reilly} survey is remarkable. Mail security layers are software guards between the network and the \mta\ that filter unwanted mail before it reaches the \MTA. This increases security by filtering malicious content and by blocking attacks against the \MTA. The large share may be a result of regarding only business mail servers. The problem for the survey is that the security layer disguises the \mta\ working behind it. It seems wrong to assume the same shares for the \MTA{}s behind the guards as for the unguarded \MTA{}s, because mail security layers will more often be used to guard weak \MTA{}s, as strong ones do not need them as much. This needs to be kept in mind when using the \name{O'Reilly} survey.
meillo@145 115
meillo@225 116 The date of the \name{MailRadar} statistics is not mentioned with them; a mail to \name{MailRadar} asking for information unfortunately remained unanswered. However, judging by the \sendmail\ and \postfix\ shares, it seems quite certain that the statistics were published after 2001. Whether they were published before or after the one from \name{O'Reilly} would be mere guessing.
meillo@145 117
meillo@145 118
meillo@145 119 \subsection{The four major Free Software MTAs}
meillo@143 120
meillo@248 121 Now follows a small introduction to the four programs chosen for comparison. \masqmail\ is not presented here, as it was already introduced in chapter \ref{chap:introduction}. Longer introductions, including analysis and comparison, were written by \person{Jonathan de Boyne Pollard} \cite{jdebp}.
meillo@89 122
meillo@117 123
meillo@117 124
meillo@120 125 \subsubsection*{sendmail}
meillo@89 126 \label{sec:sendmail}
meillo@217 127 \sendmail\ is the best known \mta, since it was one of the first and surely the one that made \MTA{}s popular. It was also shipped as the default \MTA\ by many vendors of \unix\ systems. %fixme: ref
meillo@89 128
meillo@248 129 The program was written by \person{Eric Allman} as the successor of his program \name{delivermail}. \person{Allman} was not the only one working on the program. Other people developed their own versions of it and a variety of flavors came up, especially in the late eighties when \person{Allman} was inactive. %fixme: ref
meillo@89 130
meillo@224 131 \sendmail\ was designed to transfer mail between different protocols and networks; this led to a very flexible, though complex, configuration.
meillo@89 132
meillo@312 133 It was first released with \NAME{BSD} 4.1c in 1983.
meillo@312 134 %todo: write about its importance and about sendmail-compat
meillo@312 135
meillo@312 136 The latest version is 8.14.3 from May 2008. The program is distributed under the \name{Sendmail License} as both \freesw\ and proprietary software.
meillo@89 137
meillo@128 138 Further development will go into the project \name{MeTA1} (the former name was \name{sendmail X}) which succeeds \sendmail.
meillo@89 139
meillo@217 140 More information can be found on the \sendmail\ homepage \citeweb{sendmail:homepage} and in the so-called ``Bat Book'' \cite{costales97}.
meillo@89 141
meillo@89 142
meillo@117 143
meillo@120 144 \subsubsection*{exim}
meillo@117 145 \label{sec:exim}
meillo@248 146 \exim\ was started in 1995 by \person{Philip Hazel} at the \name{University of Cambridge}. It is a fork of \name{smail-3}, and inherited a monolithic architecture similar to \sendmail's. But the missing separation of the individual components of the system did not hurt: its security is quite good. %fixme: ref
meillo@117 147
meillo@217 148 \exim\ is highly configurable, especially in the field of mail policies. This makes it easy to specify how mail is routed through the system and who is allowed to send email to whom. Interfaces for integrating virus and spam checkers are also provided by design. %fixme: ref
meillo@117 149
meillo@117 150 The program is \freesw, released under the \GPL. The latest stable version is 4.69 from December 2007.
meillo@117 151
meillo@217 152 One finds \exim\ on its homepage \citeweb{exim:homepage}. The standard literature is \person{Hazel}'s \exim\ book \cite{hazel01}.
meillo@117 153
meillo@117 154
meillo@117 155
meillo@120 156 \subsubsection*{qmail}
meillo@89 157 \label{sec:qmail}
meillo@248 158 \qmail\ is seen by its community as ``a modern SMTP server which makes sendmail obsolete'' \citeweb{qmail:homepage2}. It was written by \person{Daniel~J.\ Bernstein} starting in 1995. His primary goal was to create a secure \MTA\ to replace the popular, but vulnerable, \sendmail. %fixme: ref
meillo@89 159
meillo@223 160 \qmail\ first introduced many innovative concepts in \mta\ design. The most obvious contrast to \sendmail\ and \exim\ is its modular design. But \qmail\ was not the first modular \MTA. \NAME{MMDF}, which predates even \sendmail, was modular too. Regardless of \NAME{MMDF}'s modular architecture, \qmail\ is generally seen as the first security-aware \MTA. %fixme:ref
meillo@89 161
meillo@225 162 The latest release of \qmail\ is version 1.03 from July 1998. Later, in November 2007, \qmail's source was put into the \name{public domain}. This makes it Free Software.
meillo@89 163
meillo@223 164 Because of \person{Bernstein}'s inactivity despite changing requirements since 1998, ``[a] motley krewe of qmail contributors (see the README) has put together a netqmail-1.06 distribution of qmail. It is derived from Daniel Bernstein's qmail-1.03 plus bug fixes, a few feature enhancements, and some documentation.'' \citeweb{netqmail:homepage}
meillo@223 165
meillo@248 166 \qmail's homepages are \citeweb{qmail:homepage1} and \citeweb{qmail:homepage2}. The best book about \qmail, from \person{Bernstein}'s view, is \person{Dave Sill}'s handbook \cite{sill02}. His freely available guide ``Life with qmail'' is another valuable source \cite{lifewithqmail}.
meillo@89 167
meillo@89 168
meillo@117 169
meillo@120 170 \subsubsection*{postfix}
meillo@89 171 \label{sec:postfix}
meillo@248 172 The \postfix\ project started in 1999 at \name{IBM research}, then called \name{VMailer} or \name{IBM Secure Mailer}. \person{Wietse Venema}'s program ``attempts to be fast, easy to administer, and secure. The outside has a definite Sendmail-ish flavor, but the inside is completely different.'' \citeweb{postfix:homepage} In fact, \postfix\ was mainly modeled on \qmail's architecture to gain security. But in contrast to \qmail\ it aims much more at being fast and full-featured.
meillo@89 173
meillo@132 174 Today \postfix\ is used by many \unix\ systems and \gnulinux\ distributions as the default \MTA.
meillo@89 175
meillo@312 176 The latest stable version is 2.5.6 from December 2008. \postfix\ is covered by the \name{IBM Public License 1.0}, which is a \freesw\ license.
meillo@89 177
meillo@217 178 Additional information can be retrieved from the program's homepage \citeweb{postfix:homepage}. \person{Dent}'s \postfix\ book \cite{dent04} claims to be ``the definitive guide'', and it is.
meillo@89 179
meillo@89 180
meillo@89 181
meillo@89 182
meillo@89 183
meillo@89 184
meillo@120 185 \section{Comparison of MTAs}
meillo@308 186 \label{sec:mta-comparison}
meillo@89 187
meillo@312 188 This section does not try to provide a thorough \MTA\ comparison, because this has already been done by others. Notable comparisons are the one by \person{Dan Shearer} \cite{shearer06} and a discussion on the mailing list \name{plug@lists.q-linux.com} \cite{plug:mtas}. Tabular overviews may be found at \citeweb{mailsoftware42}, \citeweb{wikipedia:comparison-of-mail-servers}, and \cite[section 1.9]{lifewithqmail}.
meillo@89 189
meillo@312 190 Provided here is an overview of important properties of the four previously introduced \MTA{}s. The data comes from the sources stated above and is collected in table \ref{tab:mta-comparison}\footnote{The lines of code were measured with \person{David~A.\ Wheeler}'s \name{sloccount} \citeweb{sloccount}.}.
meillo@126 191
meillo@117 192 \begin{table}
meillo@217 193 % FIXME: improve table data!!!
meillo@126 194 \begin{center}
meillo@271 195 \input{tbl/mta-comparison.tbl}
meillo@126 196 \end{center}
meillo@312 197 \caption{Comparison of \MTA{}s}
meillo@126 198 \label{tab:mta-comparison}
meillo@117 199 \end{table}
meillo@89 200
meillo@89 201
meillo@201 202 \subsubsection*{Architecture}
meillo@89 203
meillo@132 204 Architecture is most important when comparing \MTA{}s. Many other properties of a program depend on its architecture. %fixme: add ref?
meillo@248 205 \person{Munawar Hafiz} \cite{hafiz05} discusses \mta\ architecture in detail, comparing \sendmail, \qmail, \postfix, and \name{sendmail X}. \person{Jonathan de Boyne Pollard}'s \MTA\ review \cite{jdebp} is a source too.
meillo@89 206
meillo@132 207 Two different architecture types stand out: monolithic and modular \mta{}s.
meillo@130 208
meillo@217 209 Monolithic \MTA{}s are \sendmail, \name{smail}, \exim, and \masqmail. They all consist of one single \emph{setuid root}\footnote{\emph{setuid root} lets a program run with the rights of its owner, here root. This is often considered to be a security risk. Thus it should be avoided if possible.} binary which does all the work.
meillo@130 210
meillo@217 211 Modular \MTA{}s are \NAME{MMDF}, \qmail, \postfix, and \name{MeTA1}. They consist of several programs, each doing a part of the overall job. The different programs run with the least permissions they need, and \emph{setuid root} can be avoided.
meillo@130 212
meillo@248 213 The architecture does not directly define the program's security, but ``[t]he goal of making a software secure can be better achieved by making the design simple and easier to understand and verify'' \cite[chapter 6]{hafiz05}. \exim, though being monolithic, has a fairly clean security record. But it is very hard to keep the security up as the program grows. \person{Wietse Venema} (the author of \postfix) says that it was the architecture that enabled \postfix\ to grow without running into security problems \cite[page 13]{venema:postfix-growth}.
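
The split into several programs with different permissions can be seen in \postfix's \path{master.cf}, which lists the services that make up the system. The following shortened excerpt is only an illustrative sketch; the exact entries, service types, and defaults vary between \postfix\ versions.

\begin{verbatim}
# service type  private unpriv  chroot  wakeup  maxproc command
smtp      inet  n       -       n       -       -       smtpd
pickup    unix  n       -       n       60      1       pickup
cleanup   unix  n       -       n       -       0       cleanup
qmgr      unix  n       -       n       300     1       qmgr
local     unix  -       n       n       -       -       local
\end{verbatim}

Each line is a separate program. The column \texttt{unpriv} states whether the service runs without root privileges; in this sketch only the local delivery agent is marked as privileged (\texttt{n}), because it has to act on behalf of the recipient users.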
meillo@130 214
meillo@217 215 The modular design, with each sub-program doing one part of the overall job, conforms to the \name{Unix Philosophy}. The Unix Philosophy \cite{gancarz95} demands ``small is beautiful'' and ``make each program do one thing well''. Monolithic \MTA{}s fail here.
meillo@130 216
meillo@132 217 Today modular \mta\ architectures are state of the art.
meillo@89 218
meillo@89 219
meillo@217 220 \subsubsection*{Spam checking and content processing}
meillo@89 221
meillo@217 222 << FIXME >> % fixme
meillo@89 223
meillo@89 224
meillo@217 225 \subsubsection*{Future requirements}
meillo@89 226
meillo@217 227 In chapter \ref{chap:market-analysis}, trends and possible future requirements for \MTA{}s were identified. The four programs are now compared on these (possible) future requirements.
meillo@126 228
meillo@225 229 The first trend was provider independence, requiring easy configuration. \postfix\ seems to do best here. It primarily uses two configuration files (\path{master.cf} and \path{main.cf}), which are easy to manage. \sendmail\ appears to be in a bad position. Its configuration file \path{sendmail.cf} is cryptic and very complex (it is legendarily Turing-complete); thus it needs simplification wrappers around it to provide easier configuration. The \name{m4} macros exist to generate \path{sendmail.cf}, but adjusting the generated result by hand appears to be necessary for non-trivial configurations. \qmail's configuration files are simple, but the whole system is complex to set up; it requires various system users and is hardly usable without applying several patches to add basic functionality. \name{netqmail} is the community effort to help here. \exim\ has only one single configuration file (\path{exim.conf}), but it suffers most from its flexibility, like \sendmail. Flexibility and easy configuration are almost always contrary goals.
meillo@217 230
meillo@225 231 As the second trend, the decreasing necessity for high performance was identified. This goes along with the move of \MTA{}s from service providers to home servers. \postfix\ focuses much on performance; this might not be an important point then. Of course there will still be a need for high-performance \MTA{}s, but a growing share of the market will not require high performance. Performance is related to simplicity, which affects security. Increasing performance most often decreases the other two. Simple \mta{}s that do not aim for highest performance are what is needed in the future. The simplicity of \qmail, which is still fast, seems to be a good example.
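
To give an impression of the \name{m4} wrapper around \path{sendmail.cf}, here is a minimal sketch of a macro file that sends outgoing mail through a smart host. The host name \texttt{mail.example.org} and the operating system type are placeholders, and the include path depends on where the \sendmail\ configuration sources are installed.

\begin{verbatim}
include(`../m4/cf.m4')dnl
OSTYPE(`linux')dnl
define(`SMART_HOST', `mail.example.org')dnl
MAILER(`local')dnl
MAILER(`smtp')dnl
\end{verbatim}

Processing such a file with \name{m4} produces a \path{sendmail.cf} of many hundred lines, which shows why the wrapper is needed and also why manual adjustments of the result are hard.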
meillo@217 232
meillo@217 233 The third trend---even more security awareness---is addressed by each of the four programs. It seems as if all widely used \mta{}s provide good security nowadays. Even \sendmail\ can be considered secure today. %fixme:ref
meillo@217 234 But the modular architecture used by \qmail\ and \postfix\ is generally seen to be conceptually more secure. %fixme: ref
meillo@132 235 \sendmail's creators have started \name{MeTA1}, a modular \MTA\ merging the best of \qmail\ and \postfix, to replace the old \sendmail. It will be interesting to watch \exim's future---will it become modular too?
meillo@126 236
meillo@126 237
meillo@93 238
meillo@265 239
meillo@265 240
meillo@265 241
meillo@287 242 \section{Summary}
meillo@193 243
meillo@276 244 FIXME %fixme
meillo@276 245
meillo@193 246 %fixme: write a result here
meillo@89 247
meillo@89 248
meillo@117 249
meillo@117 250
meillo@132 251 %todo: my own poll (?)
meillo@117 252
meillo@117 253
meillo@132 254 %<< complexity >> << security >> << simplicity of configuration and administration >> << flexibility of configuration and administration >> << code size >> << code quality >> << documentation (amount and quality) >> << community (amount and quality) >> << used it myself >> << had problems with it >>
meillo@117 255
meillo@117 256
meillo@132 257 %<< quality criteria >> << standards of any kind >> << how to compare? >> << (bewertungsmatrix) objectivity >> << how many criteria for ``good''? >>
meillo@133 258