docs/diploma

annotate thesis/tex/1-Introduction.tex @ 250:9f06b3a1388f

new diagram mail-agents
author meillo@marmaro.de
date Mon, 12 Jan 2009 12:51:03 +0100
parents da83360f8442
children 4dacd0d50342
rev   line source
meillo@26 1 \chapter{Introduction}
meillo@42 2 \label{chap:introduction}
meillo@26 3
meillo@96 4 << say what you want to say >>
meillo@92 5
meillo@102 6 << the overall goal of the document >>
meillo@92 7
meillo@92 8
meillo@92 9
meillo@229 10
meillo@229 11
meillo@245 12
meillo@245 13 \section{Email prerequisites}
meillo@245 14
meillo@245 15 Email and everything related to it is defined in RFCs.
meillo@245 16
meillo@245 17
meillo@245 18 \subsubsection{Mail agents}
meillo@245 19
meillo@245 20 \paragraph{MTA}
meillo@245 21 \name{Mail Transfer Agents} are for electronic mail what post offices are for snail mail. Their basic job is to transport mail from senders to recipients, or, more pedantically, from \MTA\ to \MTA. This is the definition of this kind of software, and this is how \MTA{}s are generally seen \cite[page 19]{dent04} \cite[pages 3-5]{hafiz05}. \MTA{}s are explained in more detail in chapter \ref{chap:mail-transfer-agents}.
meillo@245 22
meillo@245 23
meillo@245 24 \paragraph{MUA}
meillo@245 25 \name{Mail User Agents} are the software the user deals with. The \NAME{MUA} is the program he uses to write and read email. It passes outgoing mail to the next \MTA, and it displays the contents of the user's mailbox. Well-known \NAME{MUA}s are \name{Mozilla Thunderbird} and \name{mutt} on \unix\ systems, and \name{Microsoft Outlook} on \name{Windows}.
meillo@245 26
meillo@245 27
meillo@245 28 \paragraph{MDA}
meillo@245 29 \name{Mail Delivery Agents} correspond to postmen in the real world. They receive mail for the recipients they are responsible for from an \MTA\ and deliver it to the mailboxes of those recipients. Many \MTA{}s include their own \NAME{MDA}, but specialized ones exist: \name{procmail} and \name{maildrop} are examples.
meillo@245 30
meillo@245 31
meillo@229 32
meillo@229 33 << structure diagram of an MTA (and of masqmail) >>
meillo@229 34
meillo@229 35
meillo@229 36
meillo@245 37 \subsubsection{Mail transfer with SMTP}
meillo@245 38
meillo@245 39 Today most email is transferred using the \name{Simple Mail Transfer Protocol} (short: \SMTP), which is defined in \RFC821 and its successors \RFC2821 and \RFC5321. A good entry point for further information is \citeweb{wikipedia:smtp}.
meillo@245 40
meillo@245 41 A selection of important concepts of \SMTP\ is explained here.
meillo@245 42
meillo@245 43 The first is the \name{store and forward} transfer method. This means mail messages are sent from \MTA\ to \MTA\ until the final \MTA\ (the one which is responsible for the recipient) is reached. The message gets stored for some time on each \MTA\ until it is forwarded to the next \MTA.
meillo@245 44
meillo@245 45 This leads to the concept of \name{responsibility}. A mail message is always in the responsibility of exactly one system. First it is the \NAME{MUA}. After the message was transferred to the first \MTA, that \MTA\ takes over the responsibility for it. The \NAME{MUA} can then delete its copy of the message. The same applies to each transfer, from \MTA\ to \MTA\ and finally from \MTA\ to the \NAME{MDA}: the message gets transferred and, if this was successful, the responsibility for the message is transferred as well. The responsibility chain ends at the user's mailbox, where he himself has control over the message again.
meillo@245 46
meillo@245 47 A third concept is about failure handling. At any step on the way, an \MTA\ may get a message it is unable to handle. In such a case, this receiving \MTA\ will \name{reject} the message before it takes responsibility for it. The sending \MTA\ still has responsibility for the message and may try other ways of sending the message. If none succeeds, the \MTA\ will send a \name{bounce message} back to the original sender with information on the type of failure. Bounces are only sent if the failure is expected to be permanent, or if the transfer was still not successful after many tries.
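
The three concepts can be illustrated with a small model. The following is only a sketch under simplifying assumptions: the names \texttt{Hop} and \texttt{forward} are hypothetical, retries and alternative ways of sending are left out, and a rejection leads directly to a bounce by the system that still holds responsibility.

```python
from dataclasses import dataclass, field


@dataclass
class Hop:
    """One MTA (or the final MDA) on the way to the recipient."""
    name: str
    accepts: bool = True                        # False: this hop rejects the message
    stored: list = field(default_factory=list)  # messages this hop has stored


def forward(message, origin, hops):
    """Hand `message` from hop to hop (store and forward).

    Returns (responsible, bounced): the name of the system that holds
    responsibility at the end, and whether a bounce is due.
    """
    responsible = origin
    for hop in hops:
        if not hop.accepts:
            # Rejected before the hop took responsibility: the previous
            # system stays responsible and has to bounce the message.
            return responsible, True
        hop.stored.append(message)  # store the message ...
        responsible = hop.name      # ... and take over responsibility
    return responsible, False
```

For example, \texttt{forward("m", "mua", [Hop("mta1"), Hop("mta2", accepts=False)])} leaves the responsibility with \texttt{mta1}, which then has to bounce the message.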
meillo@245 48
meillo@245 49
meillo@245 50
meillo@245 51 \subsubsection{Mail messages}
meillo@245 52
meillo@245 53 Mail messages consist of three parts with a defined format. The format is defined in \RFC822 and its successors \RFC2822 and \RFC5322.
meillo@245 54
meillo@245 55 A message consists of \name{envelope} and \name{content}. This concept is derived from the real world, so it is easy to understand. The envelope is what is used to route the message from sender to recipient. It contains the sender's address and the addresses of one or more recipients. Envelopes are generated (using mail header data) by \MTA{}s; the user does not have to deal with them.
meillo@245 56
meillo@245 57 The content of the message is again split into two parts: the \name{header} and the \name{body}. The header of an email message is similar to the header of a (formal) letter. It spans the first lines of the content up to the first empty line. The header consists of several lines, called \name{header lines} or simply \name{headers}. They specify the sender, the address(es) of the recipient(s), the date, and possibly further information. Their order is irrelevant. Headers are named after the start of the line up to the colon, for example the \texttt{Date:} header. The user can write the headers himself, but normally the \NAME{MUA} does this job.
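
The split into header and body can be sketched in a few lines. This is a minimal illustration, not a complete parser: the function name \texttt{split\_message} is hypothetical, plain \texttt{\textbackslash n} line ends are assumed instead of the \texttt{CRLF} used on the wire, and header lines that continue on a following line are not handled.

```python
def split_message(text):
    """Split message content into a list of (name, value) headers and the body."""
    # The header ends at the first empty line; everything after it is the body.
    head, _, body = text.partition("\n\n")
    headers = []
    for line in head.split("\n"):
        # A header is named after the part of the line before the first colon.
        name, _, value = line.partition(":")
        headers.append((name.strip(), value.strip()))
    return headers, body
```

Keeping the headers in a list preserves their original order, although that order carries no meaning.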
meillo@245 58
meillo@245 59 Finally, the body is the payload of the message. It is under full control of the user. From the viewpoint of the \SMTP\ protocol, only 7-bit \NAME{ASCII} is allowed in it, but arbitrary content can be included by encoding it to 7-bit \NAME{ASCII}. \NAME{MIME} is the common extension to handle such conversion automatically in \NAME{MUA}s.
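
As an illustration of such an encoding, the following sketch converts arbitrary bytes to 7-bit \NAME{ASCII} text and back using \name{base64}, one of the transfer encodings used with \NAME{MIME}. The function names \texttt{to\_7bit} and \texttt{from\_7bit} are hypothetical; a real \NAME{MUA} additionally sets the matching headers, which is omitted here.

```python
import base64


def to_7bit(payload: bytes) -> str:
    """Encode arbitrary bytes as base64 text consisting of 7-bit ASCII only."""
    # encodebytes() inserts a newline after every 76 output characters.
    return base64.encodebytes(payload).decode("ascii")


def from_7bit(text: str) -> bytes:
    """Decode base64 text back to the original bytes."""
    return base64.decodebytes(text.encode("ascii"))
```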
meillo@245 60
meillo@245 61 Following is a sample mail message.
meillo@245 62
meillo@245 63 \input{input/sample-email.txt}
meillo@245 64
meillo@245 65
meillo@245 66
meillo@245 67
meillo@245 68
meillo@229 69
meillo@229 70
meillo@92 71 \section{The \masqmail\ project}
meillo@102 72 \label{sec:masqmail}
meillo@96 73
meillo@96 74 << about masqmail (some history) >>
meillo@96 75
meillo@96 76 (include history of email, definition of MTA and sendmail-compatibility in text)
meillo@96 77
meillo@248 78 The \masqmail\ program was written by \person{Oliver Kurth}, starting in 1999. His aim was to create a small \mta\ which is especially focused on computers with dial-up connections to the internet. \masqmail\ is easily configurable for situations which can rarely be handled with the common \MTA{}s.
meillo@102 79
meillo@102 80 \masqmail\ queues mail for destinations outside the local network if no connection to the internet is online. If the machine goes online, this mail is sent. Mail to local machines is sent immediately.
meillo@102 81
meillo@102 82 While the other \MTA{}s are more general-purpose \MTA{}s, \masqmail\ aims at special situations only. Nevertheless, it can handle ordinary mail transfers too.
meillo@102 83
meillo@102 84 \masqmail\ is released under the \GPL, which makes it \freesw. The latest stable version is 0.2.21 from November 2005.
meillo@102 85
meillo@102 86 The program's new homepage \citeweb{masqmail:homepage} provides further information about this \MTA.
meillo@96 87
meillo@245 88 << specify the really important external documents here >> %FIXME
meillo@92 89
meillo@245 90
meillo@245 91
meillo@245 92 \subsection{Target field / When to use \masqmail}
meillo@160 93
meillo@248 94 Its original author, \person{Oliver Kurth}, describes \masqmail\ as follows:
meillo@92 95 \begin{quote}
meillo@92 96 MasqMail is a mail server designed for hosts that do not have a permanent internet connection eg. a home network or a single host at home. It has special support for connections to different ISPs. It replaces sendmail or other MTAs such as qmail or exim.
meillo@92 97 \end{quote}
meillo@92 98
meillo@92 99 \masqmail\ is intended to cover a specific niche: non-permanent internet connection and different \NAME{ISP}s.
meillo@92 100
meillo@92 101 Although it can basically replace other \MTA{}s, it is not generally intended to do so. The package description of \debian\citeweb{packages.debian:masqmail} states this more clearly by changing the last sentence to:
meillo@92 102 \begin{quote}
meillo@92 103 In these cases, MasqMail is a slim replacement for full-blown MTAs such as sendmail, exim, qmail or postfix.
meillo@92 104 \end{quote}
meillo@92 105 \masqmail\ is a good replacement ``in these cases'', but not generally, since it lacks features essential for running on mail servers. Primarily, it is not secure enough to be accessible from untrusted locations.
meillo@92 106
meillo@92 107 The program is best used in home networks which are non-permanently connected to the internet. \masqmail\ sends mail to local destinations, like users on the same machine and on other machines in the local net, immediately. Email to recipients outside the local net is queued when offline and sent when an online connection gets established.
meillo@92 108
meillo@92 109 Furthermore, \masqmail\ respects online connections through different \NAME{ISP}s, a common thing for dial-up connections. In particular, different sender addresses can be set, depending on the \NAME{ISP} that is used. This makes it less likely that mail gets classified as spam.
meillo@92 110
meillo@92 111
meillo@92 112
meillo@160 113
meillo@160 114 \subsubsection*{\masqmail's main goal}
meillo@160 115
meillo@160 116 As a \sendmail\ replacement, which is a basic goal of the project, \masqmail\ has similar requirements. The main difference is that \masqmail\ is intended to be used on workstations and in small networks, whereas \sendmail, \qmail, and \postfix\ are designed to run on large mail servers to handle masses of email. The author of \masqmail, \person{Kurth}, in contrast, warns on the old project's website \citeweb{masqmail:homepage2} about using it to accept connections from the Internet, because of the risk of being an open relay:
meillo@160 117 \begin{quote}
meillo@160 118 MasqMail is not designed to run on a host with a permanent internet connection. It does not have the ability to check for spam mail and it will relay everything from everywhere to everywhere. Use another mail server such as exim for permanent connections.
meillo@160 119 \end{quote}
meillo@160 120 Even if some relay control is added, ``is not designed to'' is a clear indicator to be careful. Issues like high memory consumption, low performance, and denial-of-service attacks (things not regarded by design) may cause serious problems.
meillo@160 121
meillo@160 122 Here a mismatch shows: On the one hand, \masqmail\ wants to be a \sendmail\ replacement. On the other hand, it is not designed to be used like \sendmail. If \masqmail\ is intended to replace other \MTA{}s, then one may replace another one with it. Hence it must be secure enough. It either needs the security features or must drop the insecure functionality. The second option, however, leads to being \emph{no} replacement for other \MTA{}s. It is a valid decision to not be a replacement for \sendmail\ or the like, but this is a design decision: the change of a primary goal.
meillo@160 123
meillo@160 124 If \masqmail\ should be an \MTA\ to replace others, a switch to a better suited architecture that provides good security and extensibility by design seems required. But if \masqmail\ is wanted to cover some special jobs, not to replace common \MTA{}s, then its architecture depends on the special requirements of the specific job; \MTA\ architectures, as discussed by \person{Hafiz}, may be inadequate.
meillo@160 125
meillo@160 126
meillo@160 127 \subsubsection*{Full featured or stripped down}
meillo@160 128
meillo@160 129 Which future should be chosen for \masqmail: one as a full-featured \MTA, or one as a stripped-down \MTA\ for special jobs?
meillo@160 130
meillo@160 131 The critical point to discuss is surely listening on a port to accept messages from outside via \NAME{SMTP} (hereafter also referred to as the \NAME{SMTP}-in channel). This feature is required for an \MTA\ to be a \name{smart host}, to relay mail. But running as a daemon and listening on a port requires much more security effort, because the program is put in direct contact with attackers and other bad guys.
meillo@160 132
meillo@160 133 \MTA{}s without \SMTP-in channels cannot receive mail from arbitrary outside hosts. They are only invoked by local users. This lowers the security requirements a lot; security remains a general goal, but on a lower level. Unfortunately, as they do not receive mail anymore (except by local submission), they are just better \name{forwarders} that are able to send mail directly to the destination.
meillo@160 134
meillo@160 135 This is not what \masqmail\ was intended to be. Programs that cover this purpose are available; one is \name{msmtp}.
meillo@160 136
meillo@160 137 \masqmail\ shall be a complete \mta. It shall be able to replace ones like \sendmail.
meillo@160 138
meillo@160 139
meillo@160 140
meillo@245 141 \subsubsection*{Typical usage}
meillo@245 142 This section describes situations in which using \masqmail\ makes sense.
meillo@160 143
meillo@245 144 A home network consisting of some workstations without a server. The network is connected to the internet by dial-up or broadband. Going online is initiated by computers inside the local net. \NAME{IP} addresses change at least once every day.
meillo@160 145
meillo@245 146 Every workstation would be equipped with \masqmail. Mail transfer within the same machine or within the local net works straightforwardly. Outgoing mail to the internet is sent to the respective \NAME{ISP} for relaying whenever the router goes online. Receiving mail from outside needs to be done by a mail fetch program, for example the \masqmail\ internal \NAME{POP3} client or \name{fetchmail}. The configuration for \masqmail\ would be the same on every computer, except for the hostname.
meillo@160 147
meillo@245 148 For the same network but with a server, one could have \masqmail\ running on the server and use simple forwarders (see \ref{subsec:relay-only}) to the server on the workstations. This setup only supports mail transfer to the server, but not back to a workstation; sending mail to another user on the same workstation is not possible either.
meillo@160 149
meillo@245 150 A better setup is to run \masqmail\ on every machine %FIXME
meillo@160 151
meillo@245 152
meillo@245 153
meillo@245 154
meillo@245 155
meillo@245 156
meillo@245 157 \subsection{When not to use \masqmail}
meillo@245 158
meillo@245 159 ...
meillo@245 160
meillo@245 161
meillo@245 162
meillo@245 163
meillo@245 164
meillo@245 165
meillo@245 166
meillo@245 167
meillo@245 168
meillo@245 169
meillo@245 170 \subsection{Features}
meillo@238 171
meillo@248 172 Version 0.2.21 of \masqmail\ is regarded here. This is the last version released by \person{Oliver Kurth}, and the basis for my thesis.
meillo@238 173
meillo@238 174
meillo@238 175 \subsubsection*{The source code}
meillo@238 176
meillo@238 177 \masqmail\ is written in the C programming language. The program, as of version 0.2.21, consists of 34 source code and eight header files, containing about 9,000 lines of code\footnote{Measured with \name{sloccount} by David A.\ Wheeler.}. Additionally, it includes a \name{base64} implementation (about 300 lines) and \name{md5} code (about 150 lines). For systems that do not provide \name{libident}, this library is distributed as well (circa 600 lines); an available shared library has higher precedence in linking, though.
meillo@238 178
meillo@238 179 The only mandatory dependency is \name{glib}, a cross-platform software utility library that originated in the \NAME{GTK+} project. It provides safe replacements for many standard library functions, especially for the string functions. It also offers handy data containers, easy-to-use implementations of data structures, and much more.
meillo@238 180
meillo@238 181 With \masqmail\ comes the small tool \path{mservdetect}; it helps setting up a configuration that uses the \name{mserver} system to detect the online state. Two other binaries get compiled for testing purposes: \path{readtest} and \path{smtpsend}. All three programs use \masqmail\ source code; they only add a file with a \verb+main()+ function each.
meillo@238 182
meillo@238 183 \masqmail\ lacks an interface to plug in modules with additional functionality. There exists no add-on or module system. The code is only separated by function into the various source files. Some functional parts can be included or excluded by defining symbols at compile time. Adding maildir support, for example, means passing the option \verb+--enable-maildir+ to the \path{configure} call. This prevents the respective code from being removed by the preprocessor. Unfortunately, the \verb+#ifdef+s are scattered throughout the source, leading to code that is hard to read.
meillo@238 184 %fixme: refer to ifdef-considered-harmful ?
meillo@238 185
meillo@238 186
meillo@238 187
meillo@238 188 \subsubsection*{Features}
meillo@238 189 \label{sec:masqmail-features}
meillo@238 190
meillo@238 191 \masqmail\ supports two channels for incoming mail: (1) Standard input, used when \path{masqmail} is executed on the command line and (2) a \NAME{TCP} socket, used by local or remote clients that talk \SMTP. The outgoing channels for mail are: (1) direct delivery to local mailboxes (in \name{mbox} or \name{maildir} format), (2) local pipes to pass mail to a program (e.g.\ gateways to \NAME{UUCP}, gateways to fax, or \NAME{MDA}s), and (3) \NAME{TCP} sockets to transfer mail to other \MTA{}s using the \SMTP\ protocol.
meillo@238 192
meillo@238 193 Outgoing \SMTP\ connections feature \SMTP-\NAME{AUTH} and \SMTP-after-\NAME{POP} authentication, but incoming connections do not. Using wrappers for outgoing connections is supported. This allows encrypted communication through a gateway application like \name{openssl}.
meillo@238 194
meillo@238 195 Mail queuing and alias expansion are both supported.
meillo@238 196
meillo@238 197 \masqmail\ focuses on non-permanent online connections, thus a concept of online routes is used. One may configure any number of routes to send mail. Each route can have criteria to determine if some message is allowed to be sent over it. This concept is explained in detail in section \ref{sec:masqmail-routes}. Mail to destinations outside the local network gets queued until an online connection is available.
meillo@238 198
meillo@238 199 The \masqmail\ executable can be called under various names for sendmail-compatibility reasons. This is organized by symbolic links with different names pointing to the \masqmail\ executable. The \sendmail\ names are \path{/usr/lib/sendmail} and \path{/usr/sbin/sendmail} because many programs expect the \mta\ to be located there. Furthermore, \sendmail\ supports calling it with a different name instead of supplying command line arguments. The best known of these shortcuts is \path{mailq}, which is equivalent to calling it with the argument \verb+-bp+. \masqmail\ recognizes the shortcuts \path{mailq}, \path{smtpd}, \path{mailrm}, \path{runq}, \path{rmail}, and \path{in.smtpd}. The first two are inspired by \sendmail. The shortcut \path{newaliases} is not implemented because \masqmail\ does not generate binary representations of the alias file.\footnote{A shell script named \path{newaliases}, that invokes \texttt{masqmail -bi}, can provide the command to satisfy other software needing it.} \path{hoststat} and \path{purgestat} are missing for complete sendmail-compatibility.
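
The mechanism behind such shortcuts is that a program inspects the name it was invoked under. The following sketch shows the idea only; it is not \masqmail's code. The function \texttt{mode\_for} and the mode labels are hypothetical and merely guessed from the shortcut names.

```python
import os

# Hypothetical mode labels, guessed from the names of the shortcuts.
MODES = {
    "mailq": "list-queue",
    "mailrm": "remove-from-queue",
    "runq": "run-queue",
    "rmail": "read-mail-from-stdin",
    "smtpd": "smtp-on-stdin",
    "in.smtpd": "smtp-on-stdin",
}


def mode_for(argv0, default="use-command-line-arguments"):
    """Select the operation mode from the name the program was called with."""
    # A symbolic link /usr/bin/mailq that points to the executable shows up
    # as "/usr/bin/mailq" in argv[0]; only the last path component matters.
    return MODES.get(os.path.basename(argv0), default)
```

Called as \path{/usr/sbin/sendmail}, the sketch falls back to evaluating the command line arguments, because that name is not a shortcut.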
meillo@238 200 %masqmail: mailq, mailrm, runq, rmail, smtpd/in.smtpd
meillo@238 201 %sendmail: hoststat, mailq, newaliases, purgestat, smtpd
meillo@238 202
meillo@238 203 In addition to the \mta\ job, \masqmail\ also offers mail retrieval services by being a \NAME{POP3} client. It can fetch mail from different remote locations, depending on the active online connection.
meillo@238 204
meillo@238 205
meillo@238 206
meillo@245 207 \subsubsection*{Online detection and routes}
meillo@245 208 \label{sec:masqmail-routes}
meillo@238 209
meillo@245 210 ---
meillo@238 211
meillo@245 212 As \masqmail\ is focused on non-permanent Internet connections, online state can be queried by three methods: reading from a file, reading the output of a command, or asking an \name{mserver}. Each method either returns a string naming the available route that is online, or returns nothing to indicate offline state.
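
The first two methods can be sketched as follows. This is an illustration only: the function names are hypothetical, the \name{mserver} method is left out, and treating a missing file or a failing command as offline is an assumption of this sketch.

```python
import subprocess
from pathlib import Path


def online_from_file(path):
    """Return the route name stored in the file, or None when offline."""
    try:
        name = Path(path).read_text().strip()
    except OSError:
        return None  # assumption: an unreadable or missing file means offline
    return name or None


def online_from_command(argv):
    """Return the route name printed by the command, or None when offline."""
    try:
        result = subprocess.run(argv, capture_output=True, text=True)
    except OSError:
        return None  # assumption: a command that cannot be run means offline
    return result.stdout.strip() or None
```

Both functions return the same kind of value, so the code that picks a route does not need to know which detection method is configured.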
meillo@238 213
meillo@245 214 Delivery to recipients on the local host or in local nets is done at once; delivery to recipients on the Internet is only done when online, and the mail is queued otherwise. Each online route may have a different mail server to which mail is relayed. Return address headers are modified appropriately if desired.
meillo@238 215
meillo@245 216 ---
meillo@238 217
meillo@245 218 \masqmail\ focuses on non-permanent online connections, thus a concept of online routes is used. One may configure any number of routes to send mail. Each route can have criteria, like matching \texttt{From:} or \texttt{To:} headers, to determine if some message is allowed to be sent over it. Mail to destinations outside the local network gets queued until an online connection is available.
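
The decision between immediate delivery, sending over a route, and queuing can be sketched like this. It is a simplified model, not \masqmail's implementation: the names \texttt{Route} and \texttt{dispatch} are hypothetical, and shell-style patterns on the sender and recipient addresses merely stand in for the route criteria.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass
class Route:
    """An online route with simple criteria on sender and recipient."""
    name: str
    allowed_from: tuple = ("*",)  # patterns the sender address may match
    allowed_to: tuple = ("*",)    # patterns the recipient address may match

    def permits(self, sender, recipient):
        return (any(fnmatchcase(sender, p) for p in self.allowed_from)
                and any(fnmatchcase(recipient, p) for p in self.allowed_to))


def dispatch(sender, recipient, is_local, online_route, routes):
    """Decide what happens to one message."""
    if is_local:
        return "deliver locally"  # local mail is delivered at once
    if online_route is None:
        return "queue"            # offline: keep the mail in the queue
    route = routes.get(online_route)
    if route is not None and route.permits(sender, recipient):
        return "send via " + route.name
    return "queue"                # online, but the active route does not permit it
```

With \texttt{routes = \{"isp1": Route("isp1", allowed\_from=("*@example.org",))\}}, a message from \texttt{a@other.org} stays queued even while \texttt{isp1} is online.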
meillo@238 219
meillo@238 220
meillo@245 221
meillo@245 222
meillo@245 223
meillo@245 224
meillo@245 225
meillo@245 226
meillo@245 227
meillo@245 228
meillo@245 229 \section{Why \masqmail?}
meillo@92 230
meillo@92 231 As its main advantage, \masqmail\ makes it easy to set up an \MTA\ on workstations or notebooks without the need to do complex configuration or to be a mail server expert.
meillo@92 232
meillo@92 233 Workstations use %FIXME
meillo@92 234
meillo@96 235 \textbf{Alternatives?}
meillo@245 236 http://anfi.homeunix.org/sendmail/dialup10.html
meillo@92 237
meillo@92 238
meillo@245 239 << explain why masqmail is old and why it is interesting/important however! >>
meillo@96 240
meillo@175 241 << why is it worth to revive masqmail? >>
meillo@175 242
meillo@96 243
meillo@96 244
meillo@245 245
meillo@245 246
meillo@245 247
meillo@245 248
meillo@92 249 \section{Problems to solve}
meillo@92 250
meillo@245 251 << what problems has masqmail? >>
meillo@96 252
meillo@245 253 << what's the intention of this document? >>
meillo@96 254
meillo@245 255 << why is it worth the effort? >>
meillo@96 256
meillo@96 257
meillo@245 258
meillo@245 259
meillo@245 260
meillo@245 261 \section{Delimitation}
meillo@96 262
meillo@150 263 << limit against stuff not covered here >>
meillo@96 264
meillo@96 265
meillo@96 266
meillo@150 267