docs/diploma

annotate thesis/tex/1-Introduction.tex @ 274:56cc2f5755f8

cleardoublepage -> clearpage (as the document is single sided)
author meillo@marmaro.de
date Thu, 15 Jan 2009 12:35:19 +0100
parents ea538a366b7d
children 003410b64739
rev   line source
meillo@26 1 \chapter{Introduction}
meillo@42 2 \label{chap:introduction}
meillo@26 3
meillo@96 4 << say what you want to say >>
meillo@92 5
meillo@102 6 << the overall goal of the document >>
meillo@92 7
meillo@92 8
meillo@92 9
meillo@229 10
meillo@229 11
meillo@245 12
meillo@245 13 \section{Email prerequisites}
meillo@245 14
meillo@245 15 Email and the protocols around it are defined in RFCs.
meillo@245 16
meillo@245 17
meillo@245 18 \subsubsection{Mail agents}
meillo@245 19
meillo@260 20 This thesis frequently uses three terms: \MTA, \NAME{MUA}, and \NAME{MDA}. They name the three kinds of software that form the nodes of the email infrastructure. Here they are explained with references to the snail mail system, which is known from everyday life. Figure \ref{fig:mail-agents} shows the relation between these three mail agents and the path an email message takes through the system.
meillo@253 21
meillo@269 22 \begin{description}
meillo@269 23 \item[\MTA:]
meillo@260 24 \name{Mail Transfer Agents} are the post offices for electronic mail. The basic job of an \MTA\ is to transport mail from senders to recipients, or, more precisely, from \MTA\ to \MTA. \sendmail, \exim, \qmail, \postfix, and of course \masqmail\ are \MTA{}s. \MTA{}s are explained in more detail in chapter \ref{chap:mail-transfer-agents}.
meillo@245 25
meillo@269 26 \item[\NAME{MUA}:]
meillo@260 27 \name{Mail User Agents} are the software the user deals with. He writes and reads email with it. The \NAME{MUA} passes outgoing mail to the nearest \MTA. It also displays the contents of the user's mailbox. Well-known \NAME{MUA}s are \name{Mozilla Thunderbird} and \name{mutt} on \unix\ systems, and \name{Microsoft Outlook} on \name{Windows}.
meillo@245 28
meillo@269 29 \item[\NAME{MDA}:]
meillo@253 30 \name{Mail Delivery Agents} correspond to postmen in the real world. They receive mail from an \MTA, destined for recipients they are responsible for, and deliver it to the mailboxes of those recipients. Many \MTA{}s include their own \NAME{MDA}, but specialized ones exist: \name{procmail} and \name{maildrop} are examples.
meillo@269 31 \end{description}
meillo@245 32
meillo@253 33 \begin{figure}
meillo@253 34 \begin{center}
meillo@253 35 \includegraphics[scale=0.75]{img/mail-agents.eps}
meillo@253 36 \end{center}
meillo@253 37 \caption{Mail agents and the path a mail message takes}
meillo@253 38 \label{fig:mail-agents}
meillo@253 39 \end{figure}
meillo@245 40
meillo@229 41
meillo@253 42
meillo@229 43
meillo@229 44
meillo@229 45
meillo@245 46 \subsubsection{Mail transfer with SMTP}
meillo@245 47
meillo@245 48 Today most email is transferred using the \name{Simple Mail Transfer Protocol} (short: \SMTP), which is defined in \RFC821 and its successors \RFC2821 and \RFC5321. A good entry point for further information is \citeweb{wikipedia:smtp}.
meillo@245 49
meillo@245 50 A selection of important concepts of \SMTP\ is explained here.
meillo@245 51
meillo@253 52 The first is the \name{store and forward} transfer concept. This means mail messages are sent from \MTA\ to \MTA, until the final \MTA\ (the one which is responsible for the recipient) is reached. The message gets stored for some time on each \MTA, until it is forwarded to the next \MTA.
meillo@245 53
meillo@253 54 This leads to the concept of \name{responsibility}. A mail message is always in the responsibility of exactly one system. First it is the \NAME{MUA}. After the message was transferred to the first \MTA, this \MTA\ takes over the responsibility for it. The \NAME{MUA} can then delete its copy of the message. The same applies to each transfer, from \MTA\ to \MTA\ and finally from \MTA\ to the \NAME{MDA}: the message gets transferred and, if the transfer was successful, the responsibility for the message is transferred as well. The responsibility chain ends at a user's mailbox, where he himself has control over the message.
meillo@245 55
meillo@253 56 A third concept is about failure handling. At any step on the way, an \MTA\ may receive a message it is unable to handle. In such a case, this receiving \MTA\ will \name{reject} the message before it takes responsibility for it. The sending \MTA\ still has responsibility for the message and may try other ways of sending it. If none succeeds, the \MTA\ will send a \name{bounce message} back to the original sender with information on the type of failure. Bounces are only sent if the failure is expected to be permanent, or if the transfer was still unsuccessful after many tries.
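
The three concepts can be put together in a small simulation. The following Python sketch is only an illustration under simplifying assumptions: the class \verb+Agent+, the functions \verb+forward()+ and \verb+relay()+, and the retry limit are hypothetical names made up here, and a real \MTA\ retries over hours or days rather than in a loop.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical stand-in for an MUA, MTA, or MDA."""
    name: str
    accepts: bool = True                       # False models an agent that rejects mail
    held: list = field(default_factory=list)   # messages this agent is responsible for

    def receive(self, message: str) -> bool:
        if not self.accepts:
            return False                       # reject: responsibility stays with the sender
        self.held.append(message)              # store: this agent is responsible now
        return True


def forward(sender: Agent, receiver: Agent, message: str) -> bool:
    """Hand a message over; the sender drops its copy only after success."""
    if receiver.receive(message):
        sender.held.remove(message)
        return True
    return False


def relay(path, message, max_tries=3):
    """Pass a message along a path of agents.

    Returns the name of the agent that ends up responsible and a flag
    telling whether a bounce message would have to be sent.
    """
    path[0].held.append(message)               # the MUA is responsible first
    for sender, receiver in zip(path, path[1:]):
        for _ in range(max_tries):
            if forward(sender, receiver, message):
                break
        else:
            return sender.name, True           # all tries failed: bounce
    return path[-1].name, False
```

If every agent accepts the message, only the last agent in the path holds it afterwards. If one agent rejects it, the message stays with the agent before it, which is then the one to generate the bounce.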
meillo@245 57
meillo@245 58
meillo@245 59
meillo@245 60 \subsubsection{Mail messages}
meillo@245 61
meillo@253 62 Mail messages consist of two parts with defined format. This format is specified in \RFC822, and the successors \RFC2822 and \RFC5322.
meillo@245 63
meillo@253 64 The two parts of a message are the \name{header} and the \name{body}. The header of an email message is similar to the header of a (formal) letter. It spans the first lines of the message up to the first empty line. The header consists of several lines, called \name{header lines} or simply \name{headers}. They specify the sender, the address(es) of the recipient(s), the date, and possibly further information. Their order is irrelevant. Headers are named after the start of the line up to and including the colon, for example the ``\texttt{Date:}'' header. A user may write the header himself, but normally the \NAME{MUA} does this job.
meillo@245 65
meillo@253 66 The body is the payload of the message. It is under full control of the user. From the viewpoint of the \SMTP\ protocol, it must consist of 7-bit \NAME{ASCII} text only. Arbitrary content can be included nevertheless by encoding it to 7-bit \NAME{ASCII}. \NAME{MIME} is the common standard for handling such conversion automatically in \NAME{MUA}s.
meillo@245 67
meillo@253 68 Following is a sample mail message with four header lines (\texttt{From:}, \texttt{To:}, \texttt{Date:}, and \texttt{Subject:}) and three lines of message body.
meillo@245 69
meillo@269 70 \codeinput{input/sample-email.txt}
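
The split into header and body at the first empty line can be sketched in a few lines of Python. The message text below is made up for illustration and is not the content of the included sample file; the function \verb+split_message()+ is a hypothetical helper. The sketch ignores header lines that are continued on the next line and header names that appear more than once.

```python
def split_message(raw: str):
    """Split a message at the first empty line into header fields and body."""
    head, _, body = raw.partition("\n\n")
    headers = {}
    for line in head.splitlines():
        # The header name is everything before the first colon.
        name, _, value = line.partition(":")
        headers[name.strip()] = value.strip()
    return headers, body


# Made-up message: four header lines, an empty line, three body lines.
SAMPLE = (
    "From: markus@host01\n"
    "To: alice@host02, bob@host03\n"
    "Date: Thu, 15 Jan 2009 12:00:00 +0100\n"
    "Subject: Lunch\n"
    "\n"
    "Hi,\n"
    "shall we meet at noon?\n"
    "Markus\n"
)

headers, body = split_message(SAMPLE)
```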
meillo@245 71
meillo@260 72 Email messages are put into envelopes for transfer. This concept is derived from the real world, so it is easy to understand. The envelope is used to route the message from sender to recipient. It contains the sender's address and the addresses of one or more recipients. Envelopes are generated by \MTA{}s, usually from mail header data. The user does not have to deal with them.
meillo@253 73
meillo@260 74 Each \MTA\ on the way reads envelopes it receives and generates new ones. If a message has recipients on different hosts, then the message gets copied and sent within multiple envelopes, one for each host.
meillo@260 75
meillo@260 76 The sample message would lead to two envelopes, one from \name{markus@host01} to \name{alice@host02}, the other from \name{markus@host01} to \name{bob@host03}. Both envelopes would contain the same message.
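
Generating one envelope per destination host amounts to grouping the recipients by the host part of their addresses. The following Python sketch shows this for the addresses named above; the function \verb+make_envelopes()+ and the representation of an envelope as a pair of sender and recipient list are assumptions made for illustration only.

```python
from collections import defaultdict


def make_envelopes(sender, recipients):
    """Return one (sender, recipients) envelope per destination host."""
    by_host = defaultdict(list)
    for rcpt in recipients:
        host = rcpt.rpartition("@")[2]   # the part after the last "@"
        by_host[host].append(rcpt)
    # Sort by host name so that the result has a stable order.
    return [(sender, rcpts) for _, rcpts in sorted(by_host.items())]


envelopes = make_envelopes("markus@host01", ["alice@host02", "bob@host03"])
```

Two recipients on the same host would share a single envelope, so the message would be transferred to that host only once.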
meillo@245 77
meillo@245 78
meillo@245 79
meillo@245 80
meillo@229 81
meillo@229 82
meillo@92 83 \section{The \masqmail\ project}
meillo@102 84 \label{sec:masqmail}
meillo@96 85
meillo@260 86 The \masqmail\ project was started by \person{Oliver Kurth} in 1999. His aim was to create a small \MTA\ that is especially focused on computers with dial-up Internet connections. Throughout the next four years, he worked steadily on it, releasing new versions every few weeks. In total there were 53 releases, which is on average a new version every 20 days.
meillo@96 87
meillo@260 88 This thesis is based on the latest release of \masqmail, version 0.2.21 from November 2005. It was released after a 28-month gap. The source code of 0.2.21 is the same as that of 0.2.20; only build documents were modified. The release tarball can be retrieved from the \debian\ package pool\footnote{The \NAME{URL} is: \url{http://ftp.de.debian.org/debian/pool/main/m/masqmail/masqmail\_0.2.21.orig.tar.gz}\,.} \citeweb{debian:packages}. It was probably published only in the \debian\ pool, because \masqmail's homepage \citeweb{masqmail:homepage2} does not include it.
meillo@96 89
meillo@257 90 \masqmail\ is covered by the \name{General Public License} (short: \GPL), which qualifies it as \freesw.
meillo@102 91
meillo@257 92 \person{Kurth} abandoned \masqmail\ after 2005, and no one has adopted the project since then. Thus, the author of this thesis decided to take responsibility for \masqmail\ now. He received \person{Kurth}'s permission to do so.
meillo@102 93
meillo@260 94 The program's new homepage \citeweb{masqmail:homepage} is a collection of available information about this \MTA.
meillo@102 95
meillo@102 96
meillo@96 97
meillo@92 98
meillo@257 99 \subsection{Target field of \masqmail}
meillo@266 100 \label{sec:masqmail-target-field}
meillo@245 101
meillo@257 102 The intention \person{Kurth} had when creating \masqmail\ is best told in his own words:
meillo@92 103 \begin{quote}
meillo@92 104 MasqMail is a mail server designed for hosts that do not have a permanent internet connection eg. a home network or a single host at home. It has special support for connections to different ISPs. It replaces sendmail or other MTAs such as qmail or exim.
meillo@257 105 \hfill\citeweb{masqmail:homepage2}
meillo@257 106 \end{quote}
meillo@257 107 It is intended to cover a specific niche: non-permanent Internet connections and different \NAME{ISP}s.
meillo@257 108
meillo@257 109 Although it can basically replace other \MTA{}s, it is not \emph{generally} aimed to do so. The package description of \debian\ states this more clearly by changing the last sentence to:
meillo@257 110 \begin{quote}
meillo@257 111 In these cases, MasqMail is a slim replacement for full-blown MTAs such as sendmail, exim, qmail or postfix.
meillo@257 112 \hfill\citeweb{packages.debian:masqmail}
meillo@257 113 \end{quote}
meillo@257 114 The program is a good replacement ``in these cases'', but not generally, since it lacks essential features for running on mail servers. Primarily, it is not secure enough to be accessible from untrusted locations.
meillo@257 115
meillo@257 116 \masqmail\ is best used in home networks, which are non-permanently connected to the Internet. It is easily configurable for situations which are rarely solvable with the common \MTA{}s. These include handling mail to local and to remote destinations differently, and respecting the different routes by which the host can be online. These features are explained in more detail in the following \name{Features} section on page \ref{sec:masqmail-features}. %fixme: is it still called ``features''?
meillo@257 117
meillo@257 118 While many other \MTA{}s are general purpose \MTA{}s, \masqmail\ aims at special situations. Nevertheless, it can be used as a general purpose \MTA, too. In fact, this was a design goal of \masqmail: to be a replacement for \sendmail, or similar well-known \MTA{}s.
meillo@257 119
meillo@257 120 \masqmail\ is designed to run on workstations and on servers in small networks, like home networks.
meillo@257 121
meillo@257 122
meillo@257 123
meillo@260 124 \subsubsection*{Typical usage scenarios}
meillo@257 125
meillo@269 126 This section describes three common setups that make sensible use of \masqmail. The first two are shown in figure \ref{fig:masqmail-typical-usage}.
meillo@257 127
meillo@257 128 \begin{figure}
meillo@257 129 \begin{center}
meillo@257 130 \includegraphics[scale=0.75]{img/masqmail-typical-usage.eps}
meillo@257 131 \end{center}
meillo@257 132 \caption{Typical usage scenarios for \masqmail}
meillo@257 133 \label{fig:masqmail-typical-usage}
meillo@257 134 \end{figure}
meillo@257 135
meillo@269 136 Imagine a home network that consists of some workstations and is connected to the Internet.
meillo@260 137
meillo@269 138 \begin{description}
meillo@269 139 \item[Scenario 1:]
meillo@269 140 If no server is present, every workstation would be equipped with \masqmail. Mail transfer within the same machine or within the local net works straightforwardly using direct transfer. Outgoing mail to the Internet is sent to an \name{Internet Service Provider} (short: \NAME{ISP}) for relaying whenever the router goes online. The configuration of \masqmail\ would be the same on every computer, except for the different hostnames.
meillo@269 141 To receive mail from the Internet requires a mailbox on the \NAME{ISP}'s mail server. Mail needs to be fetched from the \NAME{ISP}'s server onto the workstation using the \NAME{POP3} or \NAME{IMAP} protocol.
meillo@269 142
meillo@269 143 \item[Scenario 2:]
meillo@269 144 In the same network but with a server, one could have \masqmail\ running on the server and use simple forwarders (see \ref{subsec:relay-only}) on the workstations to transfer mail to the server. The server would then, depending on the destination of the message, deliver locally or relay to an \NAME{ISP}'s server for further relay. This setup only supports mail transfer to the server, but not back to a workstation. However, this can be solved by mounting the user's mailbox from the server on the workstation, or by using the \NAME{POP3} or \NAME{IMAP} protocol to fetch the mail in the server's mailbox from the workstations. Mail transfer from the \NAME{ISP} to the local server needs \NAME{POP3} or \NAME{IMAP} as well.
meillo@269 145
meillo@269 146 \item[Scenario 3:]
meillo@269 147 A third scenario is unrelated, as it is about notebooks. Notebooks are usually used as mobile workstations. One uses them to work at different locations. With the increasing popularity of wireless networks this gets more and more common. Different networks have different setups: In one network it is best to send mail to an \NAME{ISP} for relay. In another network it might be preferred to use a local mail server. A third network may have no Internet access at all, hence using a local mail server is required. All these different setups can be configured once and then used by simply telling \masqmail\ the online state, even automatically within a network setup script.
meillo@269 148 \end{description}
meillo@269 149
meillo@269 150
meillo@269 151
meillo@269 152 In general, all kinds of usage scenarios within a trusted network are possible. It is important to note that mail cannot be sent from outside into the trusted network then. For using \masqmail\ on notebooks it is suggested to only accept mail from local users, because notebooks are often in untrusted environments. This limitation leads to the next section.
meillo@257 153
meillo@257 154
meillo@257 155
meillo@257 156
meillo@257 157 \subsubsection*{Limitations}
meillo@257 158
meillo@260 159 Although \masqmail\ is seen as a replacement for other general purpose \MTA{}s, it should not be used on large mail servers. The reasons are that it implements only a basic subset of features, and that its performance and security do not meet the needs of such usage.
meillo@257 160
meillo@260 161 The author, \person{Kurth}, warns on the project's old website about using \masqmail\ to accept connections from the Internet, because of the risk of being an open relay:
meillo@257 162
meillo@257 163 \begin{quote}
meillo@257 164 MasqMail is not designed to run on a host with a permanent internet connection. It does not have the ability to check for spam mail and it will relay everything from everywhere to everywhere. Use another mail server such as exim for permanent connections.
meillo@257 165 \hfill\citeweb{masqmail:homepage2}
meillo@92 166 \end{quote}
meillo@92 167
meillo@269 168 The actual problem is not the permanent Internet connection, but listening for incoming mail on it. If a firewall is closed for incoming mail, then the permanent Internet connection is no problem. Otherwise, \masqmail\ should not be used on permanent Internet connections, or at least it needs to be secured with care.
meillo@160 169
meillo@269 170 The Internet is the common example of an untrusted network, but this applies to any other untrusted network, too.
meillo@160 171
meillo@160 172
meillo@160 173
meillo@245 174
meillo@245 175
meillo@245 176
meillo@245 177
meillo@245 178
meillo@245 179
meillo@245 180
meillo@245 181
meillo@245 182
meillo@245 183 \subsection{Features}
meillo@238 184
meillo@248 185 This section regards version 0.2.21 of \masqmail. This is the last version released by \person{Oliver Kurth}, and the basis for this thesis.
meillo@238 186
meillo@238 187
meillo@238 188 \subsubsection*{The source code}
meillo@238 189
meillo@238 190 \masqmail\ is written in the C programming language. The program, as of version 0.2.21, consists of 34 source code and eight header files, containing about 9,000 lines of code\footnote{Measured with \name{sloccount} by David A.\ Wheeler.}. Additionally, it includes a \name{base64} implementation (about 300 lines) and \name{md5} code (about 150 lines). For systems that do not provide \name{libident}, this library is distributed as well (circa 600 lines); an available shared library has higher precedence in linking, though.
meillo@238 191
meillo@238 192 The only mandatory dependency is \name{glib}, a cross-platform software utility library originating from the \NAME{GTK+} project. It provides safe replacements for many standard library functions, especially for the string functions. It also offers handy data containers, easy-to-use implementations of data structures, and much more.
meillo@238 193
meillo@260 194 Some functionality of \masqmail\ can be included or excluded at compile time by defining symbols. To enable maildir support, for example, one has to add \verb_--enable-maildir_ to the configure call. Otherwise the respective code gets removed during preprocessing.
meillo@260 195
meillo@260 196 With \masqmail\ comes the small tool \path{mservdetect}; it helps setting up a configuration that uses the \name{mserver} system to detect the online state. Two other binaries get compiled for testing purposes: \path{readtest} and \path{smtpsend}. All three programs use parts of \masqmail's source code; they only add a file with a \verb+main()+ function each.
meillo@238 197
meillo@238 198
meillo@238 199
meillo@238 200 \subsubsection*{Features}
meillo@238 201 \label{sec:masqmail-features}
meillo@238 202
meillo@260 203 \masqmail\ supports two channels for incoming mail: (1) Standard input, used when \path{masqmail} is executed on the command line and (2) a \NAME{TCP} socket, used by local or remote clients that talk \SMTP. The outgoing channels for mail are: (1) direct delivery to local mailboxes (in \name{mbox} or \name{maildir} format), (2) local pipes to pass mail to a program (e.g.\ gateways to \NAME{UUCP}, gateways to fax, or \NAME{MDA}s), and (3) \NAME{TCP} sockets to transfer mail to other \MTA{}s using the \SMTP\ protocol. Figure \ref{fig:masqmail-channels} shows this as a picture. (The ``online state'' input is explained a bit later.)
meillo@260 204
meillo@260 205 \begin{figure}
meillo@260 206 \begin{center}
meillo@260 207 \includegraphics[scale=0.75]{img/masqmail-channels.eps}
meillo@260 208 \end{center}
meillo@260 209 \caption{Incoming and outgoing channels of \masqmail}
meillo@260 210 \label{fig:masqmail-channels}
meillo@260 211 \end{figure}
meillo@238 212
meillo@238 213 Outgoing \SMTP\ connections feature \SMTP-\NAME{AUTH} and \SMTP-after-\NAME{POP} authentication, but incoming connections do not. Using wrappers for outgoing connections is supported. This allows encrypted communication through a gateway application like \name{openssl}.
meillo@238 214
meillo@238 215 Mail queuing and alias expansion are both supported.
meillo@238 216
meillo@260 217 The \masqmail\ executable can be called under various names for sendmail-compatibility reasons (see section \ref{sec:sendmail-compat}). This is organized by symbolic links with different names pointing to the \masqmail\ executable. The \sendmail\ names are \path{/usr/lib/sendmail} and \path{/usr/sbin/sendmail} because many programs expect the \mta\ to be located there. Furthermore, \sendmail\ supports calling it with a different name instead of supplying command line arguments. The best known of these shortcuts is \path{mailq}, which is equivalent to calling it with the argument \verb+-bp+. \masqmail\ recognizes the shortcuts \path{mailq}, \path{smtpd}, \path{mailrm}, \path{runq}, \path{rmail}, and \path{in.smtpd}. The first two are inspired by \sendmail. Not implemented is the shortcut \path{newaliases} because \masqmail\ does not generate binary representations of the alias file.\footnote{A shell script named \path{newaliases}, that invokes \texttt{masqmail -bi}, can provide the command to satisfy other software needing it.} For complete sendmail-compatibility, \path{hoststat} and \path{purgestat} are missing.
meillo@238 218 %masqmail: mailq, mailrm, runq, rmail, smtpd/in.smtpd
meillo@238 219 %sendmail: hoststat, mailq, newaliases, purgestat, smtpd
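
Selecting the behavior by the name under which a program was called works by inspecting the first element of the argument vector. The following Python sketch only illustrates the principle; it is not \masqmail's code, which is written in C. The table \verb+MODES+ and the function \verb+mode_for()+ are hypothetical, and the mode descriptions are rough labels assumed for illustration, not \masqmail's option names.

```python
import os

# Shortcut names as listed in the text. The descriptions are
# illustrative labels (assumptions), not masqmail's actual options.
MODES = {
    "mailq": "list queue",
    "runq": "run queue",
    "mailrm": "remove from queue",
    "rmail": "read mail from standard input",
    "smtpd": "speak SMTP on standard input",
    "in.smtpd": "speak SMTP on standard input",
}


def mode_for(argv0, default="act on command line arguments"):
    """Pick a mode from the name the program was invoked with."""
    return MODES.get(os.path.basename(argv0), default)
```

A symbolic link named \path{mailq} that points to the executable would thus select the queue listing, while a call as \path{/usr/sbin/sendmail} falls through to normal argument processing.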
meillo@238 220
meillo@238 221 In addition to the \mta\ job, \masqmail\ also offers mail retrieval services by being a \NAME{POP3} client. It can fetch mail from different remote locations, depending on the active online connection.
meillo@238 222
meillo@238 223
meillo@238 224
meillo@245 225 \subsubsection*{Online detection and routes}
meillo@245 226 \label{sec:masqmail-routes}
meillo@238 227
meillo@260 228 \masqmail\ focuses on non-permanent online connections, thus a concept of online routes is used. One may configure any number of routes to send mail. Each route can have criteria to determine if some message is allowed to be sent over it. This concept is explained in section \ref{sec:masqmail-routes} in detail. Mail to destinations outside the local network gets queued until an online connection is available.
meillo@260 229
meillo@257 230 \masqmail\ queues mail for destinations outside the local network if no connection to the internet is online. If the machine goes online, this mail is sent. Mail to local machines is sent immediately.
meillo@257 231
meillo@257 232 \masqmail\ sends mail to local destinations, like users on the same machine and on other machines in the local net, immediately. Email to recipients outside the local net is queued when offline and sent when an online connection gets established.
meillo@257 233
meillo@257 234 Furthermore, \masqmail\ respects online connections through different \NAME{ISP}s; a common thing for dial-up connections. In particular, different sender addresses can be set, depending on the \NAME{ISP} that is used. This makes it less likely that mail gets classified as spam.
meillo@245 235 ---
meillo@238 236
meillo@245 237 As \masqmail\ is focused on non-permanent Internet connections, online state can be queried by three methods: reading from a file, reading the output of a command, or asking an \name{mserver}. Each method may return a string indicating which of the available routes is online, or return nothing to indicate offline state.
meillo@238 238
meillo@245 239 Delivery to recipients on the local host or in local nets is done at once; delivery to recipients on the Internet is only done when online, and queued otherwise. Each online route may have a different mail server to which mail is relayed. Return address headers are modified appropriately if desired.
meillo@238 240
meillo@245 241 ---
meillo@238 242
meillo@245 243 \masqmail\ focuses on non-permanent online connections, thus a concept of online routes is used. One may configure any number of routes to send mail. Each route can have criteria, like matching \texttt{From:} or \texttt{To:} headers, to determine if some message is allowed to be sent over it. Mail to destinations outside the local network gets queued until an online connection is available.
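
Two of the three query methods, reading a file and reading the output of a command, and the resulting decision between immediate delivery, relaying, and queuing can be sketched as follows. This Python sketch is an illustration under assumptions: all function names are hypothetical, a missing file is treated as offline state, and the \name{mserver} method as well as per-route criteria are left out.

```python
import subprocess
from pathlib import Path


def online_route_from_file(path):
    """Return the route name stored in a file, or None when offline."""
    p = Path(path)
    if not p.exists():           # assumption: no file means offline
        return None
    return p.read_text().strip() or None


def online_route_from_command(argv):
    """Return the route name printed by a command, or None when offline."""
    result = subprocess.run(argv, capture_output=True, text=True)
    return result.stdout.strip() or None


def dispatch(recipient_host, local_hosts, route):
    """Decide what happens to a message for the given recipient host."""
    if recipient_host in local_hosts:
        return "deliver now"     # local mail never waits for a connection
    if route is None:
        return "queue"           # offline: keep the message for later
    return f"relay via {route}"  # online: use the server of this route
```

A network setup script could write the name of the active route into the file when a connection comes up and empty the file when it goes down.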
meillo@238 244
meillo@238 245
meillo@245 246
meillo@245 247
meillo@245 248
meillo@245 249
meillo@245 250
meillo@245 251
meillo@245 252
meillo@245 253
meillo@245 254 \section{Why \masqmail?}
meillo@92 255
meillo@92 256 As its main advantage, \masqmail\ makes it easy to set up an \MTA\ on workstations or notebooks without the need to do complex configuration or to be a mail server expert.
meillo@92 257
meillo@92 258 Workstations use %FIXME
meillo@92 259
meillo@96 260 \textbf{Alternatives?}
meillo@245 261 http://anfi.homeunix.org/sendmail/dialup10.html
meillo@92 262
meillo@92 263
meillo@245 264 << explain why masqmail is old and why it is interesting/important however! >>
meillo@96 265
meillo@175 266 << why is it worth to revive masqmail? >>
meillo@175 267
meillo@96 268
meillo@96 269
meillo@245 270
meillo@245 271
meillo@245 272
meillo@245 273
meillo@92 274 \section{Problems to solve}
meillo@92 275
meillo@245 276 << what problems has masqmail? >>
meillo@96 277
meillo@245 278 << what's the intention of this document? >>
meillo@96 279
meillo@245 280 << why is it worth the effort? >>
meillo@96 281
meillo@96 282
meillo@245 283
meillo@245 284
meillo@245 285
meillo@245 286 \section{Delimitation}
meillo@96 287
meillo@150 288 << limit against stuff not covered here >>
meillo@96 289
meillo@260 290 The \NAME{POP3} functionality of \masqmail\ is not regarded.
meillo@96 291
meillo@96 292
meillo@150 293