# HG changeset patch # User meillo@marmaro.de # Date 1233346800 -3600 # Node ID 80b2e476c2e3e55640cba26526c645272553f546 # Parent ba9463b43709c84e1a4263d9097b5dba431bec84 a lot of cleanup diff -r ba9463b43709 -r 80b2e476c2e3 thesis/tex/0-preface.tex --- a/thesis/tex/0-preface.tex Wed Jan 28 16:49:45 2009 +0100 +++ b/thesis/tex/0-preface.tex Fri Jan 30 21:20:00 2009 +0100 @@ -2,37 +2,25 @@ \chapter*{Preface} \addcontentsline{toc}{section}{Preface} -This thesis is about \masqmail, a small mail transfer agent for workstations and home networks. In October 2007 I chose \masqmail\ for my machines because it is a small but ``real'' mail transfer agent. \masqmail\ served me well since then and I found no reasons to change. +This thesis is about \masqmail, a small mail transfer agent for workstations and home networks. In October 2007 I chose \masqmail\ for my machines because it is a small though ``real'' mail transfer agent. \masqmail\ served me well since then and I found no reasons to change. -Unfortunately, the \masqmail\ package in \debian, which is my preferred \NAME{GNU}/Linux distribution, is unmaintained since the beginning of 2008. Unmaintained packages with critical bugs are likely to get dropped out of a distribution. Although \masqmail\ had no critical bugs, this was a situation I definitely wanted to prevent. +Unfortunately, the \masqmail\ package in \debian, which is my preferred \NAME{GNU}/Linux distribution, is unmaintained since the beginning of 2008. Unmaintained packages are likely to get dropped out of a distribution if critical bugs appear in them. Although \masqmail\ had no critical bugs, this was a situation I definitely wanted to prevent. Using my diploma thesis as a ``power-start'' for maintaining and developing \masqmail\ in the future was a great idea. As it came to my mind I knew this is the thing I \emph{wanted} to do. --- I did it! :-) +\quad + +The overall goal of this document is to revive \masqmail\ in usage and development. 
\masqmail\ was not developed in the last five years although the world of email changed during this time. Hence quite some work needed to be done. + +I chose to start down at the basis and analyze the environment and \masqmail\ throughout to end in concrete plans of what should be done and how it should be done to turn \masqmail\ into a modern mail transfer agent again. + +The actual implementation of the proposed changes goes beyond this thesis. Here are solutions identified, described, discussed, and recommended but not implemented. I did work in the code and have fixed bugs during the time I wrote on the thesis, though. \quad -The overall goal of this document is to revive \masqmail\ in usage and development. \masqmail\ was not developed in the last five years, but the world of email changed during this time. Hence quite some work needed to be done. +This document is primarily written with an audience of \masqmail\ developers and developers of other mail transfer agents in mind. But users of \masqmail\ and everyone who is interested in email systems in general may find this thesis an interesting literature. -I chose to start down at the basis and analyze the environment and \masqmail\ throughout to end in concrete plans of what should be done and how it should be done to turn \masqmail\ into a modern mail transfer agent again. - -The actual implementation of the the proposed changes goes beyond this thesis. Here are solutions identified, described, discussed, and recommended but not implemented. I did work in the code and have fix bugs during the time I wrote on the thesis, though. - - -\quad - -This document is primary written with an audience of \masqmail\ developers or developers of other mail transfer agents in mind. But users of \masqmail\ and everyone who is interested in email systems in general may find this thesis an interesting literature. 
- -However, at least basic knowledge about \unix\ and C programming is preconditioned in chapter three, four, and five. The required knowledge about \unix\ can be gained from \person{Kernighan} and \person{Pike}'s ``The \NAME{UNIX} Programming Environment'' \cite{kernighan84}. Programming in the C language is best learned from \person{Kernighan} and \person{Ritchie}'s ``The C Programming Language'' \cite{k&r}. - - - - - - -%fixme: << hikernet >> - - -%fixme: how to get the masqmail source code +However, at least basic knowledge about \unix\ and C programming is preconditioned in chapter three, four, and five. \person{Kernighan} and \person{Pike}'s ``The \NAME{UNIX} Programming Environment'' \cite{kernighan84} is a valuable source to gain information about \unix. Programming in the C language is best learned from \person{Kernighan} and \person{Ritchie}'s ``The C Programming Language'' \cite{k&r}. @@ -43,11 +31,11 @@ \section*{Organization} -Six chapters structure the document. Each one covers a delimited part of the overall topic and builds upon the knowledge and results of the previous ones. The first three chapters lead into the topic and create a solid base where the second part builds upon. Chapter four and five are the central part of the thesis as they focus on \masqmail. +Six chapters structure this document. Each one covers a delimited part of the overall topic and builds upon the knowledge and results of the previous ones. The first three chapters lead into the topic and create a solid base where the second part builds upon. Chapter four and five are the central part of the thesis as they focus on \masqmail. Chapter 1 \textbf{introduces} \masqmail\ to the reader. It presents the properties, goals, advantages, and problems of the program. Basic concepts of the email technology are also described and later assumed to be known. -Chapter 2 \textbf{analyzes the market} of electronic communication and email. 
This chapter provides a secure basis by showing that email will remain an important technology in the future. It tries to identify future trends too. +Chapter 2 \textbf{analyzes the market} of electronic communication and email. This chapter shows that email will remain an important technology in the future which is a precondition for investing effort into it. It tries to identify future trends too. Chapter 3 \textbf{deals with mail transfer agents} (\MTA{}s) which are the most important entities of the email transport structure. \MTA{}s are defined, classified, and important ones are presented and compared. @@ -81,7 +69,7 @@ \item Websites differ from documents as they are less of a text written by some author but more a place where information is gathered. They are only indicated by numbers, like for example: \citeweb{masqmail:homepage}. -\item \name{Request for Comments}---the documents that define the Internet---are referenced in a third way, by specifying the unique number of the \RFC\ directly: \RFC821. +\item \name{Request for Comments}---the documents that define the Internet---are referenced in a third way, by specifying the unique number of the \RFC\ directly: \RFC\,821. \end{enumerate} The Bibliography is located at the end of the thesis. It also includes a list of the relevant \RFC{}s and how they can be retrieved. diff -r ba9463b43709 -r 80b2e476c2e3 thesis/tex/1-Introduction.tex --- a/thesis/tex/1-Introduction.tex Wed Jan 28 16:49:45 2009 +0100 +++ b/thesis/tex/1-Introduction.tex Fri Jan 30 21:20:00 2009 +0100 @@ -1,7 +1,7 @@ \chapter{Introduction} \label{chap:introduction} -This chapter first introduces some basic email concepts that are essential to understand the rest of the thesis. Then \masqmail---the program of interest---is presented. History, typical usage, and the function it provides are described. After an explanation of \masqmail's worth, its problems are pointed out. 
These problems which are to solve are the topics that are covered throughout this thesis. +This chapter first introduces some basic email concepts that are essential for understanding the rest of the thesis. Then \masqmail---the program of interest---is presented. History, typical usage, and the function it provides are described. After an explanation of \masqmail's worth, its problems are pointed out. These problems which are to solve are the topics that are covered throughout this thesis. @@ -9,14 +9,14 @@ \section{Email prerequisites} -Electronic mail is a service on the Internet and thus, like other Internet services, defined and standardized by \RFC{}s under management of the \name{Internet Engineering Task Force} (short: \NAME{IETF}). \RFC{}s are highly technical documents and it is not expected that the readers of this thesis are familiar with them. +Electronic mail is a service on the Internet and thus, like other Internet services, defined and standardized by \RFC{}s under management of the \name{Internet Engineering Task Force} (short: \NAME{IETF}). \RFC{}s are highly technical documents and it is not required that the readers of this thesis are familiar with them. This section gives an introduction into the basic internals of the email system in a low-technical language. It is intended to make the reader familiar with the essential concepts of email. They are assumed to be known in the rest of the thesis. \subsubsection{Mail agents} -This thesis will frequently use the three terms: \MTA, \NAME{MUA}, and \NAME{MDA}. The name the three different kinds of software that are the nodes of the email infrastructure. Here they are explained with references to the snail mail system which is known from everyday life. Figure \ref{fig:mail-agents} shows the relation between those three mail agents and the way an email message takes trough the system. +This thesis will frequently use the three terms: \MTA, \NAME{MUA}, and \NAME{MDA}. 
They name the three different kinds of software that are the nodes of the email infrastructure. Here they are explained with references to the ``snail mail'' system which is known from everyday life. Figure \ref{fig:mail-agents} shows the relation between those three mail agents and the way an email message takes through the system. \begin{description} \item[\MTA:] @@ -26,7 +26,7 @@ \name{Mail User Agents} are the software the user deals with. He writes and reads email with it. The \NAME{MUA} passes outgoing mail to the nearest \MTA. Also the \NAME{MUA} displays the contents of the user's mailbox. Well known \NAME{MUA}s are \name{Mozilla Thunderbird} and \name{mutt} on \unix\ systems, and \name{Microsoft Outlook} on \name{Windows}. \item[\NAME{MDA}:] -\name{Mail Delivery Agents} correspond to postmen in the real world. They receive mail, destined to recipients they are responsible for, from an \MTA, and deliver it to the mailboxes of those recipients. Many \MTA{}s include an own \NAME{MDA}, but specialized ones exist: \name{procmail} and \name{maildrop} are examples. +\name{Mail Delivery Agents} correspond to postmen in the real world. They receive mail, destined to recipients they are responsible for, from an \MTA, and deliver it to the mailboxes of those recipients. Many \MTA{}s include an own \NAME{MDA}, but independent ones exist: \name{procmail} and \name{maildrop} are examples. \end{description} \begin{figure} @@ -44,23 +44,23 @@ \subsubsection{Mail transfer with SMTP} -Today most of the email is transferred using the \name{Simple Mail Transfer Protocol} (short: \SMTP), which is defined in \RFC821 and the successors \RFC2821 and \RFC5321. A good entry point for further information is \citeweb{wikipedia:smtp}. +Today most of the email is transferred using the \name{Simple Mail Transfer Protocol} (short: \SMTP), which is defined in \RFC\,821 and the successors \RFC\,2821 and \RFC\,5321. A good entry point for further information is \citeweb{wikipedia:smtp}. 
A selection of important concepts of \SMTP\ is explained here. First the \name{store and forward} transfer concept. This means mail messages are sent from \MTA\ to \MTA, until the final \MTA\ (the one which is responsible for the recipient) is reached. The message gets stored for some time on each \MTA, until it is forwarded to the next \MTA. -This leads to the concept of \name{responsibility}. A mail message is always in the responsibility of one system. First it is the \NAME{MUA}. After it was transferred to the first \MTA, it takes the responsibility for the message over. The \NAME{MUA} can then delete its copy of the message. This is the same for each transfer, from \MTA\ to \MTA\ and finally from \MTA\ to the \NAME{MDA}, the message gets transferred and if the transfer was successful, the responsibility for the message is transferred as well. The responsibility chain ends at a user's mailbox, where he himself has control on the message. +This leads to the concept of \name{responsibility}. A mail message is always in the responsibility of one system. First it is the \NAME{MUA}. When it is transferred to an \MTA, this \MTA\ takes over the responsibility for the message too. The \NAME{MUA} can then delete its copy of the message. This is the same for each transfer---from \MTA\ to \MTA\ and finally from \MTA\ to the \NAME{MDA}---the message gets transferred and if the transfer was successful, the responsibility for the message is transferred as well. The responsibility chain ends at a user's mailbox where he himself has control on the message. -A third concept is about failure handling. At any step on the way, an \MTA\ may receive a message it is unable to handle. In such a case, this receiving \MTA\ will \name{reject} the message before it takes responsibility for it. The sending \MTA\ still has responsibility for the message and may try other ways for sending the message. 
If none succeeds, the \MTA\ will send a \name{bounce message} back to the original sender with information on the type of failure. Bounces are only sent if the failure is expected to be permanent, or if the transfer still was unsuccessful after many tries. +A third concept is about failure handling. At any step on the way an \MTA\ may receive a message it is unable to handle. In such a case this receiving \MTA\ will \name{reject} the message before it takes responsibility for it. The sending \MTA\ still has responsibility for the message and may try other ways for sending the message. If none succeeds the \MTA\ will send a \name{bounce message} back to the original sender with information on the type of failure. Bounces are only sent if the failure is expected to be permanent or if the transfer still was unsuccessful after many tries. \subsubsection{Mail messages} -Mail messages consist of text in a specific format. This format is specified in \RFC822, and the successors \RFC2822 and \RFC5322. +Mail messages consist of text in a specific format. This format is specified in \RFC\,822, and the successors \RFC\,2822 and \RFC\,5322. -A message has two parts, the \name{header} and the \name{body}. The header of an email message is similar to the header of a (formal) letter. It spans the first lines of the message up to the first empty line. The header consists of several lines, called \name{header lines} or simply \name{headers}. They specify the sender, the address(es) of the recipient(s), the date, and possibly further information. Their order is irrelevant. Headers are named after the colon separated start of those lines, for example the ``\texttt{Date:}'' header. A user may write the header himself, but normally the \NAME{MUA} does this job. +A message has two parts, the \name{header} and the \name{body}. The header of an email message is similar to the header of a (formal) letter. It spans the first lines of the message up to the first empty line. 
The header consists of several lines, called \name{header lines} or simply \name{headers}. They specify the sender, the recipient(s), the date, and possibly further information. Their order is irrelevant. Headers are named like the colon-separated start of those lines, for example the ``\texttt{Date:}'' header. A user may write the header himself but normally the \NAME{MUA} does this job. The body is the payload of the message. It is under full control of the user. From the view point of the \SMTP\ protocol, it must consist of only 7-bit \NAME{ASCII} text. But arbitrary content can be included by encoding it to 7-bit \NAME{ASCII}. \NAME{MIME} is the common \SMTP\ extension to handle such conversion automatically in \NAME{MUA}s. @@ -68,7 +68,7 @@ \codeinput{input/sample-email.txt} -Email messages are put into envelopes for transfer. This concept is derived from the real world, so it is easy to understand. The envelope is used to route the message from sender to recipient. It contains the sender's address and addresses of one or more recipients. Envelopes are generated by \MTA{}s, usually by using mail header data. The user has not to deal with them. +Email messages are put into \name{envelopes} for transfer. This concept is also derived from the real world so it is easy to understand. The envelope is used to route the message from sender to recipient. It contains the sender's address and addresses of one or more recipients. Envelopes are generated by \MTA{}s, usually from mail header data. The user has not to deal with them. Each \MTA\ on the way reads envelopes it receives and generates new ones. If a message has recipients on different hosts, then the message gets copied and sent within multiple envelopes, one for each host. @@ -82,32 +82,32 @@ \section{The \masqmail\ project} \label{sec:masqmail} -The \masqmail\ project was by \person{Oliver Kurth} in 1999. 
His aim was to create a small \MTA\ that is especially focused on computers with dial-up Internet connections. Throughout the next four years, he worked steadily on it, releasing new versions every few weeks. In total it were 53 releases, which is in average a new version every 20 days. +The \masqmail\ project was started by \person{Oliver Kurth} in 1999. His aim was to create a small \MTA\ that is especially focused on computers with dial-up Internet connections. Throughout the next four years he worked steadily on it, releasing new versions every few weeks. In total it were 53 releases which is in average a new version every 20 days. -This thesis bases on the latest release of \masqmail---version 0.2.21 from November 2005. It was released after a 28 month gap. The source code of 0.2.21 is the same as of 0.2.20, only build documents were modified. The release tarball can be retrieved from the \debian\ package pool\footnote{The \NAME{URL} is: \url{http://ftp.de.debian.org/debian/pool/main/m/masqmail/masqmail_0.2.21.orig.tar.gz}\,.} \citeweb{packages.debian}. Probably was only put into public in the \debian\ pool because \masqmail's homepage \citeweb{masqmail:homepage2} does not include it. +This thesis bases on the latest release of \masqmail---version 0.2.21 from November 2005. It was released after a 28 month gap. The source code of 0.2.21 is the same as of 0.2.20, only build documents were modified. The release tarball can be retrieved from the \debian\ package pool\footnote{The \NAME{URL} is: \url{http://ftp.de.debian.org/debian/pool/main/m/masqmail/masqmail_0.2.21.orig.tar.gz}\,.} \citeweb{packages.debian}. It seems as if this version was only put into public there because \masqmail's homepage \citeweb{masqmail:homepage2} does not include it. -\masqmail\ is covered by the \name{General Public License} (short: \NAME{GPL}), which qualifies it as \freesw. 
+\masqmail\ is covered by the \name{General Public License} (short: \NAME{GPL}) which qualifies it as \freesw. -\person{Kurth} abandoned \masqmail\ after 2005, and no one adopted the project since then. Thus, the author of this thesis decided to take responsibility for \masqmail\ now. He received \person{Kurth}'s permission to do so. +\person{Kurth} abandoned \masqmail\ after 2005 and no one adopted the project since then. Thus, the author of this thesis decided to take responsibility for \masqmail\ now. He received \person{Kurth}'s permission to do so. The program's new homepage \citeweb{masqmail:homepage} is a collection of available information about this \MTA. -\subsection{Target field of \masqmail} +\subsection{Target field} \label{sec:masqmail-target-field} The intention \person{Kurth} had when creating \masqmail\ is best told in his own words: \begin{quote} -MasqMail is a mail server designed for hosts that do not have a permanent internet connection eg. a home network or a single host at home. It has special support for connections to different ISPs. It replaces sendmail or other MTAs such as qmail or exim. +MasqMail is a mail server designed for hosts that do not have a permanent internet connection eg. a home network or a single host at home. It has special support for connections to different \NAME{ISP}s. It replaces sendmail or other \MTA{}s such as qmail or exim. \hfill\citeweb{masqmail:homepage2} \end{quote} -It is intended to cover a specific niche: non-permanent Internet connection and different \NAME{ISP}s. +It is intended to cover a specific niche: non-permanent Internet connection and different \name{Internet Service Providers} (short: \NAME{ISP}s). -Although it can basically replace other \MTA{}s, it is not \emph{generally} aimed to do so. The package description of \debian\ states this more clearly by changing the last sentence to: +Although it can basically replace other \MTA{}s it is not \emph{generally} aimed to do so. 
The package description of \masqmail\ within \debian\ states this more clearly by changing the last sentence to: \begin{quote} -In these cases, MasqMail is a slim replacement for full-blown MTAs such as sendmail, exim, qmail or postfix. +In these cases, MasqMail is a slim replacement for full-blown \MTA{}s such as sendmail, exim, qmail or postfix. \hfill\citeweb{packages.debian:masqmail} \end{quote} The program is a good replacement ``in these cases'', but not generally, since it lacks essential features for running on mail servers. It is primarily not secure enough for being accessible from untrusted locations. diff -r ba9463b43709 -r 80b2e476c2e3 thesis/tex/2-MarketAnalysis.tex --- a/thesis/tex/2-MarketAnalysis.tex Wed Jan 28 16:49:45 2009 +0100 +++ b/thesis/tex/2-MarketAnalysis.tex Fri Jan 30 21:20:00 2009 +0100 @@ -230,7 +230,7 @@ -\subsection{Important in future} +\subsection{Importances in future} \label{sec:what-will-be-important} Provider independence through running an own mail server at home asks for easy configuration of the \MTA. Providers have specialists to configure the systems, but ordinary people do not. Solutions are either having some home service system for computer configuration established with specialists coming to one's home to set up the systems; like it is already common for problems with the power and water supply systems. Or configuration needs to be easy and foolproof, to be done by the owner himself. The latter solution depends on standardized parts that fit together seamlessly. The technology must not be a problem itself. Only settings custom to the user's environment should be left open for him to set. This of course needs to be doable using a simple configuration interface like a web interface. Non-technical educated users should be able to configure the system. 
diff -r ba9463b43709 -r 80b2e476c2e3 thesis/tex/4-MasqmailsFuture.tex --- a/thesis/tex/4-MasqmailsFuture.tex Wed Jan 28 16:49:45 2009 +0100 +++ b/thesis/tex/4-MasqmailsFuture.tex Fri Jan 30 21:20:00 2009 +0100 @@ -141,9 +141,9 @@ \label{fig:stunnel} \end{figure} -To provide encrypted incoming channels, the \MTA\ could implement encryption and listen on a port that is dedicated to encrypted \SMTP\ (\NAME{SMTPS}). This approach would be possible, but it is deprecated in favor for \NAME{STARTTLS}. \RFC3207 ``\SMTP\ Service Extension for Secure \SMTP\ over Transport Layer Security'' shows this by not mentioning \NAME{SMTPS} on port 465. Also port 465 is not even reserved for \NAME{SMTPS} anymore \citeweb{iana:port-numbers}. +To provide encrypted incoming channels, the \MTA\ could implement encryption and listen on a port that is dedicated to encrypted \SMTP\ (\NAME{SMTPS}). This approach would be possible, but it is deprecated in favor for \NAME{STARTTLS}. \RFC\,3207 ``\SMTP\ Service Extension for Secure \SMTP\ over Transport Layer Security'' shows this by not mentioning \NAME{SMTPS} on port 465. Also port 465 is not even reserved for \NAME{SMTPS} anymore \citeweb{iana:port-numbers}. -\NAME{STARTTLS}---defined in \RFC2487---is what \RFC3207 recommends to use for secure \SMTP. The connection then goes over port 25 (or the submission port 587), but gets encrypted as the \NAME{STARTTLS} keyword is issued. Email depends on compatibility---only encryption methods that client and server support can be used. Hence it is best to act after the recommendations of the \RFC\ documents. This means \NAME{STARTTLS} encryption should be supported for incoming and for outgoing connections. +\NAME{STARTTLS}---defined in \RFC\,2487---is what \RFC\,3207 recommends to use for secure \SMTP. The connection then goes over port 25 (or the submission port 587), but gets encrypted as the \NAME{STARTTLS} keyword is issued. 
Email depends on compatibility---only encryption methods that client and server support can be used. Hence it is best to act after the recommendations of the \RFC\ documents. This means \NAME{STARTTLS} encryption should be supported for incoming and for outgoing connections. diff -r ba9463b43709 -r 80b2e476c2e3 thesis/tex/5-Improvements.tex --- a/thesis/tex/5-Improvements.tex Wed Jan 28 16:49:45 2009 +0100 +++ b/thesis/tex/5-Improvements.tex Fri Jan 30 21:20:00 2009 +0100 @@ -282,7 +282,7 @@ The two approaches for spam handling were already presented to the reader in section \ref{sec:functional-requirements} as \RF\,8 and \RF\,9. Here they are described in more detail: \begin{enumerate} -\item Refusing spam during the \SMTP\ dialog. This is the way it was meant by the designers of the \SMTP\ protocol. They thought checking the sender and recipient mail addresses would be enough, but as they are forgeable it is not. More and more complex checks need to be done. Checking needs time, but \SMTP\ dialogs time out if it takes too long. Thus only limited time during the \SMTP\ dialog can be used for checking if a message seems to be spam. The advantage is that bad messages can simply get refused---no responsibility for the message is taken and no further system load is added. See \RFC2505 (especially section 1.5) for detail. +\item Refusing spam during the \SMTP\ dialog. This is the way it was meant by the designers of the \SMTP\ protocol. They thought checking the sender and recipient mail addresses would be enough, but as they are forgeable it is not. More and more complex checks need to be done. Checking needs time, but \SMTP\ dialogs time out if it takes too long. Thus only limited time during the \SMTP\ dialog can be used for checking if a message seems to be spam. The advantage is that bad messages can simply get refused---no responsibility for the message is taken and no further system load is added. See \RFC\,2505 (especially section 1.5) for detail. 
\item Checking for spam after the mail was accepted and queued. Here more processing time can be invested, thus more detailed checks can be done. But, as responsibility for messages was taken by accepting them, it is no choice to simply delete spam mail. Checks for spam do not lead to sure results, they just indicate the possibility the message is unwanted mail. \person{Eisentraut} indicates actions to take after a message is recognized as probably spam \cite[pages 18--20]{eisentraut05}. The only acceptable one, for mail the \MTA\ is responsible for, is adding further or rewriting existing header lines. Thus all further work on the message is the same as for non-spam messages. \end{enumerate}