comparison thesis/tex/4-MasqmailsFuture.tex @ 177:7781ad0811f7

new sections, moved content, and more
author meillo@marmaro.de
date Fri, 26 Dec 2008 22:38:07 +0100
parents c51f1be54224
children b426a663d5f0
176:d4f818a4da04 177:7781ad0811f7
18 %sendmail: hoststat, mailq, newaliases, purgestat, smtpd 18 %sendmail: hoststat, mailq, newaliases, purgestat, smtpd
19 19
20 In addition to the \mta\ job, \masqmail\ also offers mail retrieval services by acting as a \NAME{POP3} client. It can fetch mail from different remote locations, depending on the active online route. 20 In addition to the \mta\ job, \masqmail\ also offers mail retrieval services by acting as a \NAME{POP3} client. It can fetch mail from different remote locations, depending on the active online route.
21 21
22 22
23
24 \subsubsection*{The code} 23 \subsubsection*{The code}
25 24
26 \masqmail\ is written in the C programming language. The program, as of version 0.2.21, consists of 34 source code and eight header files, containing about 9,000 lines of code\footnote{Measured with \name{sloccount} by David A.\ Wheeler.}. Additionally, it includes a \name{base64} implementation (about 300 lines) and \name{md5} code (about 150 lines). For systems that do not provide \name{libident}, this library is distributed as well (circa 600 lines); an available shared library however has higher precedence in linking. 25 \masqmail\ is written in the C programming language. The program, as of version 0.2.21, consists of 34 source code and eight header files, containing about 9,000 lines of code\footnote{Measured with \name{sloccount} by David A.\ Wheeler.}. Additionally, it includes a \name{base64} implementation (about 300 lines) and \name{md5} code (about 150 lines). For systems that do not provide \name{libident}, this library is distributed as well (circa 600 lines); an available shared library however has higher precedence in linking.
27 26
28 The only mandatory dependency is \name{glib}---a cross-platform software utility library, originated in the \NAME{GTK+} project. It provides safer replacements for many standard library functions. It also offers handy data containers, easy-to-use implementations of data structures, and much more. 27 The only mandatory dependency is \name{glib}---a cross-platform software utility library, originated in the \NAME{GTK+} project. It provides safer replacements for many standard library functions. It also offers handy data containers, easy-to-use implementations of data structures, and much more.
35 34
36 35
37 36
38 37
39 38
40 \section{\masqmail\ next generation} 39 \section{Requirements}
41 40
42 \subsection{Requirements} 41 This section identifies the requirements for future versions of \masqmail. Most of them will apply to modern \MTA{}s in general.
42
43 \subsection{General requirements}
43 44
44 Following is a list of current and future requirements to make \masqmail\ ready for the future. 45 Following is a list of current and future requirements to make \masqmail\ ready for the future.
45 46
46 47
47 \subsubsection*{Large message handling} 48 \subsubsection*{Security}
48 Trends in the market for electronic communication go towards consolidated communication, hence email will be used more to transfer voice and video messages. This leads to larger messages. The store-and-forward transport of email is not well suited for large data. Thus new protocols, like \NAME{QMTP} (described in section %\ref{FIXME} 49 \MTA{}s are critical points for computer security, as they are accessible from external networks. They must be secured with high effort. Properties like a high privilege level, work load influenced from outside, work on unsafe data, and the demand for reliability increase the security needed. Insecure and unreliable \mta{}s are of no value. \masqmail\ needs to be secure enough for its target field of operation.
49 ), may become popular. 50
51
52 \subsubsection*{Reliability}
53 << crash only software >>
54
55 << dont lose mail >>
56
57
58 \subsubsection*{Extendability}
59 Modern needs like large messages demand more efficient mail transport through the net. A final solution is needed as well to defeat the spam problem. New mail transport protocols seem to be the only good solutions for both problems. They can also improve reliability, authentication, and verification issues. \masqmail\ should be able to support new mail transfer protocols as they appear and are used.
60 %fixme: like old sendmail, but not too much like it
50 61
51 62
52 \subsubsection*{Resource-friendly software} 63 \subsubsection*{Resource-friendly software}
53 The merging of communication hardware and the move of email services from providers to homes demand smaller and more resource-friendly software. The amount of mail per system will be lower, even if much more mail is sent overall. More important will be energy consumption and heat emission. These topics have increased in relevance during the past years and they are expected to become more central. \masqmail\ is not a program to be used on large servers, but on small devices. Thus focusing on energy and heat, not on performance, is the direction to go. 64 The merging of communication hardware and the move of email services from providers to homes demand smaller and more resource-friendly software. The amount of mail per system will be lower, even if much more mail is sent overall. More important will be energy consumption and heat emission. These topics have increased in relevance during the past years and they are expected to become more central. \masqmail\ is not a program to be used on large servers, but on small devices. Thus focusing on energy and heat, not on performance, is the direction to go.
54
55
56 \subsubsection*{New mail transfer protocols}
57 Large messages demand more efficient transport through the net. A final solution is needed as well to defeat the spam problem. New mail transport protocols may be the only good solutions for both problems. They can also improve reliability, authentication, and verification issues. \masqmail\ should be able to support new protocols as they appear and are used.
58
59
60 \subsubsection*{Spam handling}
61 Spam is a major threat. According to the \NAME{SWOT} analysis, the goal is to reduce it to a bearable level. Spam fighting is a war where the good guys tend to lose. Putting too much effort there will result in little gain. Real success will only be possible with new---better---protocols and abandoning the weak legacy technologies. Hence \masqmail\ should be able to provide state-of-the-art spam protection, but not more.
62
63
64 \subsubsection*{Security}
65 \MTA{}s are critical points for computer security, as they are accessible from external networks. They must be secured with high effort. Properties like a high privilege level, work load influenced from outside, work on unsafe data, and the demand for reliability increase the security needed. Insecure and unreliable \mta{}s are of no value. \masqmail\ needs to be secure enough for its target field of operation.
66 65
67 66
68 \subsubsection*{Easy configuration} 67 \subsubsection*{Easy configuration}
69 Having \mta{}s on many home servers and clients requires easy and standardized configuration. The common setups should be configurable with single actions by the user. Complex configuration should be possible, but the focus must be on the most common form of configuration: choosing one of several standard setups. 68 Having \mta{}s on many home servers and clients requires easy and standardized configuration. The common setups should be configurable with single actions by the user. Complex configuration should be possible, but the focus must be on the most common form of configuration: choosing one of several standard setups.
70 69
116 All this leads to one logical step: the rewrite of \masqmail\ using a modern, modular architecture, to get a modern \MTA\ satisfying today's needs. 115 All this leads to one logical step: the rewrite of \masqmail\ using a modern, modular architecture, to get a modern \MTA\ satisfying today's needs.
117 116
118 117
119 118
120 119
120
121
122 http://fanf.livejournal.com/50917.html %how not to design an mta - the sendmail command
123 http://fanf.livejournal.com/51349.html %how not to design an mta - partitioning for security
124 http://fanf.livejournal.com/61132.html %how not to design an mta - local delivery
125 http://fanf.livejournal.com/64941.html %how not to design an mta - spool file format
126 http://fanf.livejournal.com/65203.html %how not to design an mta - spool file logistics
127 http://fanf.livejournal.com/65911.html %how not to design an mta - more about log-structured MTA queues
128 http://fanf.livejournal.com/67297.html %how not to design an mta - more log-structured MTA queues
129 http://fanf.livejournal.com/70432.html %how not to design an mta - address verification
130 http://fanf.livejournal.com/72258.html %how not to design an mta - content scanning
131
132
133
134
135
136
137
121 \subsection{Jobs of an MTA} 138 \subsection{Jobs of an MTA}
122 139
123 This section tries to identify the modules needed for a modern \MTA. They later become the pieces of which the new architecture is built. 140 This section tries to identify the modules needed for a modern \MTA. They later become the pieces of which the new architecture is built.
124 141
125 The basic job of an \mta\ is to transport mail from a sender to a recipient. This is the definition of such a program and this is how \person{Dent}\cite[page 19]{dent04} and \person{Hafiz} \cite[pages 3-5]{hafiz05} generally see its design. 142 The basic job of an \mta\ is to transport mail from a sender to a recipient. This is the definition of such a program and this is how \person{Dent}\cite[page 19]{dent04} and \person{Hafiz} \cite[pages 3-5]{hafiz05} generally see its design.
284 301
285 302
286 303
287 \subsubsection*{Spam prevention} 304 \subsubsection*{Spam prevention}
288 305
306 ---
307 Spam is a major threat nowadays and the goal is to reduce it to a bearable level (see section \ref{sec:swot-analysis}). Spam fighting is a war where the good guys tend to lose. Putting too much effort there will result in little gain. Real success will only be possible with new---better---protocols and abandoning the weak legacy technologies. Hence \masqmail\ should be able to provide state-of-the-art spam protection, but not more.
308 ---
309
289 Spam is a major threat to email, as described in section \ref{sec:swot-analysis}. The two main problems are forgeable sender addresses and that it is cheap to send hundreds of thousands of messages. Hence, spam senders can operate in disguise and at minimal cost. 310 Spam is a major threat to email, as described in section \ref{sec:swot-analysis}. The two main problems are forgeable sender addresses and that it is cheap to send hundreds of thousands of messages. Hence, spam senders can operate in disguise and at minimal cost.
290 311
291 As spam is not just a nuisance for end users, but also for the infrastructure---the \mta{}s---by increasing the amount of mail messages, \MTA{}s need to protect themselves. Two approaches are used. 312 As spam is not just a nuisance for end users, but also for the infrastructure---the \mta{}s---by increasing the amount of mail messages, \MTA{}s need to protect themselves. Two approaches are used.
292 313
293 The first is refusing spam during the \SMTP\ dialog. This is the way the designers of the \SMTP\ protocol intended it. They thought checking the sender and recipient mail addresses would be enough, but as these are forgeable it is not. More and more complex checks need to be done. Checking needs time, but \SMTP\ dialogs time out if it takes too long. Thus only limited time can be used, during the \SMTP\ dialog, for checking if a message seems to be spam. The advantage is that bad messages can simply be refused---no responsibility for the message is taken and no further system load is added. 314 The first is refusing spam during the \SMTP\ dialog. This is the way the designers of the \SMTP\ protocol intended it. They thought checking the sender and recipient mail addresses would be enough, but as these are forgeable it is not. More and more complex checks need to be done. Checking needs time, but \SMTP\ dialogs time out if it takes too long. Thus only limited time can be used, during the \SMTP\ dialog, for checking if a message seems to be spam. The advantage is that bad messages can simply be refused---no responsibility for the message is taken and no further system load is added. See \RFC2505 (especially section 1.5) for details.
294 315
295 The second is checking for spam after the mail was accepted and queued. Here more processing time can be invested, so more detailed checks can be done. But, as responsibility for messages was taken by accepting them, simply deleting spam mail is not an option. Checks for spam do not lead to sure results; they just indicate the possibility that the message is unwanted mail. \person{Eisentraut} indicates actions to take after a message is recognized as probably spam \cite[pages 18--20]{eisentraut05}. The only acceptable one, for mail the \MTA\ is responsible for, is adding further header lines or rewriting existing ones. Thus all further work on the message is the same as for non-spam messages. 316 The second is checking for spam after the mail was accepted and queued. Here more processing time can be invested, so more detailed checks can be done. But, as responsibility for messages was taken by accepting them, simply deleting spam mail is not an option. Checks for spam do not lead to sure results; they just indicate the possibility that the message is unwanted mail. \person{Eisentraut} indicates actions to take after a message is recognized as probably spam \cite[pages 18--20]{eisentraut05}. The only acceptable one, for mail the \MTA\ is responsible for, is adding further header lines or rewriting existing ones. Thus all further work on the message is the same as for non-spam messages.
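The header-only treatment of queued mail described above can be illustrated with a minimal C sketch. It is not taken from \masqmail; the function name \texttt{tag\_message}, the threshold, and the header names \texttt{X-Spam-Flag} and \texttt{X-Spam-Score} are assumptions chosen for illustration. The point it shows is that the message is passed on unchanged and only gains indication headers.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical threshold; a real setup would make this configurable. */
#define SPAM_THRESHOLD 5.0

/*
 * Copy `msg` (headers, empty line, body) to `out`, preceded by
 * indication header lines. The message is never dropped or changed,
 * because the MTA has already taken responsibility for it.
 * Returns 0 on success, -1 if `out` is too small.
 */
int tag_message(const char *msg, double score, char *out, size_t outlen)
{
    int n;

    assert(msg != NULL && out != NULL);
    n = snprintf(out, outlen,
                 "X-Spam-Flag: %s\r\nX-Spam-Score: %.1f\r\n%s",
                 score >= SPAM_THRESHOLD ? "YES" : "NO", score, msg);
    /* snprintf reports the length it would have needed. */
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

A delivery agent further down the chain can then sort on the added header lines, while the \MTA\ itself treats the message like any other.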
296 317
297 Modern \MTA{}s use both techniques in combination. Checks during the \SMTP\ dialog tend to be implemented in the \mta\ to make it fast; checks after the message was queued are often done using external programs (\name{spamassassin} is a well-known one). \person{Eisentraut} sees the checks during the \SMTP\ dialog as essential: ``Ganz ohne Analyse während der SMTP-Phase kommt sowieso kein MTA aus, und es ist eine Frage der Einschätzung, wie weit man diese Phase belasten möchte.''\cite[page 25]{eisentraut05} (translated: ``No \MTA\ can go without analysis during the \SMTP\ dialog, anyway, and it is a question of estimation how much to stress this period.'') 318 Modern \MTA{}s use both techniques in combination. Checks during the \SMTP\ dialog tend to be implemented in the \mta\ to make it fast; checks after the message was queued are often done using external programs (\name{spamassassin} is a well-known one). \person{Eisentraut} sees the checks during the \SMTP\ dialog as essential: ``Ganz ohne Analyse während der SMTP-Phase kommt sowieso kein MTA aus, und es ist eine Frage der Einschätzung, wie weit man diese Phase belasten möchte.''\cite[page 25]{eisentraut05} (translated: ``No \MTA\ can go without analysis during the \SMTP\ dialog, anyway, and it is a question of estimation how much to stress this period.'')
298 319
303 324
304 \subsubsection*{Virus checking} 325 \subsubsection*{Virus checking}
305 326
306 Related to spam is malicious content (short: \name{malware}) like viruses, worms, and trojan horses. They, in contrast to spam, do not affect the \MTA\ itself, as they are in the mail body. The equivalent situation in the real world is post offices opening letters to check if they contain something that could harm the recipient. This is not a mail transport concern. Apart from not being the right program to do the job, the \MTA\---the one which is responsible for the recipient---is in a good position to do this work. 327 Related to spam is malicious content (short: \name{malware}) like viruses, worms, and trojan horses. They, in contrast to spam, do not affect the \MTA\ itself, as they are in the mail body. The equivalent situation in the real world is post offices opening letters to check if they contain something that could harm the recipient. This is not a mail transport concern. Apart from not being the right program to do the job, the \MTA\---the one which is responsible for the recipient---is in a good position to do this work.
307 328
308 In any case, malware checking should be done by external programs that may be invoked by the \mta. But mail delivery and processing agents, like \name{procmail}, appear to be better suited locations to invoke content scanners. 329 In any case, malware checking should be done by external programs that may be invoked by the \mta. But mail delivery and processing agents, like \name{procmail}, seem to be better suited locations to invoke content scanners.
309 330
310 331 A popular email filter framework is \name{amavis}, which integrates various spam and virus scanners. The common setup includes a receiving \MTA\ which sends the mail to \name{amavis} using \SMTP; \name{amavis} processes the mail and then sends it to a second \MTA\ that does the outgoing transfer. \postfix\ and \exim\ can be configured so that one instance works as both the \MTA\ for incoming and for outgoing transfer. A setup with \sendmail\ needs two separate instances running. It must be guaranteed that all mail flows through the scanner.
311 332
312 AMaViS (amavisd-new): email filter framework to integrate spam and virus scanner 333 A future \masqmail\ would do well to have a single point through which all traffic flows and which is able to invoke external programs to do mail processing of any kind.
313 \begin{verbatim} 334
314 internet -->25 MTA -->10024 amavis -->10025 MTA --> reciptient 335
315 | | 336 %AMaViS (amavisd-new): email filter framework to integrate spam and virus scanner
316 +----------------------------+ 337 %\begin{verbatim}
317 \end{verbatim} 338 %internet -->25 MTA -->10024 amavis -->10025 MTA --> reciptient
318 339 %| |
319 postfix and exim can have both mta services in the same instance, sendmail needs two instances running. 340 %+----------------------------+
320 341 %\end{verbatim}
321 MailScanner: 342 %
322 incoming queue --> MailScanner --> outgoing queue 343 %postfix and exim can have both mta services in the same instance, sendmail needs two instances running.
323 344 %
324 postfix: with one instance possible, exim and sendmail need two instances running 345 %MailScanner:
346 %incoming queue --> MailScanner --> outgoing queue
347 %
348 %postfix: with one instance possible, exim and sendmail need two instances running
325 349
326 350
327 %message body <-> envelope, header 351 %message body <-> envelope, header
328 % 352 %
329 %anti-virus: clamav 353 %anti-virus: clamav
346 370
347 371
348 372
349 \subsubsection*{Archiving} 373 \subsubsection*{Archiving}
350 374
351 375 Mail archiving and auditability become more important as electronic mail becomes more important. The ability to archive verbatim copies of every mail coming into and going out of the system, with the relation between them, appears to be a goal worth achieving.
352 \texttt{always\_bcc} feature of postfix 376
353 377 \postfix, for example, has an \texttt{always\_bcc} feature to send a copy of every mail to a definable recipient. At least this functionality should be provided, although a more complete approach is preferable.
354 378
355 379
356 \section{A new architecture} 380
357 381 \section{Merging the parts}
358 382
359 (ssl) 383 The last sections identified the jobs that need to be done by a modern \MTA; problems and preferred choices were mentioned too. Now the various jobs are assigned to modules, from which an architecture is created. It is inspired by existing ones and driven by the identified jobs and requirements.
360 -> msg-in (local or remote protocol handlers) 384
361 -> spam-filter (and more) 385 The major design ideas were:
362 -> queue 386 \begin{itemize}
363 -> msg-out (local-delivery by MDA, or remote-protocol-handlers) 387 \item free the internal system from in and out channels
364 (ssl) 388 \item arbitrary protocol handlers have to be addable afterwards
365 389 \item a single facility for scanning (all mail goes through it)
366 390 \item concentrate on mail transfer
367 391 \end{itemize}
368 392
369 393 The result is a symmetric design, featuring any number of handlers for incoming connections to receive mail and pass it to the module that stores it in the incoming queue. A central scanning module takes mail from the incoming queue, processes it in various ways, and afterwards puts it into the outgoing queue. Another module takes it out of there and passes it to a matching transport module that transfers it to the destination. In other words, three main modules (queue-in, scanning, queue-out) are connected by the two queues (incoming, outgoing); on each end are more modules to receive and send mail---one for each protocol. Figure \ref{fig:masqmail-arch-new} depicts the newly designed architecture.
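The flow through the three main modules and the two queues can be sketched in C as follows. This is only an illustration under simplifying assumptions, not a proposed implementation: the queues are small in-memory FIFOs instead of spool directories, the modules are plain functions instead of separate programs, and all names (\texttt{struct queue}, \texttt{queue\_in}, \texttt{scan\_all}, \texttt{queue\_out}) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAP 16

/* A fixed-size FIFO standing in for an on-disk spool (assumption). */
struct queue {
    const char *msgs[QUEUE_CAP];
    size_t head, count;
};

static int queue_put(struct queue *q, const char *msg)
{
    assert(q != NULL);
    if (q->count == QUEUE_CAP)
        return -1;                      /* queue full */
    q->msgs[(q->head + q->count) % QUEUE_CAP] = msg;
    q->count++;
    return 0;
}

static const char *queue_get(struct queue *q)
{
    const char *msg;

    assert(q != NULL);
    if (q->count == 0)
        return NULL;                    /* queue empty */
    msg = q->msgs[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    return msg;
}

/* queue-in: called by any receiver module, whatever its protocol. */
int queue_in(struct queue *incoming, const char *msg)
{
    return queue_put(incoming, msg);
}

/* scanning: the single point through which all mail passes. */
size_t scan_all(struct queue *incoming, struct queue *outgoing)
{
    size_t moved = 0;
    const char *msg;

    /* Check for room first, so that no dequeued message is lost. */
    while (outgoing->count < QUEUE_CAP
           && (msg = queue_get(incoming)) != NULL) {
        /* External scanners would be invoked here. */
        queue_put(outgoing, msg);
        moved++;
    }
    return moved;
}

/* queue-out: returns the next message for a matching transport module. */
const char *queue_out(struct queue *outgoing)
{
    return queue_get(outgoing);
}
```

Because receiver and transport modules only ever touch \texttt{queue\_in} and \texttt{queue\_out}, a handler for a new protocol could be added in this sketch without changing the scanning step.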
370 http://fanf.livejournal.com/50917.html %how not to design an mta - the sendmail command 394
371 http://fanf.livejournal.com/51349.html %how not to design an mta - partitioning for security 395 \begin{figure}
372 http://fanf.livejournal.com/61132.html %how not to design an mta - local delivery 396 \begin{center}
373 http://fanf.livejournal.com/64941.html %how not to design an mta - spool file format 397 \input{input/masqmail-arch-new.tex}
374 http://fanf.livejournal.com/65203.html %how not to design an mta - spool file logistics 398 \end{center}
375 http://fanf.livejournal.com/65911.html %how not to design an mta - more about log-structured MTA queues 399 \caption{A newly designed architecture for \masqmail}
376 http://fanf.livejournal.com/67297.html %how not to design an mta - more log-structured MTA queues 400 \label{fig:masqmail-arch-new}
377 http://fanf.livejournal.com/70432.html %how not to design an mta - address verification 401 \end{figure}
378 http://fanf.livejournal.com/72258.html %how not to design an mta - content scanning 402
379 403
380 404
381 405 \subsection{Modules and queues}
382 406
383 407 \subsubsection*{queue-in}
384 408 \subsubsection*{scanning}
385 409 \subsubsection*{queue-out}
386 410
411 \subsubsection*{incoming queue}
412 \subsubsection*{outgoing queue}
413
414 \subsubsection*{receiver modules}
415 \subsubsection*{transport modules}
416
417
418
419 \subsection{Intermodule communication}
420
421
422
423 \subsection{Spool file format}
424
425
426 \subsection{Rights and permission}
387 427
388 428
389 429
390 430
391 431
397 437
398 438
399 \section{Directions to go} 439 \section{Directions to go}
400 440
401 This section discusses what shapes \masqmail\ could take---which directions the development could go in. 441 This section discusses what shapes \masqmail\ could take---which directions the development could go in.
402
403
404
405 442
406 443
407 \subsubsection*{\masqmail\ in five years} 444 \subsubsection*{\masqmail\ in five years}
408 445
409 Now what could \masqmail\ be like in, say, five years? 446 Now what could \masqmail\ be like in, say, five years?
416 453
417 << would one create it at all? >> 454 << would one create it at all? >>
418 455
419 --- 456 ---
420 457
421 << plans to get masqmail more popular again (if that is the goal) >> 458
422 459
423 << More users >> 460 \subsubsection*{Work to do}
424
425
426
427
428
429
430
431 \section{Work to do}
432 461
433 << short term goals --- long term goals >> 462 << short term goals --- long term goals >>
434 463
435 do it like sendmail: first do the most needed stuff on the old design to make it still usable. Then design a new version from scratch, for the future. 464 do it like sendmail: first do the most needed stuff on the old design to make it still usable. Then design a new version from scratch, for the future.
436 465