view discussion.roff @ 234:eba3744fb238
Added my set of helper scripts.
Removes the spell makefile target as it was not use{d,ful} anyway.
Btw: I should have ran script/doubles before I printed the document. :-/
author | markus schnalke <meillo@marmaro.de> |
---|---|
date | Mon, 16 Jul 2012 11:23:30 +0200 |
.H0 "Discussion
.P
This main chapter discusses the practical work accomplished in the mmh project.
It is structured along the goals chosen for the project.
A selection of the work undertaken is described.
.P
This discussion compares the present version of mmh with the state of nmh at the time when the mmh project had started, i.e. fall 2011.
Recent changes in nmh are rarely part of the discussion.
.P
Whenever lines of code are counted, David A. Wheeler's \fIsloccount\fP was used to measure the amount in a comparable way.
.P
For the reader's convenience, the structure of modern email systems is depicted in the following figure.
It illustrates the path a message takes from sender to recipient.
.sp 1.5
.KS
.in 2c
.so input/mail-agents.pic
.KE
.sp 1.5
.LP
The ellipses denote mail agents, i.e. different jobs in email processing.
These are:
.IP "Mail User Agent (MUA)
The only program users directly interact with.
It includes functions to compose new mail, display received mail, and to manage the mail storage.
It is called a \fImail client\fP as well.
.IP "Mail Submission Agent (MSA)
A special kind of Mail Transfer Agent, used to submit mail into the mail transport system.
Often it is also called an MTA.
.IP "Mail Transfer Agent (MTA)
A node in the mail transport system.
It transfers incoming mail to a transport node nearer to the final destination.
An MTA may be the final destination itself.
.IP "Mail Delivery Agent (MDA)
Delivers mail according to a set of rules.
Usually, the messages are stored to disk.
.IP "Mail Retrieval Agent (MRA)
Initiates the transfer of mail from a remote location to the local machine.
(The dashed arrow in the figure represents the pull request.)
.LP
The dashed boxes represent entities that usually reside on single machines.
The box on the lower left represents the sender's system.
The box on the upper left represents the first mail transfer node.
The box on the upper right represents the transfer node responsible for the destination address.
The box on the lower right represents the recipient's system.
Often, the boxes above the dotted line are servers on the Internet.
Many mail clients, including nmh, include all of the components below the dotted line.
This is not the case for mmh; it implements the MUA only.
.\" --------------------------------------------------------------
.H1 "Streamlining
.P
MH once provided a complete email system.
The community around nmh tries to keep nmh in similar shape.
In fundamental contrast, mmh shall be an MUA only.
I believe that the development of all-in-one mail systems is obsolete.
Today, email is too complex to be fully covered by a single project.
Such a project will not be able to excel in all aspects.
Instead, the aspects of email should be covered by multiple projects, which then can be combined to form a complete system.
Excellent implementations for the various aspects of email already exist.
Just to name three examples: Postfix is a specialized MTA, Procmail is a specialized MDA, and Fetchmail is a specialized MRA.
I believe that it is best to use such specialized tools instead of providing the same function once more as a side component.
.P
Doing something well requires focusing on a small set of specific aspects.
Under the assumption that development which is focused on a particular area produces better results there, specialized projects will be superior in their field of focus.
Hence, all-in-one mail system projects \(en no matter if monolithic or modular \(en will never be the best choice in any of the fields.
Even in providing the most consistent all-in-one system, they are likely to be beaten by projects that focus exclusively on the creation of a homogeneous system by integrating existing mail components.
.P
Usually, the limiting resource in the community development of free software is manpower.
If the development effort is spread over a large development area, it becomes more difficult to compete with the specialists in the various fields.
The concrete situation for MH-based mail systems is even tougher, given their small and aged community, concerning both developers and users.
.P
In consequence, I believe that the available development resources should focus on the point where MH is most distinctive.
This is clearly the user interface \(en the MUA.
Peripheral parts should be removed to streamline mmh for the MUA task.
.H2 "Mail Transfer Facilities
.Id mail-transfer-facilities
.P
The removal of the mail transfer facilities, effectively dropping the MSA and MRA, was the first work task in the mmh project.
The desire for this change initiated the creation of the mmh project.
.P
Focusing on one mail agent role only is motivated by Eric Allman's experience with Sendmail.
He identified the limitation of Sendmail to the MTA task as one reason for its success:
.[ [
costales sendmail
.], p. xviii]
.QS
Second, I limited myself to the routing function \(en I wouldn't write user agents or delivery back-ends.
This was a departure of the dominant thought of the time, in which routing logic, local delivery, and often the network code were incorporated directly into the user agents.
.QE
.P
In nmh, the MSA is called \fIMessage Transfer Service\fP (MTS).
This facility, implemented by the
.Pn post
command, established network connections and spoke SMTP to submit messages to be relayed to the outside world.
When email transfer changed, this part needed to be changed as well.
Encryption and authentication for network connections needed to be supported, hence TLS and SASL were introduced into nmh.
This added complexity without improving the core functions.
Furthermore, keeping up with recent developments in the field of mail transfer requires development power and specialists.
In mmh, this whole facility was simply cut off
.Ci f6aa95b724fd8c791164abe7ee5468bf5c34f226
.Ci fecd5d34f65597a4dfa16aeabea7d74b191532c3
.Ci 156d35f6425bea4c1ed3c4c79783dc613379c65b .
Instead, mmh depends on an external MSA.
All outgoing mail in mmh goes through the
.Pn sendmail
command, which almost any MSA provides.
If not, a wrapper program can be written.
It must read the message from the standard input, extract the recipient addresses from the message header, and hand the message over to the MSA.
For example, a wrapper script for qmail would be:
.VS
#!/bin/sh
exec qmail-inject  # ignore command line arguments
VE
The requirement to parse the recipient addresses out of the message header may be removed in the future.
Mmh could pass the recipient addresses as command line arguments.
This appears to be the better interface.
.P
To retrieve mail, the
.Pn inc
command in nmh acts as MRA.
It establishes network connections and speaks POP3 to retrieve mail from remote servers.
As with mail submission, the network connections required encryption and authentication, thus TLS and SASL were added to nmh.
Support for message retrieval through IMAP will soon become a necessary addition, too, and the same holds for any other change in mail transfer.
This does not apply to mmh, because it has dropped the support for retrieving mail from remote locations
.Ci ab7b48411962d26439f92f35ed084d3d6275459c .
Instead, it depends on an external tool to cover this task.
Mmh has two paths for messages to enter mmh's mail storage:
(1) Mail can be incorporated with
.Pn inc
from the system maildrop, or (2) with
.Pn rcvstore
by reading them, one at a time, from the standard input.
.P
With the removal of the MSA and MRA, mmh converted from a complete mail system to only an MUA.
Now, of course, mmh depends on third-party software.
An external MSA is required to transfer mail to the outside world; an external MRA is required to retrieve mail from remote machines.
Excellent implementations of such software exist.
They likely are superior to the internal versions that were removed.
Additionally, the best-suited programs can be chosen freely.
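.P
To illustrate what a wrapper for an MSA without header parsing would have to do, the following is a minimal sketch of the recipient extraction step.
The function name
.Pn extract_recipients
is hypothetical and not part of mmh.
The sketch assumes that only the To, Cc, and Bcc header fields are relevant and that addresses are separated by commas; quoted display names containing commas and group syntax are not handled.

```shell
#!/bin/sh
# Hypothetical helper (not part of mmh): print one recipient address
# per line, taken from the To, Cc and Bcc header fields of the
# message on standard input.
# Simplification: addresses are split at commas only.
extract_recipients() {
	awk '
	/^$/ { exit }                              # blank line ends the header
	/^[ \t]/ { if (keep) buf = buf $0; next }  # folded continuation line
	{
		keep = 0
		if (tolower($0) ~ /^(to|cc|bcc):/) {
			keep = 1
			sub(/^[^:]*:/, "")                 # strip the field name
			buf = buf "," $0
		}
	}
	END {
		n = split(buf, part, ",")
		for (i = 1; i <= n; i++) {
			addr = part[i]
			if (addr ~ /<.*>/) {               # "Name <user@host>" form
				sub(/^.*</, "", addr)
				sub(/>.*$/, "", addr)
			}
			gsub(/[ \t]/, "", addr)
			if (addr != "") print addr
		}
	}'
}
```

A wrapper built on this sketch would first save the message to a temporary file, then run the function on that file, and finally hand both the address list and the message to the injection command of the MSA in use.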
.P
As it had already been possible to use an external MSA and MRA, why should the internal version not be kept for convenience?
Transferred to a different area, the question, whether there is sense in having a fall-back pager in all the command line tools for the cases when
.Pn more
or
.Pn less
are not available, appears to be ridiculous.
Of course, MSAs and MRAs are more complex than text pagers and not necessarily available, but still the concept of orthogonal design holds:
``Write programs that do one thing and do it well''.
.[
mcilroy unix phil p. 53
.]
.[
mcilroy bstj foreword
.]
Here, this part of the Unix philosophy was applied not only to the programs but to the project itself.
In other words: Develop projects that focus on one thing and do it well.
Projects which have grown complex should be split, for the same reasons that programs which have grown complex should be split.
If it is conceptually more elegant to have the MSA and MRA as separate projects, then they should be separated.
In my opinion, this is the case.
The RFCs suggest this separation by clearly distinguishing the different mail handling tasks [RFC\|821].
The small interfaces between the mail agents support the separation as well.
.P
Once, email had been small and simple.
At that time,
.Pn /bin/mail
had covered everything there was to email and still was small and simple.
Later, the essential complexity of email increased.
(Essential complexity is the complexity defined by the problem itself
.[ [
brooks no silver bullet
.]].)
Consequently, email systems grew.
RFCs started to introduce the concept of mail agents to separate the various roles because they became more extensive and because new roles appeared.
As mail system implementations grew, parts of them were split off.
For instance, a POP server was included in the original MH; it was removed in nmh.
Now is the time to go one step further and split off the MSA and MRA, as well.
Not only does this decrease the code size of the project, more importantly, it unburdens mmh of the whole field of message transfer, with all its implications for the project.
There is no more need for concern with changes in network transfer.
This independence is gained by depending on external components that cover the field.
.P
In general, functionality can be added in three different ways:
.LI 1
By implementing the function in the project itself.
.LI 2
By depending on a library that provides the function.
.LI 3
By depending on a program that provides the function.
.LP
While implementing the function in the project itself leads to the largest increase in code size and requires the most maintenance and development work, it keeps the project's dependence on other software lowest.
Using libraries or external programs requires less maintenance work but introduces dependencies on external projects.
Programs have the smallest interfaces and provide the best separation, but possibly limit the information exchange.
External libraries are more strongly connected than external programs, thus information can be exchanged in a more flexible manner.
Obviously, adding code to a project increases the maintenance work.
As implementing complex functions in the project itself adds a lot of code, this should be avoided if possible.
Thus, the dependencies only change in their character, not in their existence.
In mmh, library dependencies on
.Pn libsasl2
and
.Pn libcrypto /\c
.Pn libssl
were traded against program dependencies on an MSA and an MRA.
This also meant trading build-time dependencies against run-time dependencies.
Besides providing stronger separation and greater flexibility, program dependencies also allowed over 6\|000 lines of code to be removed from mmh.
This made mmh's code base about 12\|% smaller.
Reducing the project's code size by such an amount without actually losing functionality is a convincing argument.
Actually, as external MSAs and MRAs are likely superior to the project's internal versions, the common user even gains functionality.
.P
Users of MH should not have problems setting up an external MSA and MRA.
Also, the popular MSAs and MRAs have large communities and a lot of available documentation.
Choices for MSAs range from small forwarders such as \fIssmtp\fP and \fInullmailer\fP, through mid-size MTAs including \fImasqmail\fP and \fIdma\fP, up to full-featured MTAs such as \fIPostfix\fP.
MRAs are provided for example by \fIfetchmail\fP, \fIgetmail\fP, \fImpop\fP, and \fIfdm\fP.
.H2 "Non-MUA Tools
.P
One goal of mmh is to remove the tools that do not significantly contribute to the MUA's job.
Loosely related and rarely used tools distract from a lean appearance, and require maintenance work without adding much to the core task.
By removing these tools, mmh became more streamlined and focused.
.BU
.Pn conflict
was removed
.Ci 8b235097cbd11d728c07b966cf131aa7133ce5a9
because it is a mail system maintenance tool and not MUA-related.
It even checked
.Fn /etc/passwd
and
.Fn /etc/group
for consistency, which is completely unrelated to email.
A tool like
.Pn conflict
is surely useful, but it should not be shipped with mmh.
.BU
.Pn rcvtty
was removed
.Ci 14767c94b3827be7c867196467ed7aea5f6f49b0
because its use case of writing to the user's terminal on reception of mail is obsolete.
If users like to be informed of new mail, the shell's
.Ev MAILPATH
variable or graphical notifications are technically more appealing.
Writing to terminals directly is hardly ever desired today.
If, though, one prefers this approach, the standard tool
.Pn write
can be used in a way similar to:
.VS
scan -file - | write `id -un`
VE
.BU
.Pn viamail
was removed
.Ci eda72d6a7a7c20ff123043fb7f19c509ea01f932
when the new attachment system was activated, because
.Pn forw
could then cover the task itself.
The
.Pn sendfiles
shell script was rewritten as a wrapper around
.Pn forw
.Ci 0e82199cf3c991a173e0ac8aa776efdb3ded61e6 .
.BU
.Pn msgchk
was removed
.Ci bb9360ead7eb7a3fedcce2eeedfc660014e41dbe ,
because it lost its use case when POP support was removed.
A call to
.Pn msgchk
provided hardly more information than:
.VS
ls -l /var/mail/meillo
VE
It did distinguish between old and new mail, but these details can be retrieved with
.Pn stat (1),
too.
A small shell script could be written to print the information in a similar way, if truly necessary.
As mmh's
.Pn inc
only incorporates mail from the user's local maildrop, and thus no data transfers over slow networks are involved, there is hardly any need to check for new mail before incorporating it.
.BU
.Pn msh
was removed
.Ci 916690191222433a6923a4be54b0d8f6ac01bd02
because the tool was in conflict with the philosophy of MH.
It provided an interactive shell to access the features of MH.
However, it was not just a shell tailored to the needs of mail handling, but one large program that had several MH tools built in.
This conflicted with the major feature of MH of being a tool chest.
.Pn msh 's
main use case had been accessing Bulletin Boards, which have ceased to be popular.
.P
Removing
.Pn msh
together with the truly archaic code relics
.Pn vmh
and
.Pn wmh
saved more than 7\|000 lines of C code \(en about 15\|% of the project's original source code amount.
Having less code \(en with equal readability, of course \(en for the same functionality is an advantage.
Less code means fewer bugs and less maintenance work.
As
.Pn rcvtty
and
.Pn msgchk
are assumed to be rarely used and can be implemented in different ways, why should one keep them?
Removing them streamlined mmh.
.Pn viamail 's
use case is now partly obsolete and partly covered by
.Pn forw ,
hence there is no reason to still maintain it.
.Pn conflict
is not related to the mail client, and
.Pn msh
conflicts with the basic concept of MH.
These two tools might still be useful, but they should not be part of mmh.
.P
.Id slocal
Finally, there is
.Pn slocal ,
which is an MDA and thus not directly MUA-related.
It should be removed from mmh because including it conflicts with the idea that mmh is an MUA only.
However,
.Pn slocal
provides rule-based processing of messages, like filing them into different folders, which is otherwise not available in mmh.
Although
.Pn slocal
neither pulls in dependencies nor includes a separate technical area (cf. Sec.
.Cf mail-transfer-facilities ),
it still accounts for about 1\|000 lines of code that need to be maintained.
As
.Pn slocal
is almost self-standing, it should be split off into a separate project.
This would cut the strong connection between the MUA mmh and the MDA
.Pn slocal .
For anyone not using MH,
.Pn slocal
would become yet another independent MDA, like
.I procmail .
Then
.Pn slocal
could be installed without a complete MH system.
Likewise, mmh users could decide to use
.I procmail
without having a second, unused MDA, i.e.
.Pn slocal ,
installed.
That appears to be conceptually the best solution.
Yet,
.Pn slocal
has not been split off.
I defer the decision over
.Pn slocal
out of a need for deeper investigation.
In the meantime, it remains part of mmh as its continued existence is not significant;
.Pn slocal
is unrelated to the rest of the project.
.H2 "Displaying Messages
.Id mhshow
.P
Since the very beginning, already in the first concept paper,
.[
original memo rand mh shapiro gaines
.]
.Pn show
had been MH's message display program.
.Pn show
mapped message numbers and sequences to files and invoked
.Pn mhl
to have the files formatted.
With MIME, this approach was not sufficient anymore.
MIME messages can consist of multiple parts.
Some parts, like binary attachments or text content in foreign charsets, are not directly displayable.
.Pn show 's
understanding of messages and
.Pn mhl 's
display capabilities could not cope with the task any longer.
.P
Instead of extending these tools, additional tools were written from scratch and were added to the MH tool chest.
Doing so is encouraged by the tool chest approach.
Modular design is a great advantage for extending a system, as new tools can be added without interfering with existing ones.
First, the new MIME features were added in form of the single program
.Pn mhn .
The command
.Cl "mhn -show 42
then showed the message number
.Fn 42 ,
interpreting MIME.
With the 1.0 release of nmh in February 1999, Richard Coleman finished the split of
.Pn mhn
into a set of specialized tools, which together covered the multiple aspects of MIME.
One of them was
.Pn mhshow ,
which replaced
.Cl "mhn -show" .
It was capable of displaying MIME messages appropriately.
.P
.ZZ
From then on, two message display tools were part of nmh,
.Pn show
and
.Pn mhshow .
To ease the life of users,
.Pn show
was extended to automatically hand the job over to
.Pn mhshow
if displaying the message would be beyond
.Pn show 's
abilities.
In consequence, the user would simply invoke
.Pn show
(possibly through
.Pn next
or
.Pn prev )
and get the message printed with either
.Pn show
or
.Pn mhshow ,
whichever was more appropriate.
.P
Having two similar tools for basically the same task is redundancy.
Usually, users do not distinguish between
.Pn show
and
.Pn mhshow
in their daily mail reading.
Having two separate display programs was therefore unnecessary from a user's point of view.
Besides, the development of both programs needed to be in sync, to ensure that the programs behaved in a similar way, because they were used like a single tool.
Different behavior would have surprised the user.
.P
Today, non-MIME messages are rather seen as a special case of MIME messages, although it is the other way round.
As
.Pn mhshow
already had been able to display non-MIME messages, it appeared natural to drop
.Pn show
in favor of using
.Pn mhshow
exclusively
.Ci 4c1efddfd499300c7e74263e57d8aa137e84c853 .
Removing
.Pn show
is no loss in function, because
.Pn mhshow
covers it completely.
Yet, the old behavior of
.Pn show
can still be emulated with the simple command line:
.VS
mhl `mhpath c`
VE
.P
For convenience,
.Pn mhshow
was renamed to
.Pn show
after
.Pn show
was gone.
It is clear that such a rename may confuse future developers when trying to understand the history.
Nevertheless, I consider the convenience on the user's side to outweigh the inconvenience for understanding the evolution of the tools.
.P
To prepare for the transition,
.Pn mhshow
was reworked to behave more like
.Pn show
first (cf. Sec.
.Cf mhshow ).
Once the tools behaved more alike, the replacement appeared to be even more natural.
Today, mmh's new
.Pn show
has become the one single message display program once again, with the difference that today it handles MIME messages as well as non-MIME messages.
The outcomes of the transition are one program less to maintain, no second display program for users to deal with, and less system complexity.
.P
Still, removing the old
.Pn show
hurts in one regard: It had been such a simple program.
Its lean elegance is missing from the new
.Pn show ,
but there is no alternative; supporting MIME demands higher essential complexity.
.H2 "Configure Options
.P
Customization is a double-edged sword.
It allows better-suited setups, but not for free.
There is the cost of code complexity to be able to customize.
There is the cost of less tested setups, because there are more possible setups and especially corner cases.
Steve Johnson confirms:
.[ [
eric raymond the art of unix programming
.], p. 233]
.QS
Unless it is done very carefully, the addition of an on/off configuration option can lead to a need to double the amount of testing.
Since in practice one never does double the amount of testing, the practical effect is to reduce the amount of testing that any given configuration receives.
Ten options leads to 1024 times as much testing, and pretty soon you are talking real reliability problems.
.QE
.LP
Additionally, there is the cost of choice itself.
The code complexity directly affects the developers.
Less tested code affects both users and developers.
The problem of choice affects the users, first by having to choose and second by more complex interfaces that require more documentation.
Whenever options add few advantages but increase the complexity of the system, they should be considered for removal.
I have reduced the number of project-specific configure options from 15 to 3.
.U3 "Mail Transfer Facilities
.P
With the removal of the mail transfer facilities (cf. Sec.
.Cf mail-transfer-facilities )
five configure options vanished:
.P
The switches
.Sw --with-tls
and
.Sw --with-cyrus-sasl
had activated the support for transfer encryption and authentication.
They are not needed anymore.
.Ci fecd5d34f65597a4dfa16aeabea7d74b191532c3
.Ci 156d35f6425bea4c1ed3c4c79783dc613379c65b
.P
The configure switch
.Sw --enable-pop
had activated the message retrieval facility.
Whereas the code area that had been conditionally compiled in for TLS and SASL support was small, the conditionally compiled code area for POP support was much larger.
The code base had only changed slightly on toggling TLS or SASL support but it had changed much on toggling POP support.
It was hardly possible to keep an overview of the changes in the code base.
By having POP support togglable, a second code base had been created, one that needed to be tested.
This situation is basically similar for the conditional TLS and SASL code, but there the changes are minor and can still be kept track of.
Still, conditional compilation of a code base creates variations of the original program.
More variations require more testing and maintenance work.
.P
Two other options had only specified default configuration values:
.Sw --with-mts
defined the default transport service
.Ci f6aa95b724fd8c791164abe7ee5468bf5c34f226 .
With
.Sw --with-smtpservers
default SMTP servers could be set
.Ci 128545e06224233b7e91fc4c83f8830252fe16c9 .
Both of them became irrelevant when the SMTP transport service was removed.
In mmh, all messages are handed over to
.Pn sendmail
for transportation.
.U3 "Backup Prefix
.P
The backup prefix is the string that was prepended to message filenames to tag them as deleted.
By default it had been the comma character (`\fL,\fP').
In July 2000, Kimmo Suominen introduced the configure option
.Sw --with-hash-backup
to change the default to the hash character `\f(CW#\fP'.
This choice was probably a personal preference, but, whether related or not, words that start with the hash character introduce a comment in the Unix shell.
Thus, the command line
.Cl "rm #13 #15
calls
.Pn rm
without arguments because the first hash character starts a comment that reaches until the end of the line.
To delete the backup files,
.Cl "rm ./#13 ./#15"
needs to be used.
Thus, using the hash as backup prefix may be seen as a precaution against backup loss.
.P
First, I removed the configure option but added the profile entry
.Pe Backup-Prefix ,
which allowed specifying an arbitrary string as backup prefix
.Ci 6c40d481d661d532dd527eaf34cebb6d3f8ed086 .
This change did not remove the choice but moved it to a location where it suited better, in my eyes.
.P
Eventually, however, the new trash folder concept (cf. Sec.
.Cf trash-folder )
removed the need for the backup prefix completely.
.Ci 8edc5aaf86f9f77124664f6801bc6c6cdf258173
.Ci ca0b3e830b86700d9e5e31b1784de2bdcaf58fc5
.U3 "Editor and Pager
.Id editor-pager
.P
The two configure options
.CW --with-editor=EDITOR
.CW --with-pager=PAGER
were used to specify the default editor and pager at configure time.
Doing so at configure time made sense in the eighties, when the set of available editors and pagers varied much across different systems.
Today, the situation is more homogeneous.
The programs
.Pn vi
and
.Pn more
can be expected to be available on every Unix system, as they have been specified by POSIX for two decades.
(The specifications for
.Pn vi
and
.Pn more
appeared in
.[
posix 1987
.]
and
.[
posix 1992
.],
respectively.)
As a first step, these two tools were hard-coded as defaults
.Ci 5d43a99db70c12a673028c7758c20cbe3e13ef5f .
Not changed were the
.Pe editor
and
.Pe moreproc
profile entries, which allowed the user to override the system defaults.
Later, the concept was reworked again to respect the standard environment variables
.Ev VISUAL
and
.Ev PAGER
if they are set.
Today, mmh determines the editor to use in the following order, taking the first available and non-empty item
.Ci f85f4b7ae62e3d05a945dcd46ead51f0a2a89a9b :
.LI 1
Environment variable
.Ev MMHEDITOR
.LI 2
Profile entry
.Pe Editor
.LI 3
Environment variable
.Ev VISUAL
.LI 4
Environment variable
.Ev EDITOR
.LI 5
Command
.Pn vi .
.LP
The pager to use is determined in a similar order
.Ci 0c4214ea2aec6497d0d67b436bbee9bc1d225f1e :
.LI 1
Environment variable
.Ev MMHPAGER
.LI 2
Profile entry
.Pe Pager
(replaces
.Pe moreproc )
.LI 3
Environment variable
.Ev PAGER
.LI 4
Command
.Pn more .
.LP
By respecting the
.Ev VISUAL /\c
.Ev EDITOR
and
.Ev PAGER
environment variables, the new behavior complies with the common style on Unix systems.
It is more uniform and clearer for users.
.U3 "ndbm
.P
.Pn slocal
used to depend on the database library
.I ndbm .
The database was used to store the
.Hd Message-ID
header field values of all messages delivered.
This enabled
.Pn slocal
to suppress delivering the same message to the same user twice.
This feature was enabled by the
.Sw -suppressdup
switch.
.P
Because a variety of versions of the database library exist,
.[
wolter unix incompat notes dbm
.]
complicated autoconf code was needed to detect them correctly.
Furthermore, the configure switches
.Sw --with-ndbm=ARG
and
.Sw --with-ndbmheader=ARG
were added to help with difficult setups that would not be detected automatically or would be detected incorrectly.
.P
By removing the suppress duplicates feature of
.Pn slocal ,
the dependency on
.I ndbm
vanished and 120 lines of complex autoconf code could be saved
.Ci ecd6d6a20cb7a1507e3a20d6c4cb3a1cf14c6bbf .
The change removed functionality, but that is considered minor compared to the improvement of dropping the dependency and the complex autoconf code.
.U3 "MH-E Support
.P
The configure option
.Sw --disable-mhe
was removed when the MH-E support was reworked.
MH-E is the Emacs front-end to MH.
It requires MH to provide minor additional functions.
The
.Sw --disable-mhe
configure option had switched off these extensions.
After removing the support for old versions of MH-E, only the
.Sw -build
switches of
.Pn forw
and
.Pn repl
are left to be MH-E extensions.
They are now always built in because they add little code and complexity.
In consequence, the
.Sw --disable-mhe
configure option was removed
.Ci a7ce7b4a580d77b6c2c4d980812beb589aa4c643 .
Dropping the option also removed a variant of the code base that would have needed to be tested.
This change was undertaken in January 2012 in nmh and thereafter merged into mmh.
.U3 "Masquerading
.P
The configure option
.Sw --enable-masquerade
could take up to three arguments:
.CW draft_from ,
.CW mmailid ,
and
.CW username_extension .
They activated different types of address masquerading.
All of them were implemented in the SMTP-speaking
.Pn post
command.
Address masquerading is an MTA's task and mmh does not cover this field anymore.
Hence, true masquerading needs to be implemented in the external MTA.
.P
The
.I mmailid
masquerading type is the oldest one of the three and the only one available in the original MH.
It provided a
.I username
to
.I fakeusername
mapping, based on the
.Fn passwd 's
GECOS field.
Nmh's man page
.Mp mh-tailor (5)
described the use case as being the following:
.QS
This is useful if you want the messages you send to always appear to come from the name of an MTA alias rather than your actual account name.
For instance, many organizations set up `First.Last' sendmail aliases for all users.
If this is the case, the GECOS field for each user should look like:
``First [Middle] Last <First.Last>''
.QE
.P
As mmh sends outgoing mail via the local MTA only, the best location to do such global rewrites is there.
Besides, the MTA is conceptually the right location because it does the reverse mapping for incoming mail (aliasing), too.
Furthermore, masquerading set up there is readily available for all mail software on the system.
Hence, mmailid masquerading was removed.
.Ci 0836c8000ccb34b59410ef1c15b1b7feac70ce5f
.P
The
.I username_extension
masquerading type did not replace the username but would append a suffix, specified by the
.Ev USERNAME_EXTENSION
environment variable, to it.
This provided support for the
.I user-extension
feature of qmail
.[ [
sill qmail handbook
.], p. 141]
and the similar
.I "plussed user
processing of Sendmail.
.[ [
sendmail costales
.], p. 476]
The decision to remove this username_extension masquerading was motivated by the fact that
.Pn spost
had not supported it yet.
Username extensions can still be used in mmh, but less conveniently.
.\" XXX In the format file: %(getenv USERNAME_EXTENSION)
.Ci 2abae0bfd0ad5bf898461e50aa4b466d641f23d9
.P
The
.I draft_from
masquerading type instructed
.Pn post
to use the value of the
.Hd From
header field as SMTP envelope sender.
Sender addresses could be replaced completely.
Mmh offers a kind of masquerading similar in effect, but with technical differences.
As mmh does not transfer messages itself, the local MTA has final control over the sender's address.
Any masquerading mmh introduces may be reverted by the MTA.
In times of pedantic spam checking, an MTA will take care to use sensible envelope sender addresses to keep its own reputation up.
Nonetheless, the MUA can set the
.Hd From
header field and thereby propose a sender address to the MTA.
The MTA may then decide to take that one or generate the canonical sender address for use as envelope sender address.
.Ci b14ea6073f77b4359aaf3fddd0e105989db9
.P
In mmh, the MTA will always extract the recipients and sender from the message header (\c
.Pn sendmail 's
.Sw -t
switch).
The
.Hd From
header field of the draft may be set arbitrarily by the user.
If it is missing, the canonical sender address will be generated by the MTA.
.U3 "Remaining Options
.P
Two configure options remain in mmh.
One of them is the file locking method to use:
.Sw --with-locking=[dot|fcntl|flock|lockf] .
The idea of removing all methods except the portable
.I "dot locking
and having that one as the default is appealing, but this change requires deeper technical investigation into the topic.
The other remaining option,
.Sw --enable-debug ,
compiles the programs with debugging symbols.
This option is likely to stay.
.H2 "Command Line Switches
.P
The command line switches of MH tools are similar in style to the switches in the X Window System.
They consist of a single dash (`\fL-\fP') followed by a word.
For example
.Cl -truncate .
To ease typing, the switch can be abbreviated, provided the remaining prefix is unambiguous.
If no other switch starts with the letter `t', then
.Cl "-truncate" ,
.Cl "-trunc" ,
.Cl "-tr" ,
and
.Cl "-t
are all equivalent.
As a result, switches can neither be grouped (as in
.Cl "ls -ltr" )
nor can switch arguments be appended directly to the switch itself (as in
.Cl "sendmail -q30m" ).
Many switches have negating counterparts, which start with `no'.
For example
.Cl "-notruncate
inverts the
.Cl "-truncate
switch.
They exist to override the effect of default switches in the profile.
Every program in mmh has two generic switches: .Sw -help , to print a short message on how to use the program, and .Sw -Version (with capital `V'), to tell what version of mmh the program belongs to. .P Switches change the behavior of programs. Programs that do one thing in one way require no switches. In most cases, doing something in exactly one way is too limiting. If one task should be accomplished in various ways, switches are a good approach to alter the behavior of a program. Changing the behavior of programs provides flexibility and customization to users, but at the same time it complicates the code, the documentation, and the usage of the program. Therefore, the number of switches should be kept small. A small set of well-chosen switches is best. Usually, the number of switches increases over time. Already in 1985, Rose and Romine identified this as a major problem of MH: .[ [ rose romine real work .], p. 12] .QS A complaint often heard about systems which undergo substantial development by many people over a number of years, is that more and more options are introduced which add little to the functionality but greatly increase the amount of information a user needs to know in order to get useful work done. This is usually referred to as creeping featurism. .QP Unfortunately MH, having undergone six years of off-and-on development by ten or so well-meaning programmers (the present authors included), suffers mightily from this. .QE .P Being reluctant to add new switches (or \fIoptions\fP, as Rose and Romine call them) is one part of a counter-action; the other part is removing hardly used switches. Nmh's tools have lots of switches already implemented. Hence, cleaning up by removing some of them was the more important part of the counter-action. Removing existing functionality is always difficult because it breaks programs that use these functions.
Also, for every obsolete feature, there'll always be someone who still uses it and thus opposes its removal. This puts the developer in a position where sensible improvements to style are regarded as destructive acts. Yet, living with the featurism is far worse, in my eyes, because future needs will demand adding further features, worsening the situation more and more. Rose and Romine added in a footnote, ``[...] .Pn send will no doubt acquire an endless number of switches in the years to come'' .[ [ rose romine real work .], p. 12]. Although clearly humorous, the comment points to the nature of the problem. Refusing to add any new switches would tackle the problem at its root, but this is not practical. New needs will require new switches and it would be unwise to block them strictly. Nevertheless, removing obsolete switches still is an effective approach to deal with the problem. Working on an experimental branch without an established user base eased my work because I did not offend users when I removed existing functions. .P Rose and Romine counted 24 visible and 9 more hidden switches for .Pn send . In nmh, they increased up to 32 visible and 12 hidden ones. At the time of writing, no more than 4 visible switches and 1 hidden switch have remained in mmh's .Pn send . These numbers include the two generic switches, .Sw -help and .Sw -Version . .P Hidden switches are ones not documented. In mmh, 12 tools have hidden switches. 9 of them are .Sw -debug switches, the other 6 provide special interfaces for internal use. .P The following figure displays the number of switches for each of the tools that are available in both nmh and mmh. The tools are sorted by the number of switches they had in nmh. Both visible and hidden switches were counted, but not the generic help and version switches. Whereas in the beginning of the project, the average tool had 11 switches, now it has no more than 5 \(en only half as many.
If the `no' switches and similar inverse variants are folded onto their counter-parts, the average tool had 8 switches in pre-mmh times and has 4 now. The total number of functional switches in mmh dropped from 465 to 233. .sp .KS .in 1c .so input/switches.grap .KE .sp .P .ZZ A part of the switches vanished after functions were removed. This was the case for network mail transfer, for instance. Sometimes, however, the work flow was the other way: I looked through the .Mp mh-chart (7) man page to identify the tools with apparently too many switches. Then I assessed the benefit of each switch by examining the tool's man page and source code, aided by literature research and testing. .U3 "Draft Folder Facility .P A change early in the project was the complete transition from the single draft message to the draft folder facility .Ci 337338b404931f06f0db2119c9e145e8ca5a9860 (cf. Sec. .Cf draft-folder ). The draft folder facility was introduced in the mid-eighties, when Rose and Romine called it a ``relatively new feature''. .[ rose romine real work .] Since then, the facility had been included but inactive by default. By making it permanently active and by related rework of the tools, the .Sw -[no]draftfolder , and .Sw -draftmessage switches could be removed from .Pn comp , .Pn repl , .Pn forw , .Pn dist , .Pn whatnow , and .Pn send .Ci 337338b404931f06f0db2119c9e145e8ca5a9860 . The only flexibility lost with this change is having multiple draft folders within one profile. I consider this a theoretical problem only. At the same time, the .Sw -draft switch of .Pn anno , .Pn refile , and .Pn send was removed. The special treatment of \fIthe\fP draft message became irrelevant after the rework of the draft system (cf. Sec. .Cf draft-folder ). .U3 "In Place Editing .P .Pn anno had the switches .Sw -[no]inplace to either annotate the message in place and thus preserve hard links, or annotate a copy to replace the original message. The latter approach broke hard links.
Following the assumption that linked messages should truly be the same message and annotating it should not break the link, the .Sw -[no]inplace switches were removed and the previous default .Sw -inplace was made the definitive behavior .Ci c8195849d2e366c569271abb0f5f60f4ebf0b4d0 . The .Sw -[no]inplace switches of .Pn repl , .Pn forw , and .Pn dist could be removed, as well, as they were simply passed through to .Pn anno . .P .Pn burst also had .Sw -[no]inplace switches, but with a different meaning. With .Sw -inplace , the digest had been replaced by the table of contents (i.e. the introduction text) and the burst messages were placed right after this message, renumbering all following messages. Also, any trailing text of the digest was lost, though, in practice, it usually consists of an end-of-digest marker only. Nonetheless, this behavior appeared less elegant than the .Sw -noinplace behavior, which already had been the default. Nmh's .Mp burst (1) man page reads: .QS If .Sw -noinplace is given, each digest is preserved, no table of contents is produced, and the messages contained within the digest are placed at the end of the folder. Other messages are not tampered with in any way. .QE .LP The decision to drop the .Sw -inplace behavior was supported by the code complexity and the possible data loss it caused. .Sw -noinplace was chosen to be the definitive behavior. .Ci 68a686adeb39223a5e1ad35e4a24890ec053679d .U3 "Forms and Format Strings .P Historically, the tools that had .Sw -form switches to supply a form file had .Sw -format switches as well to supply the contents of a form file as a string on the command line directly. In consequence, the following two lines were equivalent: .VS scan -form scan.mailx scan -format "`cat /path/to/scan.mailx`" VE The .Sw -format switches were dropped in favor of extending the .Sw -form switches .Ci f51956be123db66b00138f80464d06f030dbb88d .
If their argument starts with an equal sign (`\fL=\fP'), then the rest of the argument is taken as a format string, otherwise the argument is treated as the name of a format file. Thus, now the following two lines are equivalent: .VS scan -form scan.mailx scan -form "=`cat /path/to/scan.mailx`" VE This rework removed the prefix collision between .Sw -form and .Sw -format . Typing `\fL-fo\fP' is sufficient to specify form file or format string. .P The different meaning of .Sw -format for .Pn forw and .Pn repl was removed in mmh. .Pn forw was completely switched to MIME-type forwarding, thus removing the .Sw -[no]format switches .Ci 6e271608b7b9c23771523f88d23a4d3593010cf1 . For .Pn repl , the .Sw -[no]format switches were reworked to .Sw -[no]filter switches .Ci 67411b1f95d6ec987b4c732459e1ba8a8ac192c6 . The .Sw -format switches of .Pn send and .Pn post , which had a third meaning, were removed likewise .Ci f3cb7cde0e6f10451b6848678d95860d512224b9 . Ultimately, the ambiguity of the .Sw -format switches is resolved by not having such switches anymore in mmh. .U3 "MIME Tools .P The MIME tools, which once were part of .Pn mhn (whatever that stood for), had several switches that added little practical value to the programs. The .Sw -[no]realsize switches of .Pn mhbuild and .Pn mhlist were removed .Ci 8d8f1c3abc586c005c904e52c4adbfe694d2201c . Real size calculations are done always now because nmh's .Mp mhbuild (1) man page states that ``This provides an accurate count at the expense of a small delay'' with the small delay not being noticeable on modern systems. .P The .Sw -[no]check switches were removed together with the support for .Hd Content-MD5 header fields [RFC\|1864] (cf. Sec. .Cf content-md5 ) .Ci 31dc797eb5178970d68962ca8939da3fd9a8efda . .P The .Sw -[no]ebcdicsafe and .Sw -[no]rfc934mode switches of .Pn mhbuild were removed because they are considered obsolete .Ci 01a3480928da485b4d6109d36d751dfa71799d58 .Ci 3363e2624dce0eb8164cf8b3f1ab385c8ff72e88 .
.P Content caching of external MIME parts, activated with the .Sw -rcache and .Sw -wcache switches, was completely removed .Ci d1fefd9f614e4dc3cda16da6c69133c1b2005269 . External MIME parts are rare today; having a caching facility for them appears to be unnecessary. .P In pre-MIME times, .Pn mhl had covered many tasks that are part of MIME handling today. Therefore, .Pn mhl could be simplified to a large extent, reducing the number of its switches from 21 to 6 .Ci 350ad6d3542a07639213cf2a4fe524e829c1e7b6 .Ci 0e46503be3c855bddaeae3843e1b659279c35d70 . .U3 "Header Printing .P .Pn folder 's data output is self-explanatory enough that displaying the header line makes little sense. Hence, the .Sw -[no]header switch was removed and headers are never printed .Ci 601cc73d1fa05ce96faa728f036d6c51b91701c7 . .P In .Pn mhlist , the .Sw -[no]header switches were removed, as well .Ci b24f96523aaf60e44e04a3ffb1d22e69a13a602f . In this case, the headers are printed always because the output is not self-explanatory. .P .Pn scan also had .Sw -[no]header switches. Printing this header had been sensible until the introduction of format strings made it impossible to display column headings. Only the folder name and the current date remained to be printed. As this information can be perfectly generated with .Pn folder and .Pn date , the switches were removed .Ci c477dc5d1d03fa6d9a8ab3dd3508c63cbddc044e . .P By removing all .Sw -header switches, the collision with .Sw -help on the first two letters was resolved. Currently, .Sw -h evaluates to .Sw -help for all tools of mmh. .U3 "Suppressing Edits or the Invocation of the WhatNow Shell .P The .Sw -noedit switch of .Pn comp , .Pn repl , .Pn forw , .Pn dist , and .Pn whatnow was removed and replaced by the ability to specify .Sw -editor with an empty argument .Ci 75fca31a5b9d5c1a99c74ab14c94438d8852fba9 . (Using .Cl "-editor /bin/true is nearly the same. It differs only in setting the previous editor.)
.P The more important change is the removal of the .Sw -nowhatnowproc switch .Ci ee4f43cf2ef0084ec698e4e87159a94c01940622 . This switch had once introduced an awkward behavior, as explained in nmh's man page for .Mp comp (1): .QS The .Sw -editor .Ar editor switch indicates the editor to use for the initial edit. Upon exiting from the editor, .Pn comp will invoke the .Pn whatnow program. See .Mp whatnow (1) for a discussion of available options. The invocation of this program can be inhibited by using the .Sw -nowhatnowproc switch. (In truth of fact, it is the .Pn whatnow program which starts the initial edit. Hence, .Sw \%-nowhatnowproc will prevent any edit from occurring.) .QE .P Effectively, the .Sw -nowhatnowproc switch caused only a draft message to be created. As .Cl "-whatnowproc /bin/true does the same, the .Sw -nowhatnowproc switch was removed for being redundant. .U3 "Various .BU With the removal of MMDF maildrop format support, .Pn packf and .Pn rcvpack no longer needed the .Sw -mbox and .Sw -mmdf switches. The behavior of .Sw -mbox became the definitive behavior .Ci 3916ab66ad5d183705ac12357621ea8661afd3c0 . Further rework in both tools made the .Sw -file switch unnecessary .Ci ca1023716d4c2ab890696f3e41fa0d94267a940e . .BU Mmh's tools no longer clear the screen (\c .Pn scan 's and .Pn mhl 's .Sw -[no]clear switches .Ci e57b17343dcb3ff373ef4dd089fbe778f0c7c270 .Ci 943765e7ac5693ae177fd8d2b5a2440e53ce816e ). The message formatting tool .Pn mhl neither rings the bell (\c .Sw -[no]bell .Ci e11983f44e59d8de236affa5b0d0d3067c192e24 ) nor pages the output itself (\c .Sw -length .Ci 5b9d883db0318ed2b84bb82dee880d7381f99188 ) anymore. Generally, the pager to use is no longer specified with the .Sw -[no]moreproc command line switches for .Pn mhl and .Pn show /\c .Pn mhshow .Ci 39e87a75b5c2d3572ec72e717720b44af291e88a .
.BU In order to avoid prefix collisions among switch names, the .Sw -version switch was renamed to .Sw -Version (with capital `V') .Ci 32b2354dbaf4bf934936eb5b102a4a3d2fdd209a . Every program has the .Sw -version switch but its first three letters collided with the .Sw -verbose switch, present in many programs. The rename solved this problem once and for all. Although this rename breaks a basic interface, having the .Sw -V abbreviation to display the version information isn't all too bad. .BU .Sw -[no]preserve of .Pn refile was removed .Ci 8edc5aaf86f9f77124664f6801bc6c6cdf258173 because what use was it anyway? Quoting nmh's man page .Mp refile (1): .QS Normally when a message is refiled, for each destination folder it is assigned the number which is one above the current highest message number in that folder. Use of the .Sw -preserv [sic!] switch will override this message renaming, and try to preserve the number of the message. If a conflict for a particular folder occurs when using the .Sw -preserve switch, then .Pn refile will use the next available message number which is above the message number you wish to preserve. .QE .BU The removal of the .Sw -[no]reverse switches of .Pn scan .Ci 8edc5aaf86f9f77124664f6801bc6c6cdf258173 is a bug fix. This is supported by the comments ``\-[no]reverse under #ifdef BERK (I really HATE this)'' by Rose and ``Lists messages in reverse order with the `\-reverse' switch. This should be considered a bug'' by Romine in the changelogs. The question remains why neither Rose nor Romine fixed this bug in the eighties when they wrote these comments. .\" -------------------------------------------------------------- .H1 "Modernizing .P In the more than thirty years of MH's existence, its code base was increasingly extended. New features entered the project and became alternatives to the existing behavior. Relics from several decades have gathered in the code base but obsolete features were seldom dropped.
This section describes the removal of old code and the modernization of the default setup. It focuses on the functional aspect only; the non-functional aspects of code style are discussed in Sec. .Cf code-style . .H2 "Code Relics .P My position regarding the removal of obsolete code is much more revolutionary than the nmh community appreciates. Working on an experimental version, I was able to quickly drop functionality that I considered ancient. The need for consensus with peers would have slowed this process down. Without the need to justify my decisions, I was able to rush forward. .P In December 2011, Paul Vixie motivated the nmh developers to just do the work: .[ paul vixie edginess nmh-workers .] .QS let's stop walking on egg shells with this code base. there's no need to discuss whether to keep using vfork, just note in [sic!] passing, [...] we don't need a separate branch for removing vmh or ridding ourselves of #ifdef's or removing posix replacement functions or depending on pure ansi/posix ``libc''. .QP these things should each be a day or two of work and the ``main branch'' should just be modern. [...] let's push forward, aggressively. .QE .LP I did so already in the months before. I pushed forward. I simply dropped the cruft. .P The decision to drop a feature was based on literature research and careful thinking, but whether I had ever come into contact with the particular feature in my own computing life served as a rule of thumb. I explained my reasons in the commit messages in the version control system. Hence, others can comprehend my view and argue for undoing the change if I have missed an important aspect. I was quick in dropping parts. I would rather re-include wrongly dropped parts than proceed at a slower pace. Mmh is experimental work; it requires tough decisions. .U3 "Process Forking .P Being a tool chest, MH creates many processes.
In earlier times, .Fu fork() had been an expensive system call, because the process's image needed to be completely duplicated at once. This expensive work was especially unnecessary in the commonly occurring case wherein the image is replaced by a call to .Fu exec() right after having forked the child process. The .Fu vfork() system call was invented to speed up this particular case. It completely omits the duplication of the image. On old systems this resulted in significant speed ups. Therefore MH used .Fu vfork() whenever possible. .P Modern memory management units support copy-on-write semantics, which make .Fu fork() almost as fast as .Fu vfork() . The man page of .Mp vfork (2) in FreeBSD 8.0 states: .QS This system call will be eliminated when proper system sharing mechanisms are implemented. Users should not depend on the memory sharing semantics of vfork() as it will, in that case, be made synonymous to fork(2). .QE .LP Vixie supports the removal with the note that ``the last system on which fork was so slow that an mh user would notice it, was Eunice. that was 1987''. .[ nmh-workers vixie edginess .] I replaced all calls to .Fu vfork() with calls to .Fu fork() .Ci 40821f5c1316e9205a08375e7075909cc9968e7d . .P Related to the costs of .Fu fork() is the probability of its success. In the eighties, on heavily loaded systems, calls to .Fu fork() were prone to failure. Hence, many of the .Fu fork() calls in the code were wrapped into loops to retry the .Fu fork() several times, to increase the chances of succeeding eventually. On modern systems, a failing .Fu fork() call is unusual. Hence, in the rare case when .Fu fork() fails, mmh programs simply abort .Ci 5fbf37ee68e018998ada61eeab73e035b26834b6 . .U3 "Header Fields .BU The .Hd Encrypted header field was introduced by RFC\|822, but already marked as legacy in RFC\|2822. Today, OpenPGP provides the basis for standardized exchange of encrypted messages [RFC\|4880, RFC\|3156].
Hence, the support for .Hd Encrypted header fields is removed in mmh .Ci 064527f7b57ab050e5af13e15ad99aeeab125857 . .BU The native support for .Hd Face header fields has been removed, as well .Ci 8e5be81f784682822f5e868c1bf3c8624682bd23 . This feature is similar to the .Hd X-Face header field in its intent, but takes a different approach to store the image. Instead of encoding the image data directly into the header field, it contains the hostname and UDP port where the image data can be retrieved. There is even a third Face system, which is the successor of .Hd X-Face , although it re-uses the .Hd Face header field name. It was invented in 2005 and supports colored PNG images. None of the Face systems described here is popular today. Hence, mmh has no direct support for them. .BU .Id content-md5 The .Hd Content-MD5 header field was introduced by RFC\|1864. It provides detection of data corruption during the transfer. But it cannot ensure verbatim end-to-end delivery of the contents [RFC\|1864]. The proper approach to verify content integrity in an end-to-end relationship is the use of digital signatures [RFC\|4880]. On the other hand, transfer protocols should detect corruption during the transmission. TCP includes a checksum field for this purpose. These two approaches in combination render the .Hd Content-MD5 header field superfluous. Not a single one out of 4\|200 messages from two decades in the nmh-workers mailing list archive .[ nmh-workers mailing list archive website .] contained a .Hd Content-MD5 header field. Neither did any of the 60\|000 messages in my personal mail storage. Removing the support for this header field .Ci 31dc797eb5178970d68962ca8939da3fd9a8efda , removed the last place where MD5 computation was needed. Hence, the MD5 code could be removed as well. Over 500 lines of code vanished by this one change.
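.P For illustration, the value that a Content-MD5 header field carries can be computed in a few lines: per RFC 1864 it is the base64 encoding of the 128-bit MD5 digest of the content. The following Python sketch uses only the standard library. The function names are hypothetical, and the canonicalization of line endings that RFC 1864 requires before hashing is deliberately left out, so the body is assumed to be in canonical form already.

```python
import base64
import hashlib


def content_md5(body: bytes) -> str:
    """Return the Content-MD5 value for a body assumed to be in canonical form."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")


def is_intact(body: bytes, header_value: str) -> bool:
    """Compare a received body against the value of its Content-MD5 field."""
    return content_md5(body) == header_value.strip()


original = b"Here it is.\r\n"
value = content_md5(original)
print("Content-MD5:", value)
print("unchanged body accepted:", is_intact(original, value))
print("altered body accepted:", is_intact(b"Here it was.\r\n", value))
```

Such a check catches accidental corruption only. Whoever alters the body on purpose can recompute the header field as well, which is why digital signatures are the proper means for end-to-end integrity.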
.U3 "MMDF maildrop support .P This type of maildrop format is conceptually similar to the mbox format, but uses a different message delimiter (`\fL\\1\\1\\1\\1\fP', commonly written as `\fL^A^A^A^A\fP', instead of `\fLFrom\0\fP'). Mbox is the de-facto standard maildrop format on Unix, whereas the MMDF maildrop format is now forgotten. Mbox remains as the only packed mailbox format supported in mmh. .P The simplifications within the code were moderate. Mainly, the reading and writing of MMDF mailbox files was removed. But also, switches of .Pn packf and .Pn rcvpack could be removed .Ci 3916ab66ad5d183705ac12357621ea8661afd3c0 . In the message parsing function .Fn sbr/m_getfld.c , knowledge of MMDF packed mailboxes was removed .Ci 684ec30d81e1223a282764452f4902ed4ad1c754 . Further code structure simplifications may be possible there, because only one single packed mailbox format is left to be supported. I have not worked on them yet because .Fu m_getfld() is heavily optimized and thus dangerous to touch. The risk of damaging the intricate workings of the optimized code is too high. .U3 "Prompter's Control Keys .P The program .Pn prompter queries the user to fill in a message form. When used as .Cl "comp -editor prompter" , the resulting behavior is similar to .Pn mailx . Apparently, .Pn prompter had not been touched lately. Otherwise, it is hard to explain why it still offered the switches .Sw -erase .Ar chr and .Sw -kill .Ar chr to name the characters for command line editing. The times when this had been necessary are long gone. Today these things work out-of-the-box, and if not, are configured with the standard tool .Pn stty . The switches are removed now .Ci 0bd9750710cdbab80cfb4036dd87af20afe1552f . .U3 "Hardcopy Terminal Support .P More of a funny anecdote is a check for being connected to a hardcopy terminal. It remained in the code until spring 2012, when I finally removed it .Ci b7764c4a6b71d37918a97594d866258f154017ca .
The check only prevented a pager from being placed between the printing program (\c .Pn mhl ) and the terminal. In nmh, this could have been ensured statically with the .Sw -nomoreproc switch at the command line, too. In mmh, setting the profile entry .Pe Pager or the environment variable .Ev PAGER to .Pn cat is sufficient. .H2 "Attachments .P The mental model of email attachments is unrelated to MIME. Although the MIME RFCs [RFC\|2045\(en2049] define the technical requirements for having attachments, they do not mention the term. Instead of attachments, MIME talks about ``multi-part message bodies'' [RFC\|2045], a more general concept. Multi-part messages are messages ``in which one or more different sets of data are combined in a single body'' [RFC\|2046]. MIME keeps its descriptions generic; it does not imply specific usage models. Today, one usage model is prevalent: attachments. The idea is having a main text document with files of arbitrary kind attached to it. In MIME terms, this is a multi-part message having a text part first and parts of arbitrary type following. .P .ZZ MH's MIME support is a direct implementation of the RFCs. The perception of the topic described in the RFCs is clearly visible in MH's implementation. As a result, MH had all the MIME features but no idea of attachments. But users do not need all the MIME features, they want convenient attachment handling. .U3 "Composing MIME Messages .P In order to improve the situation on the message composing side, Jon Steinhart had added an attachment system to nmh in 2002 .Ci 7480dbc14bc90f2d872d434205c0784704213252 . In the file .Fn docs/README-ATTACHMENTS , he described his motivation to do so: .QS Although nmh contains the necessary functionality for MIME message handing [sic!], the interface to this functionality is pretty obtuse. There's no way that I'm ever going to convince my partner to write .Pn mhbuild composition files! .QE .LP With this change, the mental model of attachments entered nmh.
In the same document: .QS These changes simplify the task of managing attachments on draft files. They allow attachments to be added, listed, and deleted. MIME messages are automatically created when drafts with attachments are sent. .QE .LP Unfortunately, the attachment system, like every new facility in nmh, was inactive by default. .P During my time in Argentina, I tried to improve the attachment system. But, after long discussions my patch died as a proposal on the mailing list because of great opposition in the nmh community. .[ nmh-workers attachment proposal .] In January 2012, I extended the patch and applied it to mmh .Ci 8ff284ff9167eff8f5349481529332d59ed913b1 . In mmh, the attachment system is active by default. Instead of command line switches, the .Pe Attachment-Header profile entry is used to specify the name of the attachment header field. It is pre-defined to .Hd Attach . .P To add an attachment to a draft, a header line needs to be added: .VS To: bob Subject: The file you wanted Attach: /path/to/the/file-bob-wanted -------- Here it is. VE The header field can be added to the draft manually in the editor, or by using the `attach' command at the WhatNow prompt, or non-interactively with .Pn anno : .VS anno -append -nodate -component Attach -text /path/to/attachment VE Drafts with attachment headers are converted to MIME automatically by .Pn send . The conversion to MIME is invisible to the user. The draft stored in the draft folder is always in source form with attachment headers. If the MIMEification fails (e.g. because the file to attach is not accessible) the original draft is not changed. .P The attachment system handles the forwarding of messages, too. If the attachment header value starts with a plus character (`\fL+\fP'), like in .Cl "Attach: +bob 30 42" , the given messages in the specified folder will be attached. This allowed .Pn forw to be simplified .Ci f41f04cf4ceca7355232cf7413e59afafccc9550 .
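.P The effect of the conversion can be illustrated with a rough sketch. The following Python fragment is not how mmh implements it; it merely mimics the visible result with the standard email package. The function names split_draft and mimeify are hypothetical, the suffix-based type guessing via the mimetypes module is a stand-in, and attachment values starting with a plus character (messages from a folder) are not covered.

```python
import mimetypes
import tempfile
from email.message import EmailMessage
from pathlib import Path


def split_draft(text):
    """Split a draft into header fields and body at the first blank or dashes-only line."""
    fields, lines = [], text.splitlines()
    for i, line in enumerate(lines):
        if line == "" or set(line) == {"-"}:
            return fields, "\n".join(lines[i + 1:]) + "\n"
        if line[0] in " \t" and fields:      # continuation of the previous field
            name, value = fields[-1]
            fields[-1] = (name, value + " " + line.strip())
        else:
            name, _, value = line.partition(":")
            fields.append((name.strip(), value.strip()))
    return fields, ""


def mimeify(draft_text, attach_header="Attach"):
    """Build a MIME message: the text body first, then one part per attachment header."""
    fields, body = split_draft(draft_text)
    msg = EmailMessage()
    attachments = []
    for name, value in fields:
        if name.lower() == attach_header.lower():
            attachments.append(value)        # consumed here, not sent as a header
        else:
            msg[name] = value
    msg.set_content(body)
    for value in attachments:
        if value.startswith("+"):
            raise NotImplementedError("attaching messages of a folder is not sketched")
        path = Path(value)
        ctype, _ = mimetypes.guess_type(path.name)
        maintype, _, subtype = (ctype or "application/octet-stream").partition("/")
        # A missing file raises here, before anything has been written anywhere.
        msg.add_attachment(path.read_bytes(), maintype=maintype,
                           subtype=subtype, filename=path.name)
    return msg


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        wanted = Path(tmp) / "file-bob-wanted.txt"
        wanted.write_text("some content\n")
        draft = ("To: bob\nSubject: The file you wanted\n"
                 "Attach: %s\n--------\nHere it is.\n" % wanted)
        print(mimeify(draft).as_string())
```

The sketch returns a new message object and never modifies its input, which mirrors the property that the draft remains in source form and is left unchanged if the conversion fails.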
.P Closely related to attachments is non-ASCII text content, because it requires MIME as well. In nmh, the user needed to call `mime' at the WhatNow prompt to have the draft converted to MIME. This was necessary whenever the draft contained non-ASCII characters. If the user did not call `mime', a broken message would be sent. Therefore, the .Pe automimeproc profile entry could be specified to have the `mime' command invoked automatically each time. Unfortunately, this approach conflicted with the attachment system because the draft would already be in MIME format at the time when the attachment system wanted to MIMEify it. To use nmh's attachment system, `mime' must not be called at the WhatNow prompt and .Pe automimeproc must not be set in the profile. But then the case of non-ASCII text without attachment headers was not caught. All in all, the solution was complex and irritating. My patch from December 2010 .[ nmh-workers attachment proposal .] would have simplified the situation. .P Mmh's current solution goes even further. Any necessary MIMEification is done automatically. There is no `mime' command at the WhatNow prompt anymore. The draft will be converted automatically to MIME when either an attachment header or non-ASCII text is present. Furthermore, the hash character (`\fL#\fP') is not special any more at line beginnings in the draft message. Users need not concern themselves with the whole topic at all. The approach taken in mmh is tailored towards today's most common case: a text part, possibly with attachments. This case was simplified. .P Although the new approach no longer supports arbitrary MIME compositions directly, the full power of .Pn mhbuild can still be accessed. Provided that no attachment headers are included, users can create .Pn mhbuild composition drafts like in nmh. Then, at the WhatNow prompt, they can invoke .Cl "edit mhbuild to convert the draft to MIME.
Because the resulting draft neither contains non-ASCII characters nor has attachment headers, the attachment system will not touch it. .U3 "MIME Type Guessing .P From the programmer's point of view, the use of .Pn mhbuild composition drafts had one notable advantage over attachment headers: The user provides the appropriate MIME types for files to include. The new attachment system needs to find out the correct MIME type itself. This is a difficult task. Determining the correct MIME type of content is partly mechanical, partly intelligent work. Forcing the user to find out the correct MIME type forces him to do partly mechanical work. Letting the computer do the work can lead to bad choices for difficult content. For mmh, the latter option was chosen to spare the user the work .Ci 3baec236a39c5c89a9bda8dbd988d643a21decc6 . .P Determining the MIME type by the suffix of the file name is a dumb approach, yet it is simple to implement and provides good results for the common cases. If no MIME type can be determined, text content is sent as `text/plain', anything else under the generic fall-back type `application/octet-stream'. Mmh implements this approach in the .Pn print-mimetype script .Ci 4b5944268ea0da7bb30598a27857304758ea9b44 . .P A far better, though less portable, approach is the use of .Pn file . This standard tool tries to determine the type of files. Unfortunately, its capabilities and accuracy vary from system to system. Additionally, its output was only intended for human beings, but not to be used by programs. Nevertheless, modern versions of GNU .Pn file , which are prevalent on the popular GNU/Linux systems, provide MIME type output in machine-readable form. Although this solution is system-dependent, it solves the difficult problem well. On systems where GNU .Pn file , version 5.04 or higher, is available it should be used.
One needs to specify the following profile entry to do so: .VS Mime-Type-Query: file -b --mime VE .LP Other versions of .Pn file might possibly be usable with wrapper scripts that reformat the output. The diversity among .Pn file implementations is great; one needs to check the local variant. .P It is not possible in mmh to override the automatic MIME type guessing for a specific file. To do so, either the user would need to know in advance for which file the automatic guessing fails or the system would require interaction. I consider both cases impractical. The existing solution should be sufficient. If not, the user may always fall back to .Pn mhbuild composition drafts and bypass the attachment system. .U3 "Storing Attachments .P Extracting MIME parts of a message and storing them to disk is performed by .Pn mhstore . The program has two operation modes, .Sw -auto and .Sw -noauto . With the former one, each part is stored under the filename given in the MIME part's meta information, if available. This naming information is usually available for modern attachments. If no filename is available, this MIME part is stored as if .Sw -noauto had been specified. In the .Sw -noauto mode, the parts are processed according to the rules that are defined by .Pe mhstore-store-* profile entries. These rules define generic filename templates for storing or commands to post-process the contents in arbitrary ways. If no matching rule is available the part is stored under a generic filename, built from message number, MIME part number, and MIME type. .P The .Sw -noauto mode had been the default in nmh because it was considered safe, in contrast to the .Sw -auto mode. In mmh, .Sw -auto is not dangerous anymore. Two changes were necessary: .LI 1 Any directory path is removed from the proposed filename. Thus, the files are always stored in the expected directory. .Ci 41b6eadbcecf63c9a66aa5e582011987494abefb .LI 2 Tar files are not extracted automatically any more.
Thus, the rest of the file system will not be touched. .Ci 94c80042eae3383c812d9552089953f9846b1bb6 .P In mmh, the result of .Cl "mhstore -auto can be foreseen from the output of .Cl "mhlist -verbose" . Although the .Sw -noauto mode is considered to be more powerful, it is less convenient and .Sw -auto is safe now. Additionally, storing attachments under their original name is intuitive. Hence, .Sw -auto serves better as the default option .Ci 3410b680416c49a7617491af38bc1929855a331d . .P Files are stored into the directory given by the .Pe Nmh-Storage profile entry, if set, or into the current working directory, otherwise. Storing to different directories is only possible with .Pe mhstore-store-* profile entries. .P Still, existing files get overwritten silently in both modes. This can be considered a bug. Yet, every other behavior has its drawbacks, too. Refusing to replace files requires adding a .Sw -force switch. Users will likely need to invoke .Pn mhstore a second time with .Sw -force . Ultimately, only the user can decide in the specific case. This requires interaction, which I prefer to avoid if possible. Appending a unique suffix to the filename is another bad option. For now, the behavior remains as it is. .P In mmh, only MIME parts of type message are special in .Pn mhstore 's .Sw -auto mode. Instead of being stored as files on disk, message/rfc822 parts are stored as messages in the current mail folder. The same applies to message/partial, although the parts are automatically reassembled beforehand. MIME parts of type message/external-body are not automatically retrieved anymore. Instead, information on how to retrieve them is output. Not supporting this rare case saved nearly one thousand lines of code .Ci 55e1d8c654ee0f7c45b9361ce34617983b454c32 . The MIME type `application/octet-stream; type=tar' is not special anymore.
The automatic extraction of such MIME parts had been the dangerous part of the .Sw -auto mode .Ci 94c80042eae3383c812d9552089953f9846b1bb6 . .U3 "Showing MIME Messages .Id showing-mime-msgs .P The program .Pn mhshow was written to display MIME messages. It implemented the conceptual view of the MIME RFCs. Nmh's .Pn mhshow handles each MIME part independently, presenting them separately to the user. This does not match today's understanding of email attachments, where displaying a message is seen to be a single, integrated operation. Today, email messages are expected to consist of a main text part plus possibly attachments. They are no longer seen as arbitrary MIME hierarchies with information on how to display the individual parts. I adjusted .Pn mhshow 's behavior to the modern view on the topic. .P One should note that this section completely ignores the original .Pn show program, because it was not capable of displaying MIME messages and is no longer part of mmh (cf. Sec. .Cf mhshow ). Although .Pn mhshow was renamed to .Pn show in mmh, this section uses the name .Pn mhshow , in order to avoid confusion. .P In mmh, the basic idea is that .Pn mhshow should display a message in one single pager session. Therefore, .Pn mhshow invokes a pager session for all its output, whenever it prints to a terminal .Ci a4197ea6ffc5c1550e8b52d5a654bcaaaee04a4e . In consequence, .Pn mhl no longer invokes a pager .Ci 0e46503be3c855bddaeae3843e1b659279c35d70 . With .Pn mhshow replacing the original .Pn show , the output of .Pn mhl no longer goes to the terminal directly, but through .Pn mhshow . Hence, .Pn mhl does not need to invoke a pager. The one and only job of .Pn mhl is to format messages or parts of them. The only place in mmh where a pager is invoked is .Pn mhshow . .P Only text content is displayed. Other kinds of attachments are ignored.
Non-text content needs to be converted to text first by appropriate .Pe mhshow-show-* profile entries, if this is possible and wanted. A common example is PDF files. .P MIME parts are always displayed serially. The request to display the MIME type `multipart/parallel' in parallel is ignored. It is simply treated as `multipart/mixed' .Ci d0581ba306a7299113a346f9b4c46ce97bc4cef6 . This behavior could already be requested with the .Sw -serialonly switch of .Pn mhshow , which is now removed. As MIME parts are always processed exclusively, i.e. serially, the `\fL%e\fP' escape in .Pe mhshow-show-* profile entries became useless and was thus removed .Ci a20d405db09b7ccca74d3e8c57550883da49e1ae . For parallel display, the attachments need to be stored to disk first. .P To display text content in foreign charsets, it needs to be converted to the native charset. Previously, .Pe mhshow-charset-* profile entries were needed for this. In mmh, the conversion is performed automatically by piping the text through the .Pn iconv command, if necessary .Ci 2433122c20baccb10b70b49c04c6b0497b5b3b60 . Custom .Pe mhshow-show-* rules for textual content might need a .Cl "iconv -f %c %f | prefix to have the text converted to the native charset. .P Although the conversion of foreign charsets to the native one has improved, it is not consistent enough. Further work needs to be done and the basic concepts in this field need to be re-thought. Nevertheless, the default setup of mmh displays messages in foreign charsets correctly without the need to configure anything. .ig .P mhshow/mhstore: Removed support for retrieving message/external-body parts. These tools will not download the contents automatically anymore. Instead, they print the information needed to get the contents. If someone should really receive one of those rare message/external-body messages, he can do the job manually. We save nearly a thousand lines of code. That's worth it!
(The profile entry `nmh-access-ftp' and sbr/ruserpass.c for reading ~/.netrc are gone now.) .Ci 55e1d8c654ee0f7c45b9361ce34617983b454c32 .. .H2 "Signing and Encrypting .P Nmh offers no direct support for digital signatures and message encryption. This functionality needed to be added through third-party software. In mmh, the functionality is included because it is a part of modern email and is likely wanted by users of mmh. A fresh mmh installation supports signing and encrypting out-of-the-box. For this purpose, Neil Rickert's .Pn mhsign and .Pn mhpgp scripts .[ neil rickert mhsign mhpgp .] were included .Ci f45cdc98117a84f071759462c7ae212f4bc5ab2e .Ci 58cf09aa36e9f7f352a127158bbf1c5678bc6ed8 . The scripts fit well because they are lightweight and similar in style to the existing tools. Additionally, no licensing difficulties appeared as they are in the public domain. .P .Pn mhsign handles the signing and encrypting part. It comprises about 250 lines of shell code and interfaces between .Pn gnupg and the MH system. It was meant to be invoked manually at the WhatNow prompt, but in mmh, .Pn send invokes .Pn mhsign automatically .Ci c7b5e1df086bcc37ff40163ee67571f076cf6683 . Special header fields were introduced to request this action. If a draft contains the .Hd Sign header field, .Pn send will initiate the signing. The signing key is either chosen automatically or specified by the .Pe Pgpkey profile entry. .Pn send always creates signatures using the PGP/MIME standard [RFC\|3156], but by invoking .Pn mhsign manually, old-style non-MIME signatures can be created as well. To encrypt an outgoing message, the draft needs to contain an .Hd Enc header field. Public keys of all recipients are searched for in the gnupg keyring and in a file called .Fn pgpkeys , which contains exceptions and overrides. Unless public keys are found for all recipients, .Pn mhsign will refuse to encrypt the message. Currently, messages with hidden (BCC) recipients cannot be encrypted.
This work is pending because it requires a structurally more complex approach. .P .Pn mhpgp is the companion to .Pn mhsign . It verifies signatures and decrypts messages. Encrypted messages can be either temporarily decrypted and displayed or permanently decrypted and stored into the current folder. Currently, .Pn mhpgp needs to be invoked manually. The integration into .Pn show and .Pn mhstore to verify signatures and decrypt messages as needed is planned but not yet realized. .P Both scripts were written for nmh. Hence they needed to be adjusted according to the differences between nmh and mmh. For instance, they no longer use the backup prefix. Furthermore, compatibility support for old PGP features was dropped. .P The integrated message signing and encrypting support is one of the most recent features in mmh. It has not had the time to mature. User feedback and personal experience need to be accumulated to direct the further development of the facility. Already it seems to be worthwhile to consider adding .Sw -[no]sign and .Sw -[no]enc switches to .Pn send , to be able to override the corresponding header fields. A profile entry:
.VS
send: -sign
VE
would then activate signing for all outgoing messages. With the present approach, a .Hd Sign header component needs to be added to each draft template to achieve the same result. Adding the switches would ease the work greatly and keep the template files clean. .H2 "Draft and Trash Folder .P .U3 "Draft Folder .Id draft-folder .P In the beginning, MH had the concept of a draft message. This was a file named .Fn draft in the MH directory, which was treated specially. On composing a message, this draft file was used. When starting to compose another message before the former one was sent, the user had to decide among: .LI 1 Using the old draft to finish and send it before starting with a new one. .LI 2 Discarding the old draft and replacing it with a new one. .LI 3 Preserving the old draft by refiling it to a folder.
.LP Working on multiple drafts was only possible in alternation. For that, the current draft needed to be refiled to a folder and another one re-used for editing. Working on multiple drafts at the same time was impossible. The usual approach of switching to a different MH context did not help at all. .P The draft folder facility exists to allow true parallel editing of drafts, in a straightforward way. It was introduced by Marshall T. Rose as early as 1984. Similar to other new features, the draft folder was inactive by default. Even in nmh, the highly useful draft folder was not available out-of-the-box. At least, Richard Coleman added the man page .Mp mh-draft (5) to better document the feature. .P Not using the draft folder facility has the single advantage of having the draft file at a static location. This is simple in simple cases but the concept does not scale for more complex cases. The concept of the draft message is too limited for the problem it tries to solve. Therefore the draft folder was introduced. It is the more powerful and more natural concept. The draft folder is a folder like any other folder in MH. Its messages can be listed like any other messages. A draft message is no longer a special case. Tools do not need special switches to work on the draft message. Hence corner cases were removed. .P The trivial part of the work was activating the draft folder with a default name. I chose the name .Fn +drafts , for obvious reasons. In consequence, the command line switches .Sw -draftfolder and .Sw -draftmessage could be removed. More difficult, but also more rewarding, was updating the tools to the new concept. By fully switching to the draft folder, the tools could be simplified by dropping the awkward draft message handling code. .Sw -draft switches were removed because operating on a draft message is no longer special. It became indistinguishable from operating on any other message.
.Ci 337338b404931f06f0db2119c9e145e8ca5a9860 .P There is no longer any need to query the user for draft handling .Ci 2d48b455c303a807041c35e4248955f8bec59eeb . It is always possible to add another new draft. Refiling drafts is no different from refiling other messages. All of these special cases are gone. Yet, one draft-related switch remained. .Pn comp still has .Sw -[no]use for switching between two modes: .LI 1 Modifying an existing draft, with .Sw -use . .LI 2 Composing a new draft, possibly taking some existing message as template, with .Sw -nouse , the default. .ZZ .RT .sp \n(PDu In either case, the behavior of .Pn comp is deterministic. .P .Pn send now operates on the current message in the draft folder by default. As message and folder can both be overridden by specifying them on the command line, it is possible to send any message in the mail storage by simply specifying its number and folder. In contrast to the other tools, .Pn send takes the draft folder as its default folder. .P Dropping the draft message concept in favor of the draft folder concept replaced special cases with regular cases. This simplified the source code of the tools, as well as the concepts. In mmh, draft management does not break with the MH concepts but applies them. .Cl "scan +drafts" , for instance, is a truly natural request. .P Most of the work was already performed by Rose in the eighties. The original improvement of mmh is dropping the old draft message approach and thus simplifying the tools, the documentation, and the system as a whole. Although my part in the draft handling improvement was small, it was important. .U3 "Trash Folder .Id trash-folder .P Similar to the situation for drafts is the situation for removed messages. Historically, a message was ``deleted'' by prepending a specific \fIbackup prefix\fP, usually the comma character, to the file name.
The specific file would then be ignored by MH because only files whose names consist solely of digits are treated as messages. Although files remained in the file system, the messages were no longer visible in MH. To truly delete them, a maintenance job was needed. Usually a cron job was installed to delete them after a grace time. For instance:
.VS
find $HOME/Mail -type f -name ',*' -ctime +7 -delete
VE
In such a setup, the original message could be restored within the grace time interval by stripping the backup prefix from the file name \(en usually but not always. If the last message of a folder with six messages (\fL1-6\fP) was removed, message .Fn 6 became file .Fn ,6 . If then a new message entered the same folder, it would be named with the number one above the highest existing message number. In this case the message would be named .Fn 6 , reusing the number. If this new message were removed as well, the backup of the former message would be overwritten. Hence, the ability to restore removed messages depended not only on the sweeping cron job but also on the removal of further messages. It is undesirable to have such obscure and complex mechanisms. The user should be given a small set of clear assertions, such as ``Removed files are restorable within a seven-day grace time.'' With the addition ``... unless a message with the same name in the same folder is removed before.'' the statement becomes complex. A user will hardly be able to keep track of all removals to know if the assertion still holds true for a specific file. In practice, the real mechanism is unclear to the user. .P Furthermore, the backup files were scattered throughout the whole mail storage. This complicated managing them. It was possible with the help of .Pn find , but everything is more convenient if the deleted messages are collected in one place. .P The profile entry .Pe rmmproc (previously named .Pe Delete-Prog ) was introduced very early to improve the situation.
It could be set to any command, which would be executed to remove the specified messages. This overrode the default action described above. Refiling the to-be-removed files to a trash folder was the usual example. Nmh's man page .Mp rmm (1) suggests setting the .Pe rmmproc to .Cl "refile +d to move messages to the trash folder .Fn +d instead of renaming them with the backup prefix. The man page additionally proposes the expunge command .Cl "rm `mhpath +d all` to empty the trash folder. .P Removing messages in such a way has advantages: .LI 1 The mail storage is prevented from being cluttered with removed messages because they are all collected in one place. Existing and removed messages are thus separated more strictly. .LI 2 No backup files are silently overwritten. .LI 3 Most important, however, removed messages are kept in the MH domain. Messages in the trash folder can be listed like those in any other folder. Deleted messages can be displayed like any other messages. .Pn refile can restore deleted messages. All operations on deleted files are still covered by the MH tools. The trash folder is just like any other folder in the mail storage. .P Similar to the draft folder case, I dropped the old backup prefix approach in favor of the better-suited trash folder system. Hence, .Pn rmm calls .Pn refile to move the to-be-removed message to the trash folder, .Fn +trash by default. To sweep it clean, the user can use .Cl "rmm -unlink +trash a" , where the .Sw -unlink switch causes the files to be unlinked. .Ci 8edc5aaf86f9f77124664f6801bc6c6cdf258173 .Ci ca0b3e830b86700d9e5e31b1784de2bdcaf58fc5 .P Dropping the legacy approach and converting to the new approach completely simplified the code base. The relationship between .Pn rmm and .Pn refile was inverted. In mmh, .Pn rmm invokes .Pn refile . That used to be the other way round. Yet, the relationship is simpler now.
Loops, as described in nmh's man page for .Mp refile (1), can no longer occur: .QS Since .Pn refile uses your .Pe rmmproc to delete the message, the .Pe rmmproc must NOT call .Pn refile without specifying .Sw -normmproc or you will create an infinite loop. .QE .LP .Pn rmm either unlinks a message with .Fu unlink() or invokes .Pn refile to move it to the trash folder. .Pn refile does not invoke any tools. .P Generalizing message removal so that it became covered by the MH concepts made the whole system more powerful. .H2 "Modern Defaults .P Nmh has a bunch of convenience-improving features inactive by default, although one can expect every new user to want them active. The reason they are inactive by default is the wish to stay compatible with old versions. But what are old versions? Still, the highly useful draft folder facility has not been activated by default although it was introduced over twenty-five years ago. .[ rose romine real work .] The community seems not to care. .P In nmh, new users are required to first build up a profile before they can access the modern features. Without an extensive profile, the setup is hardly usable for modern emailing. The point is not the customization of the setup, but the need to activate generally useful facilities. Yet, the real problem lies less in enabling the features, as this is straightforward as soon as one knows what one wants. The real problem is that new users need deep insight into the project to discover the available but inactive features. To give an example, I needed one year of using nmh before I became aware of the existence of the attachment system. One could argue that this fact disqualifies my reading of the documentation. If I had installed nmh from source back then, I could agree. Yet, I had used a pre-packaged version and had expected that it would just work.
Although I had already been convinced by the concepts of MH and am a software developer, I still required a lot of time to discover the cool features. How can we expect users to be even more advanced than me, just to enable them to use MH in a convenient and modern way? Unless they are strongly convinced of the concepts, they will fail. I have seen friends of mine give up disappointed before they truly used the system, although they had been motivated in the beginning. New users struggle hard enough to get used to the tool chest approach; we developers should spare them further inconveniences. .P Maintaining compatibility for its own sake is bad, because the code base will collect more and more compatibility code. Sticking to the compatibility code means remaining limited, whereas adjusting to the changes renders the compatibility unnecessary. Keeping unused alternatives in the code for longer than a short grace time is a bad choice as they likely gather bugs by not being constantly tested. Also, the increased code size and the greater number of conditions increase the maintenance costs. If any MH implementation were the back-end of widespread email clients with large user bases, compatibility would be more important. Yet, it appears as if this is not the case. Hence, compatibility is hardly important for technical reasons. Rather, its importance originates from personal reasons. Nmh's user base is small and old. Changing the interfaces causes inconvenience to long-term users of MH. It forces them to change their many-year-old MH configurations. I do understand this aspect, but by sticking to the old users, new users are kept from entering the world of MH. But the future lies in new users. In consequence, mmh invites new users by providing a convenient and modern setup, readily usable out-of-the-box. .P In mmh, all modern features are active by default and many previous approaches are removed or only accessible in a manual way.
New default features include: .BU The attachment system (\c .Hd Attach ) .Ci 8ff284ff9167eff8f5349481529332d59ed913b1 . .BU The draft folder facility (\c .Fn +drafts ) .Ci 337338b404931f06f0db2119c9e145e8ca5a9860 . .BU The unseen sequence (`u') .Ci c2360569e1d8d3678e294eb7c1354cb8bf7501c1 and the sequence negation prefix (`!') .Ci db74c2bd004b2dc9bf8086a6d8bf773ac051f3cc . .BU Quoting the original message in the reply .Ci 67411b1f95d6ec987b4c732459e1ba8a8ac192c6 . .BU Forwarding messages using MIME .Ci 6e271608b7b9c23771523f88d23a4d3593010cf1 . .LP An mmh setup with a profile that defines only the path to the mail storage is already convenient to use. Again, Paul Vixie's statement supports the direction I took: ``the `main branch' should just be modern''. .[ paul vixie edginess nmh-workers .]
.\" --------------------------------------------------------------
.H1 "Styling .P Kernighan and Pike have emphasized the importance of style in the preface of \fIThe Practice of Programming\fP: .[ [ kernighan pike practice of programming .], p. x] .QS Chapter 1 discusses programming style. Good style is so important to good programming that we have chosen to cover it first. .QE This section covers changes in mmh that were guided by the desire to improve on style. Many of them follow the advice given in the quoted book. .H2 "Code Style .Id code-style .P .U3 "Indentation Style .P Indentation styles are the sacred cow of programming. Kernighan and Pike write: .[ [ kernighan pike practice of programming .], p. 10] .QS Programmers have always argued about the layout of programs, but the specific style is much less important than its consistent application. Pick one style, preferably ours, use it consistently, and don't waste time arguing. .QE .P I agree that the consistent application is most important, but I believe that some styles have advantages over others. One example is indentation with tab characters only. The number of tabs corresponds to the nesting level \(en one tab, one level.
Tab characters provide a flexible visual appearance because developers can adjust their width as preferred. There is no longer any need to check for the correct mixture of tabs and spaces. Two simple rules ensure the integrity and flexibility of the visual appearance: .LI 1 Leading whitespace must consist of tabs only. .LI 2 All other whitespace should be spaces. .LP Although reformatting existing code should be avoided, I did it. I did not waste time arguing; I just reformatted the code. .Ci a485ed478abbd599d8c9aab48934e7a26733ecb1 .U3 "Comments .P Kernighan and Pike demand: ``Don't belabor the obvious''. .[ [ kernighan pike practice of programming .], p. 23] Following the advice, I removed unnecessary comments. For instance, I removed all comments in the following code excerpt .Ci 426543622b377fc5d091455cba685e114b6df674 :
.VS
context_replace(curfolder, folder); /* update current folder */
seq_setcur(mp, mp->lowsel); /* update current message */
seq_save(mp); /* synchronize message sequences */
folder_free(mp); /* free folder/message structure */
context_save(); /* save the context file */
[...]
int c; /* current character */
char *cp; /* miscellaneous character pointer */
[...]
/* NUL-terminate the field */
*cp = '\0';
VE
.P The information in each of the comments was present in the code statements already, except for the NUL-termination, which is obvious from the context. .U3 "Names .P Regarding this topic, Kernighan and Pike suggest: ``Use active names for functions''. .[ [ kernighan pike practice of programming .], p. 4] One application of this rule was the renaming of .Fu check_charset() to .Fu is_native_charset() .Ci 8d77b48284c58c135a6b2787e721597346ab056d . The same change additionally fixed a violation of ``Be accurate'', .[ [ kernighan pike practice of programming .], p. 4] as the code did not match the expectation the function suggested. It did not compare charset names but prefixes of them only.
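.P
The difference between the two kinds of comparison can be sketched in C.
The two functions below are hypothetical illustrations and not the code of
.Fu check_charset()
or
.Fu is_native_charset() ;
they assume the POSIX functions
.Fu strcasecmp()
and
.Fu strncasecmp() .
The first one accepts every name that merely begins with the name of the
native charset, the second one demands a match over the full length:

```c
#include <assert.h>
#include <string.h>
#include <strings.h>

/*
** Hypothetical sketch: prefix comparison.
** Only the first strlen(native) bytes of name are checked.
*/
int
has_native_prefix(const char *name, const char *native)
{
	assert(name != NULL && native != NULL);
	return strncasecmp(name, native, strlen(native)) == 0;
}

/*
** Hypothetical sketch: full comparison.
** The names must match over their whole length.
*/
int
is_same_charset(const char *name, const char *native)
{
	assert(name != NULL && native != NULL);
	return strcasecmp(name, native) == 0;
}
```

Only the second function rejects charset names that extend the native
name by further characters.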
If the native charset was `ISO-8859-1', then
.VS
check_charset("ISO-8859-11", strlen("ISO-8859-11"))
VE
returned true, although the upper halves of the code pages are different. .P More important than using active names is using descriptive names.
.VS
m_unknown(in); /* the MAGIC invocation... */
VE
Renaming the obscure .Fu m_unknown() function was a delightful event, although it made the code less funny .Ci 611d68d19204d7cbf5bd585391249cb5bafca846 . .P Magic numbers are generally considered bad style. Obviously, Kernighan and Pike agree: ``Give names to magic numbers''. .[ [ kernighan pike practice of programming .], p. 19] .P The argument .CW outnum of the function .Fu scan() in .Fn uip/scansbr.c holds the number of the message to be created. It also encodes program logic with negative numbers and zero. This led to obscure code. I clarified the code by introducing two variables that extracted the hidden information:
.VS
int incing = (outnum > 0);
int ismbox = (outnum != 0);
VE
The readable names are thus used in conditions; the variable .CW outnum is used only to extract ordinary message numbers .Ci b8b075c77be7794f3ae9ff0e8cedb12b48fd139f . .P With the improved clarity, detours in the program logic of related code parts became apparent. The implementation was simplified. This possibility to improve had been invisible before .Ci aa60b0ab5e804f8befa890c0a6df0e3143ce0723 . .P The names just described were a first step, yet the situation was further improved by giving names to the magic values of .CW outnum :
.VS
#define SCN_MBOX (-1)
#define SCN_FOLD 0
VE
The two variables were updated thereafter as well:
.VS
int incing = (outnum != SCN_MBOX && outnum != SCN_FOLD);
int scanfolder = (outnum == SCN_FOLD);
VE
Furthermore, .CW ismbox was replaced by .CW scanfolder because that matched the program logic better.
.Ci 7ffb36d28e517a6f3a10272056fc127592ab1c19 .H2 "Structural Rework .P .ZZ Although the stylistic changes described so far improve the readability of the source code, all of them were changes ``in the small''. Structural changes, in contrast, affect much larger code areas. They are more difficult to accomplish but lead to larger improvements, especially as they often influence the outer shape of the tools as well. .P At the end of their chapter on style, Kernighan and Pike ask: ``But why worry about style?'' .[ [ kernighan pike practice of programming .], p. 28]. Following are two examples of structural rework that demonstrate why style is important in the first place. .U3 "Rework of \f(CWanno\fP .P Until 2002, .Pn anno had six functional command line switches: .Sw -component and .Sw -text , each with an argument, and the two pairs of flags, .Sw -[no]date and .Sw -[no]inplace . Then Jon Steinhart introduced his attachment system. In need of more advanced annotation handling, he extended .Pn anno and added five more switches: .Sw -draft , .Sw -list , .Sw \%-delete , .Sw -append , and .Sw -number , the last one taking an argument .Ci 7480dbc14bc90f2d872d434205c0784704213252 . Later, .Sw -[no]preserve was added as well .Ci d9b1d57351d104d7ec1a5621f090657dcce8cb7f . Then, the Synopsis section of the man page .Mp anno (1) read:
.VS
anno [+folder] [msgs] [-component \f(CIfield\fP] [-inplace | -noinplace]
	[-date | -nodate] [-draft] [-append] [-list] [-delete]
	[-number [\f(CInum\fP|all]] [-preserve | -nopreserve]
	[-version] [-help] [-text \f(CIbody\fP]
VE
.LP The implementation followed the same structure. Problems became visible when .Cl "anno -list -number 42 worked on the current message instead of on message number 42, and .Cl "anno -list -number l:5 did not work on the last five messages but failed with the mysterious error message: ``anno: missing argument to -list''. Yet, the invocation matched the specification in the man page.
There, the correct use of .Sw -number was defined as being .Cl "[-number [num|all]] and the textual description for the combination with .Sw -list read: .QS The .Sw -list option produces a listing of the field bodies for header fields with names matching the specified component, one per line. The listing is numbered, starting at 1, if the .Sw -number option is also used. .QE .LP The problem was manifold. Semantically, the argument to the .Sw -number switch is only necessary in combination with .Sw -delete , but not with .Sw -list . The code, however, required a numeric argument in any case. If the argument was missing or non-numeric, .Pn anno aborted with an error message that additionally had an off-by-one error. It printed the name of the switch preceding the one concerned. .P Trying to fix these problems on the surface would not have solved them. They originate from a discrepancy between the structure of the problem and the structure implemented in the program. Such structural differences can only be solved by adjusting the structure of the implementation to the structure of the problem. .P Steinhart had added the .Sw -list and .Sw -delete switches in the same way as the other switches though they are of a structurally different type. Semantically, .Sw -list and .Sw \%-delete introduce operation modes. Historically, .Pn anno had only one operation mode: adding header fields. With the extension, two more modes were added: listing and deleting header fields. The structure of the code changes did not respect this fundamental change. Neither the implementation nor the documentation clearly declared the exclusive operation modes as such. Having identified the problem, I solved it by putting structure into .Pn anno and its documentation .Ci d54c8db8bdf01e8381890f7729bc0ef4a055ea11 . .P The difference is visible in both the code and the documentation.
One instance is the following code excerpt:
.VS
int delete = -2; /* delete header element if set */
int list = 0; /* list header elements if set */
[...]
case DELETESW: /* delete annotations */
	delete = 0;
	continue;
case LISTSW: /* produce a listing */
	list = 1;
	continue;
VE
.LP which was replaced by:
.VS
static enum { MODE_ADD, MODE_DEL, MODE_LIST } mode = MODE_ADD;
[...]
case DELETESW: /* delete annotations */
	mode = MODE_DEL;
	continue;
case LISTSW: /* produce a listing */
	mode = MODE_LIST;
	continue;
VE
.LP The replacement code not only reflects the problem's structure better, it is also easier to understand. The same applies to the documentation. The man page was completely reorganized to propagate the same structure. This is already visible in the Synopsis section:
.VS
anno [+folder] [msgs] [-component \f(CIfield\fP] [-text \f(CIbody\fP]
	[-append] [-date | -nodate] [-preserve | -nopreserve]
	[-Version] [-help]
anno -delete [+folder] [msgs] [-component \f(CIfield\fP] [-text \f(CIbody\fP]
	[-number \f(CInum\fP | all] [-preserve | -nopreserve]
	[-Version] [-help]
anno -list [+folder] [msgs] [-component \f(CIfield\fP] [-number]
	[-Version] [-help]
VE
.U3 "Path Conversion .P Four kinds of path names can appear in MH: .LI 1 Absolute Unix directory paths, like .Fn /etc/passwd . .LI 2 Relative Unix directory paths, like .Fn ./foo/bar . .LI 3 Absolute MH folder paths, like .Fn +projects/mmh . .LI 4 Relative MH folder paths, like .Fn @subfolder . .LP Relative MH folder paths are hardly documented although they are useful for large mail storages. The current mail folder is specified as `\c .Fn @ ', just like the current directory is specified as `\c .Fn . '. .P To allow MH tools to understand all four notations, they need to be able to convert between them. In nmh, these path name conversion functions were located in the files .Fn sbr/path.c (``return a pathname'') and .Fn sbr/m_maildir.c (``get the path for the mail directory'').
The seven functions in the two files were documented with no more than two comments, which described obvious information. The signatures of the four exported functions did not explain their semantics: .LI 1 .CW "char *path(char *, int); .LI 2 .CW "char *pluspath(char *); .LI 3 .CW "char *m_mailpath(char *); .LI 4 .CW "char *m_maildir(char *); .P My investigations provided the following descriptions: .LI 1 The second parameter of .Fu path() defines the type as which the path given in the first parameter should be treated. Directory paths are converted to absolute directory paths. Folder paths are converted to absolute folder paths. Folder paths must not include a leading `\fL@\fP' character. Leading plus characters are preserved. The result is a pointer to newly allocated memory. .LI 2 .Fu pluspath() is a convenience wrapper for .Fu path() , to convert folder paths only. This function can not be used for directory paths. An empty string parameter causes a buffer overflow. .LI 3 .Fu m_mailpath() converts directory paths to absolute directory paths. The characters `\fL+\fP' or `\fL@\fP' at the beginning of the path name are treated literally, i.e. as the first character of a relative directory path. Hence, this function can not be used for folder paths. In any case, the result is an absolute directory path, returned as a pointer to newly allocated memory. .LI 4 .Fu m_maildir() returns the parameter unchanged if it is an absolute directory path or begins with the entry `\fL.\fP' or `\fL..\fP'. All other strings are prepended with the current working directory. Hence, this function can not be used for folder paths. The result is either an absolute directory path or a relative directory path, starting with dot or dot-dot. In contrast to the other functions, the result is a pointer to static memory. .P The situation was obscure, irritating, error-prone, and non-orthogonal. Additionally, no clear terminology was used to name the different kinds of path names.
Sometimes, the names were even misleading, such as the first argument of .Fu m_mailpath() , which was named .CW folder , although .Fu m_mailpath() could not be used with MH folder arguments. .P I clarified the path name conversion by a complete rework. First of all, the terminology needed to be defined. A path name is either in the Unix domain, then it is called \fIdirectory path\fP or it is in the MH domain, then it is called \fIfolder path\fP. The two terms need to be used with strict distinction. Second, I exploited the concept of path type indicators. By requiring every path name to start with a distinct type identifier, the conversion between the types could be fully automated. This allows the tools to accept path names of any type from the user. Therefore, it was necessary to require relative directory paths to be prefixed with a dot character. In consequence, the dot character could no longer be an alias for the current message .Ci cff0e16925e7edbd25b8b9d6d4fbdf03e0e60c01 . Third, I created three new functions to replace the previous mess: .LI 1 .Fu expandfol() converts folder paths to absolute folder paths. Directory paths are simply passed through. This function is to be used for folder paths only, thus the name. The result is a pointer to static memory. .LI 2 .Fu expanddir() converts directory paths to absolute directory paths. Folder paths are treated as relative directory paths. This function is to be used for directory paths only, thus the name. The result is a pointer to static memory. .LI 3 .Fu toabsdir() converts any type of path to an absolute directory path. This is the function of choice for path conversion. Absolute directory paths are the most general representation of a path name. The result is a pointer to static memory. .P The new functions have names that indicate their use. Two of the functions convert relative to absolute path names of the same type.
The third function converts any path name type to the most general one, the absolute directory path. All of the functions return pointers to static memory. The file .Fn sbr/path.c contains the implementation of the functions; .Fn sbr/m_maildir.c was removed. .Ci d39e2c447b0d163a5a63f480b23d06edb7a73aa0 .P Along with the path conversion rework, I also replaced .Fu getfolder(FDEF) with .Fu getdeffol() and .Fu getfolder(FCUR) with .Fu getcurfol() , which only wraps .Fu expandfol(""@"") for convenience. This code was moved from .Fn sbr/getfolder.c into .Fn sbr/path.c as well. .Ci d39e2c447b0d163a5a63f480b23d06edb7a73aa0 .P The related function .Fu etcpath() is now included in .Fn sbr/path.c , too .Ci b4c29794c12099556151d93a860ee51badae2e35 . Previously, it had been located in .Fn config/config.c . .P Now, .Fn sbr/path.c contains all path handling code. Besides being less code, its readability is highly improved. The functions follow a common style and are well documented. .H2 "Profile Reading .P The MH profile contains the configuration of a user-specific MH setup. MH tools read the profile right after starting up because it contains the location of the user's mail storage and similar settings that influence the whole setup. Furthermore, the profile contains the default switches for the tools as well. The context file is read along with the profile. .P For historic reasons, some MH tools did not read the profile and context. Among them were .Pn post /\c .Pn spost , .Pn mhmail , and .Pn slocal . The reasons why these tools ignored the profile were not clearly stated. During a discussion on the nmh-workers mailing list, David Levine posted an explanation, quoting John Romine: .[ nmh-workers levine post profile .]
.QS I asked John Romine and here's what he had to say, which agrees and provides an example that convinces me: .QS My take on this is that .Pn post should not be called by users directly, and it doesn't read the .Fn .mh_profile (only front-end UI programs read the profile). .QP For example, there can be contexts where .Pn post is called by a helper program (like `\c .Pn mhmail ') which may be run by a non-MH user. We don't want this to prompt the user to create an MH profile, etc. .QP My suggestion would be to have .Pn send pass a (hidden) `\c .Sw -fileproc .Ar proc ' option to .Pn post if needed. You could also use an environment variable (I think .Pn send /\c .Pn whatnow do this). .QE .sp \n(PDu I think that's the way to go. My personal preference is to use a command line option, not an environment variable. .QE .P To solve the problem that .Pn post does not honor the .Pe fileproc profile entry, the community roughly agreed that a switch .Sw -fileproc should be added to .Pn post to be able to pass a different fileproc. I strongly disagree with this approach because it does not solve the problem; it only removes a single symptom. The actual problem is that .Pn post does not behave as expected, though all programs should behave as expected. Clear and general concepts are a precondition for this. Thus, there should be no separation into ``front-end UI programs'' and ones that ``should not be called by users directly''. The real solution is having all MH tools read the profile. .P But the problem has a further aspect, which originates from .Pn mhmail mainly. .Pn mhmail was intended to be a replacement for .Pn mailx on systems with MH installations. In contrast to .Pn mailx , .Pn mhmail used MH's .Pn post to send the message. The idea was that using .Pn mhmail should not be influenced by whether the user had MH set up for himself or not. Therefore .Pn mhmail had not read the profile. As .Pn mhmail used .Pn post , .Pn post was not allowed to read the profile either.
This is the reason for the actual problem. Yet, this was not considered much of a problem because .Pn post was not intended to be used by users directly. To invoke .Pn post , .Pn send was used as a front-end. .Pn send read the profile and passed all relevant values on the command line to .Pn post \(en an awkward solution. .P The important insight is that .Pn mhmail is a wolf in sheep's clothing. This alien tool broke the concepts because it was treated like a normal MH tool. Instead it should have been treated according to its foreign style. .P The solution is not to prevent the tools from reading the profile but to instruct them to read a different profile. .Pn mhmail could have set up a well-defined profile and caused the following .Pn post to use this profile by exporting an environment variable. With this approach, no special cases would have been introduced and no surprises would have been caused. By writing a wrapper program to provide a clean temporary profile, the concept could have been generalized orthogonally to the whole MH tool chest. .P In mmh, the wish to have .Pn mhmail as a replacement for .Pn mailx is considered obsolete. Mmh's .Pn mhmail no longer covers this use-case .Ci d36e56e695fe1c482c7920644bfbb6386ac9edb0 . Currently, .Pn mhmail is in a transition state .Ci 32d4f9daaa70519be3072479232ff7be0500d009 . It may become a front-end to .Pn comp , which provides an alternative interface which can be more convenient in some cases. This would convert .Pn mhmail into an ordinary MH tool. If, however, this idea does not convince, then .Pn mhmail will be removed. .P .ZZ -1 In the mmh tool chest, every program reads the profile. (\c .Pn slocal is not considered part of the mmh tool chest (cf. Sec. .Cf slocal ).) Mmh has no .Pn post program, but it has .Pn spost , which now does read the profile .Ci 3e017a7abbdf69bf0dff7a4073275961eda1ded8 . Following this change, .Pn send and .Pn spost can be considered for merging.
Besides .Pn send , .Pn spost is only invoked directly by the to-be-changed .Pn mhmail implementation and by .Pn rcvdist , which requires rework anyway. .P Jeffrey Honig quoted Marshall T. Rose explaining the decision that .Pn post ignores the profile: .[ nmh-workers honig post profile .] .QS when you run mh commands in a script, you want all the defaults to be what the man page says. when you run a command by hand, then you want your own defaults... .QE .LP The explanation neither matches the problem concerned exactly nor is the interpretation clear. If the described desire addresses the technical level, then it conflicts with the Unix philosophy, precisely because the indistinguishability of human and script input is the main reason for the huge software leverage in Unix. If, however, the described desire addresses the user's view, then different technical solutions are more appropriate. The two cases can be regarded simply as two different MH setups. Hence, mapping the problem of different behavior between interactive and automated use on the concept of switching between different profiles, marks it already solved. .H2 "Standard Libraries .P MH is one decade older than the POSIX and ANSI C standards. Hence, MH included its own implementations of functions that were neither standardized nor widely available, back then. Today, twenty years after POSIX and ANSI C were published, developers can expect that systems comply with these standards. In consequence, MH-specific replacements for standard functions can and should be dropped. Kernighan and Pike advise: ``Use standard libraries''. .[ [ kernighan pike practice of programming .], p. 196] Actually, MH had followed this advice in the past, but it had not adjusted to more recent changes in this field. The .Fu snprintf() function, for instance, was standardized with C99 and is available almost everywhere because of its high usefulness.
Thus, the project's own implementation of .Fu snprintf() was dropped in March 2012 in favor of the one in the standard library .Ci 0052f1024deb0a0a2fc2e5bacf93d45a5a9c9b32 . Such decisions limit the portability of mmh if systems do not support these standardized and widespread functions. This compromise is made because mmh focuses on the future. .P .ZZ As I am still in my twenties, I have no programming experience from past decades. I have not followed the evolution of C through time. I have not suffered from the Unix wars. I have not longed for standardization. All my programming experience is from a time when ANSI C and POSIX were well established already. Thus, I needed to learn about the history in retrospect. I have only read a lot of books about the (good) old times. This put me in a difficult position when working with old code. I need to freshly acquire knowledge about old code constructs and ancient programming styles, whereas older programmers know these things by heart from their own experience. Being aware of the situation, I rather let people with more historic experience do the transition from ancient code constructs to standardized ones. Lyndon Nerenberg covered large parts of this task for the nmh project. He converted project-specific functions to POSIX replacements, also removing the conditional compilation of now standardized features. Ken Hornstein and David Levine had their part in this work, as well. Often, I only pulled the changes over from nmh into mmh. These changes include many commits, among them: .Ci 768b5edd9623b7238e12ec8dfc409b82a1ed9e2d .Ci 0052f1024deb0a0a2fc2e5bacf93d45a5a9c9b32 . .P Nevertheless, I worked on the task as well, tidying up the \fIMH standard library\fP, .Fn libmh.a . It is located in the .Fn sbr (``subroutines'') directory in the source tree and includes functions that mmh tools usually need.
Among them are MH-specific functions for profile, context, sequence, and folder handling, but also MH-independent functions, such as auxiliary string functions, portability interfaces and error-checking wrappers for critical functions of the standard library. .BU I have replaced the .Fu atooi() function with calls to .Fu strtoul() , setting the third parameter, the base, to eight. .Fu strtoul() is part of C89 and thus considered safe to use .Ci c490c51b3c0f8871b6953bd0c74551404f840a74 . .BU I removed the project-included fallback implementations of .Fu memmove() and .Fu strerror() .Ci b067ff5c465a5d243ce5a19e562085a9a1a97215 , although Peter Maydell had re-included them into nmh in 2008 to support SunOS 4. Nevertheless, these functions are part of ANSI C. Systems that do not even provide full ANSI C support should not put a load on mmh. .BU The .Fu copy() function copies the string in parameter one to the location in parameter two. In contrast to .Fu strcpy() , it returns a pointer to the terminating null-byte in the destination area. The code was adjusted to replace .Fu copy() with .Fu strcpy() , except within .Fu concat() , where .Fu copy() was more convenient. Therefore, the definition of .Fu copy() was moved into the source file of .Fu concat() and its visibility is limited to that file .Ci 552fd7253e5ee9e554c5c7a8248a6322aa4363bb . .BU The function .Fu r1bindex() had been a generalized version of .Fu basename() with minor differences. As all calls to .Fu r1bindex() had the slash (`\fL/\fP') as delimiter anyway, replacing .Fu r1bindex() with the more specific and better-named function .Fu basename() became desirable. Unfortunately, many of the 54 calls to .Fu r1bindex() depended on a special behavior, which differed from the POSIX specification for .Fu basename() . Hence, .Fu r1bindex() was kept but renamed to .Fu mhbasename() , setting the delimiter to the slash .Ci 240013872c392fe644bd4f79382d9f5314b4ea60 .
For possible uses of .Fu r1bindex() with a different delimiter, the ANSI C function .Fu strrchr() provides the core functionality. .BU .ZZ The .Fu ssequal() function \(en apparently for ``substring equal'' \(en was renamed to .Fu isprefix() , because this is what it actually checked .Ci c20b4fa14515c7ab388ce35411d89a7a92300711 . Its source file had included both of the following comments, no joke. .in -\n(PIu .VS /* * THIS CODE DOES NOT WORK AS ADVERTISED. * It is actually checking if s1 is a PREFIX of s2. * All calls to this function need to be checked to see * if that needs to be changed. Prefix checking is cheaper, so * should be kept if it's sufficient. */ sp .5 /* * Check if s1 is a substring of s2. * If yes, then return 1, else return 0. */ VE .in +\n(PIu Eventually, the function was completely replaced with calls to .Fu strncmp() .Ci b0b1dd37ff515578cf7cba51625189eb34a196cb . .H2 "User Data Locations .P In nmh, a personal setup consists of the MH profile and the MH directory. The profile is a file named .Fn \&.mh_profile in the user's home directory. It contains the static configuration. It also contains the location of the MH directory in the profile entry .Pe Path . The MH directory contains the mail storage and is the first place to search for form files, scan formats, and similar configuration files. The location of the MH directory can be chosen freely by the user. Usually, it is a directory named .Fn Mail in the user's home directory. .P The way MH data is split between profile and MH directory is a legacy. It is only sensible in a situation where the profile is the only configuration file. Why else should the mail storage and the configuration files be intermixed? They are of different kinds: One kind is the data to be operated on and the other kind is the configuration to change how tools operate. Splitting the configuration between the profile and the MH directory is inappropriate, as well. I improved the situation by breaking compatibility.
.P In mmh, personal data is grouped by type. This results in two distinct parts: the mail storage and the configuration. The mail storage directory still contains all the messages, but, with the exception of public sequence files, nothing else. In contrast to nmh, the auxiliary configuration files are no longer located there. Therefore, the directory is no longer called the user's \fIMH directory\fP but the user's \fImail storage\fP. Its location is still user-chosen, with the default name .Fn Mail in the user's home directory. The configuration is grouped together in the hidden directory .Fn \&.mmh in the user's home directory. This \fImmh directory\fP contains the context file, personal forms, scan formats, and the like, but also the user's profile, now named .Fn profile . The path to the profile is no longer .Fn $HOME/.mh_profile but .Fn $HOME/.mmh/profile . (The alternative of having file .Fn $HOME/.mh_profile and a configuration directory .Fn $HOME/.mmh appeared to be inconsistent.) .P The approach chosen for mmh is consistent, simple, and familiar to Unix users. The main achievement of the change is the clear and sensible separation of the mail storage and the configuration. .Ci 7030d7edb099bff36ded7548bb5380f7acab4f9b .P As MH allows users to have multiple MH setups, it is necessary to switch the profile. The profile is the single entry point to access the rest of a personal MH setup. In nmh, the environment variable .Ev MH is used to specify a different profile. To operate in the same MH setup with a separate context, the .Ev MHCONTEXT environment variable is used. This allows having a separate current folder in each terminal at the same time, for instance. In mmh, three environment variables replace the two of nmh. .Ev MMH overrides the default location of the mmh directory (\c .Fn .mmh ). .Ev MMHP and .Ev MMHC override the paths to the profile and context file, respectively.
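.P The lookup order described above can be sketched in C. This is a minimal illustration, not the code of mmh: the names \fImmh_paths\fP, \fIpick()\fP, \fIresolve_mmh_paths()\fP and \fIresolve_mmh_paths_env()\fP are hypothetical, the default context file name ``context'' is an assumption, and so is the rule that an empty variable counts as unset.

```c
#include <stdio.h>
#include <stdlib.h>

#define MMH_PATH_MAX 1024	/* arbitrary limit chosen for this sketch */

/* Hypothetical container for the three resolved locations. */
struct mmh_paths {
	char dir[MMH_PATH_MAX];		/* mmh directory */
	char profile[MMH_PATH_MAX];	/* profile file */
	char context[MMH_PATH_MAX];	/* context file */
};

/*
 * Store the override in dst or, if it is unset or empty
 * (an assumption of this sketch), store "base/name".
 * Returns 0 on success, -1 if the result would not fit.
 */
static int
pick(char *dst, size_t size, const char *override,
		const char *base, const char *name)
{
	int n;

	if (override && *override)
		n = snprintf(dst, size, "%s", override);
	else
		n = snprintf(dst, size, "%s/%s", base, name);
	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/*
 * Resolve the mmh directory, profile, and context file from the
 * values of HOME, MMH, MMHP, and MMHC (any of which may be NULL).
 * The default context file name "context" is assumed.
 */
int
resolve_mmh_paths(struct mmh_paths *p, const char *home,
		const char *mmh, const char *mmhp, const char *mmhc)
{
	if ((!mmh || !*mmh) && !home)
		return -1;	/* no way to locate the mmh directory */
	if (pick(p->dir, sizeof p->dir, mmh, home, ".mmh") != 0)
		return -1;
	if (pick(p->profile, sizeof p->profile, mmhp, p->dir, "profile") != 0)
		return -1;
	return pick(p->context, sizeof p->context, mmhc, p->dir, "context");
}

/* Convenience wrapper that reads the real environment. */
int
resolve_mmh_paths_env(struct mmh_paths *p)
{
	return resolve_mmh_paths(p, getenv("HOME"), getenv("MMH"),
			getenv("MMHP"), getenv("MMHC"));
}
```

.LP Only the three variable names and the default locations of the mmh directory and the profile are taken from the text; the rest merely mirrors the three-variable scheme of mmh.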
This approach allows the set of personal configuration files to be chosen independently of the profile, context, and mail storage. The new approach has no functional disadvantages, as every setup I can imagine can be implemented with both approaches, possibly even more easily with the new one. .Ci 7030d7edb099bff36ded7548bb5380f7acab4f9b .H2 "Modularization .Id modularization .P The source code of the mmh tools is located in the .Fn uip (``user interface programs'') directory. Each tool has a source file with the name of the command. For example, .Pn rmm is built from .Fn uip/rmm.c . Some source files are used for multiple programs. For example, .Fn uip/scansbr.c is used for both .Pn scan and .Pn inc . In nmh, 49 tools were built from 76 source files. This is a ratio of 1.6 source files per program. 32 programs depended on multiple source files; 17 programs depended on one source file only. In mmh, 39 tools are built from 51 source files. This is a ratio of 1.3 source files per program. 18 programs depend on multiple source files; 21 programs depend on one source file only. (These numbers and the ones in the following text ignore the MH library as well as shell scripts and multiple names for the same program.) .P Splitting the source code of a large program into multiple files can increase the readability of its source code, but most of the mmh tools are small and straight-forward programs. With the exception of the MIME handling tools (i.e. .Pn mhbuild , .Pn mhstore , .Pn show , etc.), .Pn pick is the only tool with more than one thousand lines of source code. Splitting programs with less than one thousand lines of code into multiple source files seldom leads to better readability. For such tools, splitting still makes sense when parts of the code are reused in other programs and the reused code fragment is (1) not general enough for including it in the MH library or (2) has dependencies on a library that only few programs need.
.Fn uip/packsbr.c , for instance, provides the core program logic for the .Pn packf and .Pn rcvpack programs. .Fn uip/packf.c and .Fn uip/rcvpack.c mainly wrap the core function appropriately. No other tools use the folder packing functions. As another example, .Fn uip/termsbr.c accesses terminal properties, which requires linking with the \fItermcap\fP or a \fIcurses\fP library. If .Fn uip/termsbr.c is included in the MH library, then every program needs to be linked with termcap or curses, although only few of the programs use the library. .P The task of MIME handling is complex enough that splitting its code into multiple source files improves the readability. The program .Pn mhstore , for instance, is compiled out of seven source files with 2\|500 lines of code in total. The main code file .Fn uip/mhstore.c consists of 800 lines; the other 1\|700 lines are code reused in other MIME handling tools. It seems to be worthwhile to bundle the generic MIME handling code into an MH-MIME library, as a companion to the MH standard library. This is left to be done. .P The work already accomplished focussed on the non-MIME tools. The amount of code compiled into each program was reduced. This eases the understanding of the code base. In nmh, .Pn comp was built from six source files: .Fn comp.c , .Fn whatnowproc.c , .Fn whatnowsbr.c , .Fn sendsbr.c , .Fn annosbr.c , and .Fn distsbr.c . In mmh, it builds from only two: .Fn comp.c and .Fn whatnowproc.c . In nmh's .Pn comp , the core functions of .Pn whatnow , .Pn send , and .Pn anno were all compiled into .Pn comp . This saved the need to execute these programs with the expensive system calls .Fu fork() and .Fu exec() . Whereas this approach improved the time performance, it interwove the source code. Core functionalities were not encapsulated into programs but into functions, which were then wrapped by programs. For example, .Fn uip/annosbr.c included the function .Fu annotate() .
Each program that wanted to annotate messages included the source file .Fn uip/annosbr.c and called .Fu annotate() . Because the function .Fu annotate() was used like the tool .Pn anno , it had seven parameters, reflecting the command line switches of the tool. When another pair of command line switches was added to .Pn anno , a rather ugly hack was implemented to avoid adding another parameter to the function .Ci d9b1d57351d104d7ec1a5621f090657dcce8cb7f . .P In mmh, the relevant code of .Pn comp comprises the two files .Fn uip/comp.c and .Fn uip/whatnowproc.c , together 210 lines of code, whereas in nmh, .Pn comp comprises six files with 2\|450 lines. Not all of the code in these six files is actually used by .Pn comp , but the reader needed to read it all to know which parts are relevant. Understanding nmh's .Pn comp required understanding the inner workings of .Fn uip/annosbr.c first. To be sure to fully understand a program, its whole source code needs to be examined. Not doing so is a leap of faith, assuming that the developers have avoided obscure programming techniques. Here, it should be recalled that information passed in obscure ways through the program's source base, due to the aforementioned hack to save an additional parameter in nmh's .Pn anno . .P In mmh, understanding .Pn comp requires reading only 210 lines of code, whereas the amount is more than ten times as large for nmh's .Pn comp . .P By separating the tools on the program-level, the boundaries are clearly visible, as the interfaces are calls to .Fu exec() rather than arbitrary function calls. Additionally, this kind of separation is more strict because it is technically enforced by the operating system; it can not be simply bypassed with global variables. Good separation simplifies the understanding of program code because the area influenced by any particular statement is small.
As I have read a lot in nmh's code base during the last two years, I have learned about the easy and the difficult parts. In my observation, the understanding of code is enormously eased if the influenced area is small and clearly bounded. .P Yet, the real problem is a different one: Nmh violates the golden ``one tool, one job'' rule of the Unix philosophy. Understanding .Pn comp requires understanding .Fn uip/annosbr.c and .Fn uip/sendsbr.c because .Pn comp annotates and sends messages. In nmh, there surely exist the tools .Pn anno and .Pn send , which cover these jobs, but .Pn comp and .Pn repl and .Pn forw and .Pn dist and .Pn whatnow and .Pn viamail \(en they all (!) \(en have the same annotating and sending functions included, once more. As a result, .Pn comp sends messages without using .Pn send . The situation is the same as if .Pn grep would page its output without using .Pn more just because both programs are part of the same code base. .P The clear separation on the surface of nmh \(en the tool chest approach \(en is violated on the level below. This violation is for the sake of time performance. Decades ago, sacrificing readability and conceptual beauty for speed might have been necessary to prevent MH from being unusably slow, but today this is not the case anymore. No longer should speed improvements that became unnecessary be kept. No longer should readability or conceptual beauty be sacrificed. No longer should the Unix philosophy's ``one tool, one job'' guideline be violated. Therefore, mmh's .Pn comp no longer sends messages. .P In mmh, different jobs are divided among separate programs that invoke each other as needed. In consequence, .Pn comp invokes .Pn whatnow which thereafter invokes .Pn send .Ci 3df5ab3c116e6d4a2fb4bb5cc9dfc5f781825815 .Ci c73c00bfccd22ec77e9593f47462aeca4a8cd9c0 . The clear separation on the surface is maintained on the level below.
Human users and other tools use the same interface \(en annotations, for example, are made by invoking .Pn anno , no matter if requested by programs or by human beings .Ci 469a4163c2a1a43731d412eaa5d9cae7d670c48b .Ci aed384169af5204b8002d06e7a22f89197963d2d .Ci 3caf9e298a8861729ca8b8a84f57022b6f3ea742 . .P .ZZ -1 The decrease in the number of tools built from multiple source files and thus the decrease in .Fn uip/*sbr.c files confirm the improvement .Ci 9e6d91313f01c96b4058d6bf419a8ca9a207bc33 .Ci 81744a46ac9f845d6c2b9908074d269275178d2e .Ci f0f858069d21111f0dbea510044593f89c9b0829 .Ci 0503a6e9be34f24858b55b555a5c948182b9f24b .Ci 27826f9353e0f0b04590b7d0f8f83e60462b90f0 .Ci d1da1f94ce62160aebb30df4063ccbc53768656b .Ci c42222869e318fff5dec395eca3e776db3075455 . This is also visible in the complexity of the build dependency graphs: .sp Nmh: .BP input/deps-nmh.eps .5i .EP .sp Mmh: .BP input/deps-mmh.eps .8i .EP The figures display all program to source file relationships where programs (ellipses) are built from multiple source files (rectangles). The primary source file of each program is omitted from the graph.
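.P The program-level separation discussed in this section can be illustrated with a minimal C sketch of one tool running another through .Fu fork() and .Fu exec() instead of linking its functions. This is not the code of mmh: the names \fIrun_tool()\fP and \fIannotate_by_exec()\fP are hypothetical, as are the folder, message number, component, and text passed to .Pn anno . Only the switches .Sw -component and .Sw -text are taken from the synopsis of .Pn anno .

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run another program and wait for it to finish.
 * Returns its exit status, 127 if it could not be executed,
 * or -1 if fork() or waitpid() failed or the child did not
 * exit normally (e.g. it was killed by a signal).
 */
int
run_tool(char *const argv[])
{
	pid_t pid;
	int status;

	pid = fork();
	if (pid == -1)
		return -1;
	if (pid == 0) {
		execvp(argv[0], argv);
		_exit(127);	/* only reached if execvp() failed */
	}
	while (waitpid(pid, &status, 0) == -1) {
		if (errno != EINTR)
			return -1;
	}
	if (WIFEXITED(status))
		return WEXITSTATUS(status);
	return -1;
}

/*
 * Hypothetical caller: annotate message 42 in +inbox by running
 * the anno program. All argument values are made up for
 * illustration.
 */
int
annotate_by_exec(void)
{
	char *argv[] = {
		"anno", "+inbox", "42",
		"-component", "X-Note", "-text", "example", NULL
	};

	return run_tool(argv);
}
```

.LP The only interface between caller and callee in this sketch is the argument vector and the exit status, so neither side can reach into the other's global variables.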