view discussion.roff @ 189:22feb390ccc4

Applied suggestions by Lydi.
author markus schnalke <meillo@marmaro.de>
date Wed, 11 Jul 2012 15:53:53 +0200
parents 5360f5fdb118
children 05a243dffaca

.H0 "Discussion
.P
This main chapter discusses the practical work accomplished in the
mmh project.
It is structured along the goals set for the project.
The concrete work undertaken is described by means of examples
that show how the general goals were achieved.
The discussion compares the current version of mmh with the state of
nmh just before the mmh project started, i.e. fall 2011.
Current changes in nmh will be mentioned only as side notes.
.\" XXX where do I discuss the parallel development of nmh?
.P
For the reader's convenience, the structure of modern email systems
is depicted in the figure.
It illustrates the path a message takes from sender to recipient.
.sp
.KS
.in 2c
.so input/mail-agents.pic
.KE
.sp
.LP
The ellipses denote mail agents, i.e. different jobs in email processing:
.IP "Mail User Agent (MUA)
The only program the user interacts with directly.
It includes functions to compose new mail, display received mail,
and to manage the mail storage.
Also called \fImail client\fP.
.IP "Mail Submission Agent (MSA)
A special kind of Mail Transfer Agent, used to submit mail into the
mail transport system.
.IP "Mail Transfer Agent (MTA)
A node in the mail transport system.
Transfers incoming mail to a transport node nearer to the final destination.
It may be the final destination itself.
.IP "Mail Delivery Agent (MDA)
Delivers mail by storing it onto disk, usually according to a set of rules.
.IP "Mail Retrieval Agent (MRA)
Initiates the transfer of mail from a remote server to the local machine.
(The dashed arrow represents the pull request.)
.P
The dashed boxes represent groups that usually reside on single machines.
The box on the lower left represents the sender's local system.
The box on the upper left represents the first mail transfer node.
The box on the upper right represents the transfer node responsible for the
destination address.
The box on the lower right represents the recipient's local system.
Often, the boxes above the dotted line are servers on the Internet.
Many mail clients, including nmh, have all of the components below
the dotted line implemented.
Not so in mmh, which is an MUA only.






.\" --------------------------------------------------------------
.H1 "Streamlining

.P
MH once provided everything necessary for email handling.
The community around nmh has a similar understanding that nmh should
provide a complete email system.
In fundamental contrast, mmh shall be an MUA only.
I believe that the development of all-in-one mail systems is obsolete.
Today, email is too complex to be fully covered by a single project.
Such a project will not be able to excel in all aspects.
Instead, the aspects of email should be covered by multiple projects,
which then can be combined to form a complete system.
Excellent implementations for the various aspects of email already exist.
Just to name three examples: Postfix is a specialized MTA,
.\" XXX homepages verlinken
Procmail is a specialized MDA, and Fetchmail is a specialized MRA.
I believe that it is best to use such specialized tools instead of
providing the same function again as a side-component in the project.
.\" XXX mail agent picture here
.P
Doing something well requires focusing on a small set of specific aspects.
Under the assumption that development focused on a particular area
produces better results there, specialized projects will be superior
in their field of focus.
Hence, all-in-one mail system projects \(en no matter if monolithic
or modular \(en will never be the best choice in any of the fields.
Even in providing the best consistent all-in-one system, they are likely
to be beaten by projects that focus only on integrating existing mail
components to create a homogeneous system.
.P
The limiting resource in the community development of free software
is usually manpower.
.\" XXX FIXME ref!
If the development power is spread over a large development area,
it becomes even more difficult to compete with the specialists in the
various fields.
The concrete situation for MH-based mail systems is even tougher,
given their small and aged community, concerning both developers and users.
.P
In consequence, I believe that the available development resources
should focus on the point where MH is most unique.
This is clearly the user interface \(en the MUA.
Peripheral parts should be removed to streamline mmh for the MUA task.


.H2 "Mail Transfer Facilities
.Id mail-transfer-facilities
.P
In contrast to nmh, which also provides mail submission and mail retrieval
agents, mmh is an MUA only.
This general difference initiated the development of mmh.
The removal of the mail transfer facilities was the first work task
in the mmh project.
.P
Focusing on one mail agent role only is motivated by Eric Allman's
experience with Sendmail.
He identified the limitation of Sendmail to the MTA task as one reason for
its success:
.[ [
costales sendmail
.], p. xviii]
.QS
Second, I limited myself to the routing function \(en
I wouldn't write user agents or delivery back-ends.
This was a departure from the dominant thought of the time,
in which routing logic, local delivery, and often the network code
were incorporated directly into the user agents.
.QE
.P
In nmh, the MSA is called \fIMessage Transfer Service\fP (MTS).
This facility, implemented by the
.Pn post
command, established network connections and spoke SMTP to submit
messages to be relayed to the outside world.
The changes in email demanded changes in this part of nmh as well.
Encryption and authentication for network connections
needed to be supported, hence TLS and SASL were introduced into nmh.
This added complexity to nmh without improving it in its core functions.
Also, keeping up with recent developments in the field of
mail transfer requires development power and specialists.
In mmh, this whole facility was simply cut off.
.Ci f6aa95b724fd8c791164abe7ee5468bf5c34f226
.Ci fecd5d34f65597a4dfa16aeabea7d74b191532c3
.Ci 156d35f6425bea4c1ed3c4c79783dc613379c65b
Instead, mmh depends on an external MSA.
The only outgoing interface available to mmh is the
.Pn sendmail
command, which almost any MSA provides.
If not, a wrapper program can be written.
It must read the message from the standard input, extract the
recipient addresses from the message header, and hand the message
over to the MSA.
For example, a wrapper script for qmail would be:
.VS
#!/bin/sh
exec qmail-inject  # ignore command line arguments
VE
The requirement to parse the recipient addresses out of the message header 
is likely to be removed in the future.
Then mmh would pass the recipient addresses as command line arguments.
This appears to be the better interface.
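.P
For an MSA that expects the recipient addresses as command line
arguments, the wrapper has to do the header parsing itself.
The following is a minimal sketch of such a wrapper, not a complete
address parser: it handles folded header lines and addresses in angle
brackets, but it ignores quoted strings, comments, and address groups.
The function names
.Pn extract_recipients
and
.Pn submit_via
are invented for this illustration.

```shell
#!/bin/sh
# Sketch of a wrapper for an MSA that wants the recipients as arguments.
# Simplified on purpose: no quoted strings, comments, or address groups.

# Print the addresses found in the To, Cc and Bcc fields of the message
# header read from standard input, one address per line.
extract_recipients() {
	awk '
		/^$/ { exit }              # the header ends at the first empty line
		/^[ \t]/ {                 # folded line: continues the previous field
			if (keep) buf = buf $0
			next
		}
		{ keep = 0 }
		tolower($0) ~ /^(to|cc|bcc):/ {
			keep = 1
			sub(/^[^:]*:/, "")     # strip the field name
			buf = buf "," $0
		}
		END {
			n = split(buf, list, ",")
			for (i = 1; i <= n; i++) {
				addr = list[i]
				# Prefer the part in angle brackets, if there is one.
				if (match(addr, /<[^>]*>/))
					addr = substr(addr, RSTART + 1, RLENGTH - 2)
				gsub(/[ \t]/, "", addr)
				if (addr != "") print addr
			}
		}
	'
}

# Read a message from standard input and hand it to the command given
# as arguments, with the extracted recipients appended.
submit_via() {
	msg=$(mktemp) || return 1
	cat >"$msg"
	# The unquoted substitution is intended: one argument per address.
	"$@" $(extract_recipients <"$msg") <"$msg"
	status=$?
	rm -f "$msg"
	return "$status"
}
```

A wrapper script built on this sketch would end with a call such as
.Cl "submit_via some-msa" ,
where
.Pn some-msa
is a placeholder for the real submission command.
The message is stored in a temporary file because it has to be read
twice: once to extract the recipients and once to feed the MSA.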
.\" XXX implement it
.P
To retrieve mail, the
.Pn inc
command acted as an MRA.
It established network connections
and spoke POP3 to retrieve mail from remote servers.
As with mail submission, the network connections required encryption and
authentication, thus TLS and SASL were added.
Support for message retrieval through IMAP will soon become a necessary
addition, too, and likewise support for any other changes in mail transfer.
Not so for mmh because it has dropped the support for retrieving mail
from remote locations.
.Ci ab7b48411962d26439f92f35ed084d3d6275459c
Instead, it depends on an external tool to cover this task.
Mmh has two paths for messages to enter mmh's mail storage:
(1) Mail can be incorporated with
.Pn inc
from the system maildrop, or (2) with
.Pn rcvstore
by reading them, one at a time, from the standard input.
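.P
The second path is the natural hook for external tools.
As a hedged illustration, assume that some retrieval tool has left
one message per file in a directory and that
.Pn rcvstore
is found in the search path.
The function name
.Pn store_each
and the variable
.Ev RCVSTORE
are invented for this sketch; they are not part of mmh.

```shell
# Feed every regular file in the directory $1 to rcvstore,
# one message per invocation, filing it into +inbox.
# RCVSTORE is a hypothetical override, handy for dry runs.
store_each() {
	for f in "$1"/*; do
		[ -f "$f" ] || continue
		${RCVSTORE:-rcvstore} +inbox <"$f" || return 1
	done
}
```

Calling
.Pn rcvstore
once per file mirrors the constraint stated above:
each invocation reads exactly one message from the standard input.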
.P
With the removal of the MSA and MRA, mmh converted from an all-in-one
mail system to being an MUA only.
Now, of course, mmh depends on third-party software.
An external MSA is required to transfer mail to the outside world;
an external MRA is required to retrieve mail from remote machines.
Excellent implementations of such software exist,
which are likely superior to the internal versions.
Additionally, the best-suited programs can be freely chosen.
.P
As it had already been possible to use an external MSA or MRA,
why not keep the internal version for convenience?
.\" XXX ueberleitung
The question whether there is sense in having a fall-back pager in all
the command line tools, for the cases when
.Pn more
or
.Pn less
are not available, appears to be ridiculous.
Of course, MSAs and MRAs are more complex than text pagers
and not necessarily available but still the concept of orthogonal
design holds: ``Write programs that do one thing and do it well.''
.[
mcilroy unix phil
p. 53
.]
.[
mcilroy bstj foreword
.]
Here, this part of the Unix philosophy was applied not only
to the programs but to the project itself.
In other words:
Develop projects that focus on one thing and do it well.
Projects which have grown complex should be split, for the same
reasons that programs which have grown complex should be split.
If it is conceptually more elegant to have the MSA and MRA as
separate projects then they should be separated.
In my opinion, this is the case here.
The RFCs propose this separation by clearly distinguishing the different
mail handling tasks.
.[
rfc 821
.]
The small interfaces between the mail agents support the separation.
.P
Email was once small and simple.
At that time,
.Pn /bin/mail
covered everything there was to email and still was small and simple.
Later, the essential complexity of email increased.
(Essential complexity is the complexity defined by the problem itself.\0
.[[
brooks no silver bullet
.]])
Email systems reacted to this change: they grew.
RFCs started to introduce the concept of mail agents to separate the
various tasks because they became more extensive and new tasks appeared.
As the mail systems grew even more, parts were split off.
For instance, a POP server was included in the original MH;
it was removed in nmh.
Now is the time to go one step further and split off the MSA and MRA, too.
Not only does this decrease the code size of the project,
more importantly, it unburdens mmh of the whole field of
message transfer with all its implications for the project.
There is no more need for concern with changes in network transfer.
This independence is gained by depending on an external program
that covers the field.
Today, this is a reasonable exchange.
.P
.\" XXX ueberleitung ???
Functionality can be added in three different ways:
.LI 1
Implementing the function in the project itself.
.LI 2
Depending on a library that provides the function.
.LI 3
Depending on a program that provides the function.
.LP
.\" XXX Rework sentence
While implementing the function in the project itself leads to the
largest increase in code size and requires the most maintenance
and development work,
it increases the project's independence of other software the most.
Using libraries or external programs requires less maintenance work
but introduces dependencies on external software.
Programs have the smallest interfaces and provide the best separation,
but possibly limit the information exchange.
External libraries are more strongly connected than external programs,
thus information can be exchanged in a more flexible manner.
Adding code to a project increases maintenance work.
.\" XXX ref
Implementing complex functions in the project itself adds
a lot of code.
This should be avoided if possible.
Hence, the dependencies only change in their character,
not in their existence.
In mmh, library dependencies on
.Pn libsasl2
and
.Pn libcrypto /\c
.Pn libssl
were traded against program dependencies on an MSA and an MRA.
This also meant trading build-time dependencies against run-time
dependencies.
Besides providing stronger separation and greater flexibility,
program dependencies also allowed
over 6\|000 lines of code to be removed from mmh.
This made mmh's code base about 12\|% smaller.
Reducing the project's code size by such an amount without actually
losing functionality is a convincing argument.
Actually, as external MSAs and MRAs are likely superior to the
project's internal versions, the common user even gains functionality.
.P
Users of MH should not have problems setting up an external MSA and MRA.
Also, the popular MSAs and MRAs have large communities and a lot
of available documentation.
Choices for MSAs range from full-featured MTAs such as
.\" XXX refs
.I Postfix ,
over mid-size MTAs such as
.I masqmail
and
.I dma ,
to small forwarders such as
.I ssmtp
and
.I nullmailer .
Choices for MRAs include
.I fetchmail ,
.I getmail ,
.I mpop
and
.I fdm .


.H2 "Non-MUA Tools
.P
One goal of mmh is to remove the tools that are not part of the MUA's task.
Furthermore, any tools that do not significantly improve the MUA's job
should be removed.
Loosely related and rarely used tools distract from the lean appearance.
They require maintenance work without adding much to the core task.
By removing these tools, the project shall become more streamlined
and focused.
In mmh, the following tools are not available anymore:
.BU
.Pn conflict
was removed
.Ci 8b235097cbd11d728c07b966cf131aa7133ce5a9
because it is a mail system maintenance tool that is not MUA-related.
It even checked
.Fn /etc/passwd
and
.Fn /etc/group
for consistency, which is completely unrelated to email.
A tool like
.Pn conflict
is surely useful, but it should not be shipped with mmh.
.\" XXX historic reasons?
.BU
.Pn rcvtty
was removed
.Ci 14767c94b3827be7c867196467ed7aea5f6f49b0
because its use case of writing to the user's terminal
on receipt of mail is obsolete.
If users like to be informed of new mail, the shell's
.Ev MAILPATH
variable or graphical notifications are technically more appealing.
Writing directly to terminals is hardly ever desired today.
If, though, one prefers this approach, the standard tool
.Pn write
can be used in a way similar to:
.VS
scan -file - | write `id -un`
VE
.BU
.Pn viamail
.\" XXX was macht viamail
was removed
.Ci eda72d6a7a7c20ff123043fb7f19c509ea01f932
when the new attachment system was activated, because
.Pn forw
could then cover the task itself.
The program
.Pn sendfiles
was rewritten as a shell script wrapper around
.Pn forw .
.Ci 0e82199cf3c991a173e0ac8aa776efdb3ded61e6
.BU
.Pn msgchk
.\" XXX was macht msgchk
was removed
.Ci bb9360ead7eb7a3fedcce2eeedfc660014e41dbe ,
because it lost its use case when POP support was removed.
A call to
.Pn msgchk
provided hardly more information than:
.VS
ls -l /var/mail/meillo
VE
It did distinguish between old and new mail, but
these details can be retrieved with
.Pn stat (1),
too.
A small shell script could be written to print the information
in a similar way, if truly necessary.
As mmh's
.Pn inc
only incorporates mail from the user's local maildrop,
and thus no data transfers over slow networks are involved,
there is hardly any need to check for new mail before incorporating it.
.BU
.Pn msh
was removed
.Ci 916690191222433a6923a4be54b0d8f6ac01bd02
because the tool was in conflict with the philosophy of MH.
It provided an interactive shell to access the features of MH,
but it was not just a shell tailored to the needs of mail handling.
Instead, it was one large program that had several MH tools built in.
This conflicts with the major feature of MH of being a tool chest.
.Pn msh 's
main use case had been accessing Bulletin Boards, which have ceased to
be popular.
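.P
The small replacement script mentioned for
.Pn msgchk
above could look like the following sketch.
It only reports whether the maildrop is non-empty; it does not
distinguish between old and new mail.
The default maildrop path and the function name
.Pn maildrop_status
are assumptions made for the illustration.

```shell
# Report whether the maildrop holds mail.
# $1 is the maildrop file; the default path is an assumption.
maildrop_status() {
	drop=${1:-/var/mail/$(id -un)}
	if [ -s "$drop" ]; then
		echo "mail waiting in $drop"
	else
		echo "no mail waiting"
	fi
}
```

The test
.Cl "[ -s file ]"
is true only for an existing file of non-zero size,
so a missing maildrop is reported the same way as an empty one.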
.P
Removing
.Pn msh
together with the truly archaic code relics
.Pn vmh
and
.Pn wmh
saved more than 7\|000 lines of C code \(en
about 15\|% of the project's original source code.
Having less code \(en with equal readability, of course \(en
for the same functionality is an advantage.
Less code means fewer bugs and less maintenance work.
As
.Pn rcvtty
and
.Pn msgchk
are assumed to be rarely used and can be implemented in different ways,
why should one keep them?
Removing them streamlines mmh.
.Pn viamail 's
use case is now partly obsolete and partly covered by
.Pn forw ,
hence there's no reason to still maintain it.
.Pn conflict
is not related to the mail client, and
.Pn msh
conflicts with the basic concept of MH.
These two tools might still be useful, but they should not be part of mmh.
.P
Finally, there is
.Pn slocal .
.Pn slocal
is an MDA and thus not directly MUA-related.
It should be removed from mmh, because including it conflicts with
the idea that mmh is an MUA only.
.Pn slocal
should rather become a separate project.
However,
.Pn slocal
provides rule-based processing of messages, like filing them into
different folders, which is otherwise not available in mmh.
Although
.Pn slocal
neither pulls in dependencies, nor does it include a separate
technical area (cf. Sec.
.Cf mail-transfer-facilities ),
it still accounts for about 1\|000 lines of code that need to be maintained.
As
.Pn slocal
is almost self-standing, it should be split off into a separate project.
This would cut the strong connection between the MUA mmh and the MDA
.Pn slocal .
For anyone not using MH,
.Pn slocal
would become yet another independent MDA, like
.I procmail .
Then
.Pn slocal
could be installed without the complete MH system.
Likewise, mmh users could decide to use
.I procmail
without having a second, unused MDA,
.Pn slocal ,
installed.
That appears to be conceptually the best solution.
Yet,
.Pn slocal
is not split off.
I defer the decision over
.Pn slocal
out of a need for deeper investigation.
In the meantime, it remains part of mmh.
However, its continued existence is not significant because
.Pn slocal
is unrelated to the rest of the project.



.H2 "Displaying Messages
.Id mhshow
.P
Since the very beginning, already in the first concept paper,
.\" XXX ref!!!
.Pn show
had been MH's message display program.
.Pn show
mapped message numbers and sequences to files and invoked
.Pn mhl
to have the files formatted.
With MIME, this approach was not sufficient anymore.
MIME messages can consist of multiple parts.
Some parts are not directly displayable and text content might be
encoded in foreign charsets.
.Pn show 's
understanding of messages and
.Pn mhl 's
display capabilities could not cope with the task any longer.
.P
Instead of extending these tools, additional tools were written from
scratch and added to the MH tool chest.
Doing so is encouraged by the tool chest approach.
Modular design is a great advantage for extending a system,
as new tools can be added without interfering with existing ones.
First, the new MIME features were added in the form of a single program
.Pn mhn .
The command
.Cl "mhn -show 42
would show the MIME message numbered 42.
With the 1.0 release of nmh in February 1999, Richard Coleman finished
the split of
.Pn mhn
into a set of specialized tools, which together covered the
multiple aspects of MIME.
One of them was
.Pn mhshow ,
which replaced
.Cl "mhn -show" .
It was capable of displaying MIME messages appropriately.
.P
From then on, two message display tools were part of nmh,
.Pn show
and
.Pn mhshow .
To ease the life of users,
.Pn show
was extended to automatically hand the job over to
.Pn mhshow
if displaying the message would be beyond
.Pn show 's
abilities.
In consequence, the user would simply invoke
.Pn show
(possibly through
.Pn next
or
.Pn prev )
and get the message printed with either
.Pn show
or
.Pn mhshow ,
whatever was more appropriate.
.P
Having two similar tools for essentially the same task is redundant.
Usually, users would not distinguish between
.Pn show
and
.Pn mhshow
in their daily mail reading.
Having two separate display programs was therefore mainly unnecessary
from a user's point of view.
Besides, the development of both programs needed to be in sync,
to ensure that the programs behaved in a similar way,
because they were used like a single tool.
Different behavior would have surprised the user.
.P
Today, non-MIME messages are rather seen as a special case of
MIME messages, although historically it is the other way round.
As
.Pn mhshow
had already been able to display non-MIME messages, it appeared natural
to drop
.Pn show
in favor of using
.Pn mhshow
exclusively.
.Ci 4c1efddfd499300c7e74263e57d8aa137e84c853
Removing
.Pn show
is no loss in function, because functionally
.Pn mhshow
covers it completely.
The old behavior of
.Pn show
can still be emulated with the simple command line:
.VS
mhl `mhpath c`
VE
.P
For convenience,
.Pn mhshow
was renamed to
.Pn show
after
.Pn show
was gone.
It is clear that such a rename may confuse future developers when
trying to understand the history.
Nevertheless, I consider the convenience on the user's side,
to call
.Pn show
when they want a message to be displayed, to outweigh the inconvenience
on the developer's side when understanding the project history.
.P
To prepare for the transition,
.Pn mhshow
was reworked to behave more like
.Pn show
first.
(cf. Sec.
.Cf mhshow )
.\" XXX code commits?
Once the tools behaved more alike, the replacement appeared to be
even more natural.
Today, mmh's new
.Pn show
has become the one single message display program once more,
with the difference
that today it handles MIME messages as well as non-MIME messages.
The outcome of the transition is one program less to maintain,
no second display program for users to deal with,
and less system complexity.
.P
Still, removing the old
.Pn show
hurts in one regard: It had been such a simple program.
Its lean elegance is missing from the new
.Pn show ,
.\" XXX
however there is no alternative;
supporting MIME demands higher essential complexity.

.ig
XXX
Consider including text on scan listings here

Scan listings shall not contain body content. Hence, removed this feature.
Scan listings shall operate on message headers and non-message information
only. Displaying the beginning of the body complicates everything too much.
That's no surprise, because it's something completely different. If you
want to examine the body, then use show(1)/mhshow(1).
Changed the default scan formats accordingly.
.Ci 70b2643e0da8485174480c644ad9785c84f5bff4
..




.H2 "Configure Options
.P
Customization is a double-edged sword.
It allows better-suited setups, but not for free.
There is the cost of code complexity to be able to customize.
There is the cost of less tested setups, because there are
more possible setups and especially corner cases.
Additionally, there is the cost of choice itself.
The code complexity directly affects the developers.
Less tested code affects both users and developers.
The problem of choice affects the users, for one by having to choose,
but also by more complex interfaces that require more documentation.
Whenever options add few advantages but increase the complexity of the
system, they should be considered for removal.
I have reduced the number of project-specific configure options from 
fifteen to three.

.U3 "Mail Transfer Facilities
.P
With the removal of the mail transfer facilities, five configure
options vanished:
.P
The switches
.Sw --with-tls
and
.Sw --with-cyrus-sasl
had activated the support for transfer encryption and authentication.
.\" XXX cf
.\" XXX gruende kurz wiederholen
This is not needed anymore.
.Ci fecd5d34f65597a4dfa16aeabea7d74b191532c3
.Ci 156d35f6425bea4c1ed3c4c79783dc613379c65b
.P
.\" XXX cf
.\" XXX ``For the same reason ...''
The configure switch
.Sw --enable-pop
activated the message retrieval facility.
The code area that would be conditionally compiled in for TLS and SASL
support had been small.
The conditionally compiled code area for POP support had been much larger.
Whereas the code base would change only slightly on toggling
TLS or SASL support, it changed considerably on toggling POP support.
The changes in the code base could hardly be kept track of.
By having POP support togglable, a second code base had been created,
one that needed to be tested.
This situation is basically similar for the conditional TLS and SASL
code, but there the changes are minor and can still be kept track of.
Still, conditional compilation of a code base creates variations
of the original program.
More variations require more testing and maintenance work.
.P
Two other options only specified default configuration values:
.Sw --with-mts
defined the default transport service.
.Ci f6aa95b724fd8c791164abe7ee5468bf5c34f226
With
.Sw --with-smtpservers
default SMTP servers could be specified.
.Ci 128545e06224233b7e91fc4c83f8830252fe16c9
Both of them became irrelevant when the SMTP transport service was removed.
.\" XXX code ref
In mmh, all messages are handed over to
.Pn sendmail
for transportation.


.U3 "Backup Prefix
.P
The backup prefix is the string that was prepended to message
filenames to tag them as deleted.
By default it had been the comma character (`\fL,\fP').
.\" XXX Zeitlich ordnen
In July 2000, Kimmo Suominen introduced
the configure option
.Sw --with-hash-backup
to change the default to the hash character `\f(CW#\fP'.
The choice was probably personal preference, because at first, the
option was named
.Sw --with-backup-prefix
and had the prefix character as argument.
But giving the hash character as argument caused too many problems
for Autoconf,
thus the option was limited to use the hash character as the default prefix.
This supports the assumption that the choice for the hash was
personal preference only.
Being related or not, words that start with the hash character
introduce a comment in the Unix shell.
Thus, the command line
.Cl "rm #13 #15
calls
.Pn rm
without arguments because the first hash character starts the comment
that reaches until the end of the line.
To delete the backup files,
.Cl "rm ./#13 ./#15"
needs to be used.
Using the hash as backup prefix can be seen as a precaution against
data loss.
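.P
The comment behavior described above can be observed directly
in the shell.
In the following sketch, the helper
.Pn count_args
(a name invented for this illustration) prints the number of
arguments it receives.
Calling
.Pn unprotected
prints 0, because both words are swallowed by the comment, whereas
.Pn protected
prints 2.

```shell
# Print the number of arguments received.
count_args() {
	echo "$#"
}

# The first hash starts a comment: count_args gets no arguments.
unprotected() {
	count_args #13 #15
}

# A hash inside a word is an ordinary character: two arguments.
protected() {
	count_args ./#13 ./#15
}
```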
.P
First, I removed the configure option but added the profile entry
.Pe backup-prefix ,
which allows specifying an arbitrary string as backup prefix.
.Ci 6c40d481d661d532dd527eaf34cebb6d3f8ed086
Profile entries are the common method to change mmh's behavior.
This change did not remove the choice but moved it to a location where
it suited better.
.P
Eventually, however, the new trash folder concept
(cf. Sec.
.Cf trash-folder )
removed the need for the backup prefix completely.
.Ci 8edc5aaf86f9f77124664f6801bc6c6cdf258173
.Ci ca0b3e830b86700d9e5e31b1784de2bdcaf58fc5


.U3 "Editor and Pager
.P
The two configure options
.CW --with-editor=EDITOR
and
.CW --with-pager=PAGER
were used to specify the default editor and pager at configure time.
Doing so at configure time made sense in the eighties,
when the set of available editors and pagers varied much across
different systems.
Today, the situation is more homogeneous.
The programs
.Pn vi
and
.Pn more
can be expected to be available on every Unix system,
as they have been specified by POSIX for two decades.
(The specifications for
.Pn vi
and
.Pn more
appeared in
.[
posix 1987
.]
and
.[
posix 1992
.]
respectively.)
As a first step, these two tools were hard-coded as defaults.
.Ci 5d43a99db70c12a673028c7758c20cbe3e13ef5f
Not changed were the
.Pe editor
and
.Pe moreproc
profile entries, which allowed the user to override the system defaults.
Later, the concept was reworked to respect the standard environment
variables
.Ev VISUAL
and
.Ev PAGER
if they are set.
Today, mmh determines the editor to use in the following order,
taking the first available and non-empty item:
.LI 1
Environment variable
.Ev MMHEDITOR
.LI 2
Profile entry
.Pe Editor
.LI 3
Environment variable
.Ev VISUAL
.LI 4
Environment variable
.Ev EDITOR
.LI 5
Command
.Pn vi .
.LP
.Ci f85f4b7ae62e3d05a945dcd46ead51f0a2a89a9b
.P
The pager to use is determined in a similar order,
also taking the first available and non-empty item:
.LI 1
Environment variable
.Ev MMHPAGER
.LI 2
Profile entry
.Pe Pager
(replaces
.Pe moreproc )
.LI 3
Environment variable
.Ev PAGER
.LI 4
Command
.Pn more .
.LP
.Ci 0c4214ea2aec6497d0d67b436bbee9bc1d225f1e
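.P
The two lookup orders above can be sketched in shell.
This is an illustration of the precedence only, not mmh's actual
implementation, which is written in C.
The function names are invented, the profile entry is passed in as
the first argument because the sketch does not read the profile,
and the sketch checks only for non-empty values, not for whether
the named program is installed.

```shell
# Print the first non-empty argument.
first_nonempty() {
	for candidate in "$@"; do
		if [ -n "$candidate" ]; then
			echo "$candidate"
			return 0
		fi
	done
	return 1
}

# $1 stands for the value of the Editor profile entry, empty if unset.
choose_editor() {
	first_nonempty "${MMHEDITOR:-}" "$1" "${VISUAL:-}" "${EDITOR:-}" vi
}

# $1 stands for the value of the Pager profile entry, empty if unset.
choose_pager() {
	first_nonempty "${MMHPAGER:-}" "$1" "${PAGER:-}" more
}
```

Because the hard-coded command comes last in each list,
both functions always produce a result.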
.P
By respecting the
.Ev VISUAL /\c
.Ev EDITOR
and
.Ev PAGER
environment variables,
the new behavior conforms better to the common style on Unix systems.
Additionally, the new approach is more uniform and clearer to users.


.U3 "ndbm
.P
.Pn slocal
used to depend on
.I ndbm ,
a database library.
The database is used to store the `\fLMessage-ID\fP's of all
messages delivered.
This enables
.Pn slocal
to suppress delivering the same message to the same user twice.
(This feature was enabled by the
.Sw -suppressdup
switch.)
.P
A variety of versions of the database library exist.
.[
wolter unix incompat notes dbm
.]
Complicated autoconf code was needed to detect them correctly.
Furthermore, the configure switches
.Sw --with-ndbm=ARG
and
.Sw --with-ndbmheader=ARG
were added to help with difficult setups that would
not be detected automatically or correctly.
.P
By removing the suppress duplicates feature of
.Pn slocal ,
the dependency on
.I ndbm
vanished and 120 lines of complex autoconf code could be saved.
.Ci ecd6d6a20cb7a1507e3a20d6c4cb3a1cf14c6bbf
The change removed functionality too, but that is minor compared to the
improvement gained by dropping the dependency and the complex autoconf code.
.\" XXX argument: slocal ist sowieso nicht teil vom mmh kern

.U3 "mh-e Support
.P
The configure option
.Sw --disable-mhe
was removed when the mh-e support was reworked. 
Mh-e is the Emacs front-end to MH.
It requires MH to provide minor additional functions.
The
.Sw --disable-mhe
configure option could switch these extensions off.
After removing the support for old versions of mh-e,
only the
.Sw -build
switches of
.Pn forw
and
.Pn repl
are left to be mh-e extensions.
They are now always built in because they add little code and complexity.
In consequence, the
.Sw --disable-mhe
configure option was removed.
.Ci a7ce7b4a580d77b6c2c4d980812beb589aa4c643
Removing the option removed a second code setup that would have
needed to be tested.
.\" XXX datum?
This change was first accomplished in nmh and thereafter merged into mmh.
.P
The interface changes in mmh require mh-e to be adjusted in order
to be able to use mmh as back-end.
This will require minor changes to mh-e, but removing the
.Sw -build
switches would require more rework.

.U3 "Masquerading
.P
The configure option
.Sw --enable-masquerade
could take up to three arguments:
`draft_from', `mmailid', and `username_extension'.
They activated different types of address masquerading.
All of them were implemented in the SMTP-speaking
.Pn post
command, which provided an MSA.
Address masquerading is an MTA's task and mmh does not cover
this field anymore.
Hence, true masquerading needs to be implemented in the external MTA.
.P
The
.I mmailid
masquerading type is the oldest one of the three and the only one
available in the original MH.
It provided a
.I username
to
.I fakeusername
mapping, based on the password file's GECOS field.
The man page
.Mp mh-tailor (5)
described the use case as being the following:
.QS
This is useful if you want the messages you send to always
appear to come from the name of an MTA alias rather than your
actual account name.  For instance, many organizations set up
`First.Last' sendmail aliases for all users.  If this is
the case, the GECOS field for each user should look like:
``First [Middle] Last <First.Last>''
.QE
.P
As mmh sends outgoing mail via the local MTA only,
the best location to do such global rewrites is there.
Besides, the MTA is conceptually the right location because it
does the reverse mapping for incoming mail (aliasing), too.
Furthermore, masquerading set up there is readily available for all
mail software on the system.
Hence, mmailid masquerading was removed.
.Ci 0836c8000ccb34b59410ef1c15b1b7feac70ce5f
.P
The
.I username_extension
masquerading type did not replace the username but would append a suffix,
specified by the
.Ev USERNAME_EXTENSION
environment variable, to it.
This provided support for the
.I user-extension
feature of qmail and the similar
.I "plussed user
processing of sendmail.
The decision to remove this username_extension masquerading was
motivated by the fact that
.Pn spost
had not supported it anyway.
.Ci 2abae0bfd0ad5bf898461e50aa4b466d641f23d9
Username extensions are possible in mmh, but less convenient to use.
.\" XXX covered by next paragraph
.\" XXX format file %(getenv USERNAME_EXTENSION)
.P
The
.I draft_from
masquerading type instructed
.Pn post
to use the value of the
.Hd From
header field as SMTP envelope sender.
Sender addresses could be replaced completely.
.Ci b14ea6073f77b4359aaf3fddd0e105989db9
Mmh offers a kind of masquerading similar in effect, but
with technical differences.
As mmh does not transfer messages itself, the local MTA has final control
over the sender's address. Any masquerading mmh introduces may be reverted
by the MTA.
In times of pedantic spam checking, an MTA will take care to use
sensible envelope sender addresses to keep its own reputation up.
Nonetheless, the MUA can set the
.Hd From
header field and thereby propose
a sender address to the MTA.
The MTA may then decide to take that one or generate the canonical sender
address for use as envelope sender address.
.P
In mmh, the MTA will always extract the recipient and sender from the
message header (\c
.Pn sendmail 's
.Sw -t
switch).
The
.Hd From
header field of the draft may be set arbitrarily by the user.
If it is missing, the canonical sender address will be generated by the MTA.

.U3 "Remaining Options
.P
Two configure options remain in mmh.
One is the locking method to use:
.Sw --with-locking=[dot|fcntl|flock|lockf] .
The idea of removing all methods except the portable dot locking
and having that one as the default is appealing, but this change
requires deeper technical investigation into the topic.
The other option,
.Sw --enable-debug ,
compiles the programs with debugging symbols and does not strip them.
This option is likely to stay.




.H2 "Command Line Switches
.P
The command line switches of MH tools follow the X Window style.
.\" XXX ref
They are words, introduced by a single dash.
For example:
.Cl "-truncate" .
Every program in mmh has two generic switches:
.Sw -help ,
to print a short message on how to use the program, and 
.Sw -Version
(with capital `V'), to tell what version of mmh the program belongs to.
.P
Switches change the behavior of programs.
Programs that do one thing in one way require no switches.
In most cases, doing something in exactly one way is too limiting.
If there is basically one task to accomplish, but it should be done
in various ways, switches are a good approach to alter the behavior
of a program.
Changing the behavior of programs provides flexibility and customization
to users, but at the same time it complicates the code, documentation and
usage of the program.
.\" XXX: Ref
Therefore, the number of switches should be kept small.
A small set of well-chosen switches does no harm.
But usually, the number of switches increases over time.
As early as 1985, Rose and Romine identified this as a major
problem of MH:
.[ [
rose romine real work
.], p. 12]
.QS
A complaint often heard about systems which undergo substantial development
by many people over a number of years, is that more and more options are
introduced which add little to the functionality but greatly increase the
amount of information a user needs to know in order to get useful work done.
This is usually referred to as creeping featurism.
.QP
Unfortunately MH, having undergone six years of off-and-on development by
ten or so well-meaning programmers (the present authors included),
suffers mightily from this.
.QE
.P
Being reluctant to add new switches \(en or `options',
as Rose and Romine call them \(en is one part of a countermeasure;
the other part is removing rarely used switches.
Nmh's tools had lots of switches already implemented;
hence, cleaning up by removing some of them was the more important part
of the countermeasure.
Removing existing functionality is always difficult because it
breaks programs that use these functions.
Also, for every obsolete feature, there'll always be someone who still
uses it and thus opposes its removal.
This puts the developer in a position
where sensible improvements to style are regarded as destructive acts.
Yet, living with the featurism is far worse, in my eyes, because
future needs will demand adding further features,
worsening the situation more and more.
Rose and Romine added in a footnote,
``[...]
.Pn send
will no doubt acquire an endless number of switches in the years to come.''
Although clearly humorous, the comment points to the nature of the problem.
Refusing to add any new switches would counter the problem at its root,
but this is not practical.
New needs will require new switches and it would be unwise to block
them strictly.
Nevertheless, removing obsolete switches still is an effective approach
to deal with the problem.
Working on an experimental branch without an established user base
eased my work because I did not offend users when I removed existing
functions.
.P
Rose and Romine counted 24 visible and 9 more hidden switches for
.Pn send .
In nmh, they increased to 32 visible and 12 hidden ones.
At the time of writing, no more than 4 visible switches and 1 hidden switch
have remained in mmh's
.Pn send .
These numbers include two generic switches,
.Sw -help
and
.Sw -Version .
Hidden switches are undocumented ones.
In mmh, 12 tools have hidden switches.
Nine of these switches are
.Sw -debug
switches; the other six provide special interfaces for internal use.
.P
The figure displays the number of switches for each of the tools
that is available in both nmh and mmh.
The tools are sorted by the number of switches they had in nmh.
Visible and hidden switches were counted,
but not the generic help and version switches.
Whereas at the beginning of the project the average tool had 11 switches,
it now has no more than 5 \(en only half as many.
If the `no' switches and similar inverse variants are folded onto
their counterparts, the average tool had 8 switches in pre-mmh times and
has 4 now.
The total number of functional switches in mmh dropped from 465
to 233.

.KS
.in 1c
.so input/switches.grap
.KE

.P
Some of the switches vanished after functions were removed.
This was the case for network mail transfer, for instance.
Sometimes, however, the work flow was the other way:
I looked through the
.Mp mh-chart (7)
man page to identify the tools with apparently too many switches.
Then I considered the value of each switch by examining
the tool's man page and source code, aided by literature research and testing.
This way, the removal of functions was suggested by the aim to reduce
the number of switches per command.


.U3 "Draft Folder Facility
.P
A change early in the project was the complete transition from
the single draft message to the draft folder facility.
.Ci 337338b404931f06f0db2119c9e145e8ca5a9860
.\" XXX ref to section ...
The draft folder facility was introduced in the mid-eighties, when
Rose and Romine called it a ``relatively new feature''.
.[
rose romine real work
.]
Since then, the facility had existed but was inactive by default.
The default activation and the related rework of the tools made it
possible to remove the
.Sw -[no]draftfolder
and
.Sw -draftmessage
switches from
.Pn comp ,
.Pn repl ,
.Pn forw ,
.Pn dist ,
.Pn whatnow ,
and
.Pn send .
.Ci 337338b404931f06f0db2119c9e145e8ca5a9860
The only flexibility removed with this change is having multiple
draft folders within one profile.
I consider this a theoretical problem only.
At the same time, the
.Sw -draft
switch of
.Pn anno ,
.Pn refile ,
and
.Pn send
was removed.
The special treatment of \fIthe\fP draft message became irrelevant after
the rework of the draft system.
(cf. Sec.
.Cf draft-folder )
Furthermore,
.Pn comp
no longer needs a
.Sw -file
switch as the draft folder facility together with the
.Sw -form
switch is sufficient.


.U3 "In Place Editing
.P
.Pn anno
had the switches
.Sw -[no]inplace
to either annotate the message in place and thus preserve hard links,
or annotate a copy to replace the original message, breaking hard links.
Following the assumption that linked messages should truly be the
same message, and annotating it should not break the link, the
.Sw -[no]inplace
switches were removed and the previous default
.Sw -inplace
was made the only behavior.
.Ci c8195849d2e366c569271abb0f5f60f4ebf0b4d0
The
.Sw -[no]inplace
switches of
.Pn repl ,
.Pn forw ,
and
.Pn dist
could be removed, too, as they were simply passed through to
.Pn anno .
.P
.Pn burst
also had
.Sw -[no]inplace
switches, but with different meaning.
With
.Sw -inplace ,
the digest had been replaced by the table of contents (i.e. the
introduction text) and the burst messages were placed right
after this message, renumbering all following messages.
Also, any trailing text of the digest was lost, though,
in practice, it usually consists of an end-of-digest marker only.
Nonetheless, this behavior appeared less elegant than the
.Sw -noinplace
behavior, which already had been the default.
Nmh's
.Mp burst (1)
man page reads:
.QS
If
.Sw -noinplace
is given, each digest is preserved, no table
of contents is produced, and the messages contained within
the digest are placed at the end of the folder. Other messages
are not tampered with in any way.
.QE
.LP
The decision to drop the
.Sw -inplace
behavior was supported by the code complexity and the possible data loss
it caused.
.Sw -noinplace
was chosen to be the definitive behavior.
.Ci 68a686adeb39223a5e1ad35e4a24890ec053679d


.U3 "Forms and Format Strings
.P
Historically, the tools that had
.Sw -form
switches to supply a form file had
.Sw -format
switches as well to supply the contents of a form file as a string
on the command line directly.
In consequence, the following two lines were equivalent:
.VS
scan -form scan.mailx
scan -format "`cat .../scan.mailx`"
VE
The
.Sw -format
switches were dropped in favor of extending the
.Sw -form
switches.
.Ci f51956be123db66b00138f80464d06f030dbb88d
If their argument starts with an equal sign (`='),
then the rest of the argument is taken as a format string,
otherwise the argument is treated as the name of a format file.
Thus, the following two lines are now equivalent:
.VS
scan -form scan.mailx
scan -form "=`cat .../scan.mailx`"
VE
This rework removed the prefix collision between
.Sw -form
and
.Sw -format .
Now, typing
.Sw -fo
suffices to specify form or format string.
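.P
The distinction made by the extended
.Sw -form
switch can be sketched in a few lines of shell.
This is an illustration only: mmh implements the check in C, and the
function name \fLform_kind\fP as well as the sample format string are
made up for this example.

```shell
# Hypothetical helper: classify the argument of a -form switch.
# A leading equal sign marks an inline format string; anything else
# is taken as the name of a form file.
form_kind() {
	case "$1" in
	=*) printf 'string: %s\n' "${1#=}" ;;
	*)  printf 'file: %s\n' "$1" ;;
	esac
}

form_kind scan.mailx
form_kind '=%4(msg) %{subject}'
```

The first call reports a file name, the second a format string with the
leading equal sign stripped.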
.P
The different meaning of
.Sw -format
for
.Pn repl
and
.Pn forw
was removed in mmh.
.Pn forw
was completely switched to MIME-type forwarding, thus removing the
.Sw -[no]format .
.Ci 6e271608b7b9c23771523f88d23a4d3593010cf1
For
.Pn repl ,
the
.Sw -[no]format
switches were reworked to
.Sw -[no]filter
switches.
.Ci 67411b1f95d6ec987b4c732459e1ba8a8ac192c6
The
.Sw -format
switches of
.Pn send
and
.Pn post ,
which had a third meaning,
were removed likewise.
.Ci f3cb7cde0e6f10451b6848678d95860d512224b9
Eventually, the ambiguity of the
.Sw -format
switches was resolved by no longer having any such switch in mmh.


.U3 "MIME Tools
.P
The MIME tools, which were once part of
.Pn mhn
.\" XXX
(whatever that stood for),
had several switches that added little practical value to the programs.
The
.Sw -[no]realsize
switches of
.Pn mhbuild
and
.Pn mhlist
were removed; real size calculations are always done now
.Ci 8d8f1c3abc586c005c904e52c4adbfe694d2201c ,
as nmh's
.Mp mhbuild (1)
man page states
``This provides an accurate count at the expense of a small delay.''
This small delay is not noticeable on modern systems.
.P
The
.Sw -[no]check
switches were removed together with the support for
.Hd Content-MD5
header fields.
.[
rfc 1864
.]
.Ci 31dc797eb5178970d68962ca8939da3fd9a8efda
(cf. Sec.
.Cf content-md5 )
.P
The
.Sw -[no]ebcdicsafe
and
.Sw -[no]rfc934mode
switches of
.Pn mhbuild
were removed because they are considered obsolete.
.Ci 01a3480928da485b4d6109d36d751dfa71799d58
.Ci 3363e2624dce0eb8164cf8b3f1ab385c8ff72e88
.P
Content caching of external MIME parts, activated with the
.Sw -rcache
and
.Sw -wcache
switches was completely removed.
.Ci d1fefd9f614e4dc3cda16da6c69133c1b2005269
External MIME parts are rare today; having a caching facility
for them appears to be unnecessary.
.P
In pre-MIME times,
.Pn mhl
had covered many tasks that are part of MIME handling today.
Therefore,
.Pn mhl
could be simplified to a large extent, reducing the number of its
switches from 21 to 6.
.Ci 350ad6d3542a07639213cf2a4fe524e829c1e7b6
.Ci 0e46503be3c855bddaeae3843e1b659279c35d70




.U3 "Header Printing
.P
.Pn folder 's
data output is self-explanatory enough that
displaying the header line makes little sense.
Hence, the
.Sw -[no]header
switch was removed and headers are never printed.
.Ci 601cc73d1fa05ce96faa728f036d6c51b91701c7
.P
In
.Pn mhlist ,
the
.Sw -[no]header
switches were removed, too.
.Ci b24f96523aaf60e44e04a3ffb1d22e69a13a602f
But in this case headers are always printed,
because the output is not self-explanatory.
.P
.Pn scan
also had
.Sw -[no]header
switches.
Printing the header had been sensible until the introduction of
format strings made it impossible to display the column headings.
Only the folder name and the current date remained to be printed.
As this information can easily be retrieved with
.Pn folder
and
.Pn date ,
the switches were removed.
.Ci c477dc5d1d03fa6d9a8ab3dd3508c63cbddc044e
.P
By removing all
.Sw -header
switches, the collision with
.Sw -help
on the first two letters was resolved.
Currently,
.Sw -h
evaluates to
.Sw -help
for all tools of mmh.
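.P
The effect of prefix collisions can be illustrated with a small shell
sketch of abbreviation matching.
It is not mmh's switch parser (which is written in C); the function
\fLresolve_switch\fP is hypothetical, and letting an exact match win over
longer switches is an assumption of this sketch.

```shell
# Hypothetical helper: resolve an abbreviated switch against the
# switches a tool knows.  Prints the full switch if the abbreviation
# is a prefix of exactly one of them; fails if it is ambiguous or
# unknown.  An exact match always wins (assumption of this sketch).
resolve_switch() {
	abbrev=$1
	shift
	match=
	count=0
	for sw in "$@"; do
		if [ "x$sw" = "x$abbrev" ]; then
			printf '%s\n' "$sw"
			return 0
		fi
		case "$sw" in
		"$abbrev"*)
			match=$sw
			count=$((count + 1))
			;;
		esac
	done
	[ "$count" -eq 1 ] && printf '%s\n' "$match"
}

# With a -header switch present, -h is ambiguous.
resolve_switch -h -header -help -Version || echo ambiguous
# Without it, -h selects -help, and -V selects -Version.
resolve_switch -h -help -Version -verbose
resolve_switch -V -help -Version -verbose
```

The fewer switches share a prefix, the shorter the abbreviation the user
has to type.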


.U3 "Suppressing Edits or the Invocation of the WhatNow Shell
.P
The
.Sw -noedit
switch of
.Pn comp ,
.Pn repl ,
.Pn forw ,
.Pn dist ,
and
.Pn whatnow
was removed, but it can now be replaced by specifying
.Sw -editor
with an empty argument.
.Ci 75fca31a5b9d5c1a99c74ab14c94438d8852fba9
(Specifying
.Cl "-editor /bin/true
is nearly the same, only differing by the previous editor being set.)
.P
The more important change is the removal of the
.Sw -nowhatnowproc
switch.
.Ci ee4f43cf2ef0084ec698e4e87159a94c01940622
This switch had introduced an awkward behavior, as explained in nmh's
man page for
.Mp comp (1):
.QS
The
.Sw -editor
.Ar editor
switch indicates the editor to use for
the initial edit. Upon exiting from the editor,
.Pn comp
will invoke the
.Pn whatnow
program. See
.Mp whatnow (1)
for a discussion of available options.
The invocation of this program can be
inhibited by using the
.Sw -nowhatnowproc
switch. (In truth of fact, it is the
.Pn whatnow
program which starts the initial edit.
Hence,
.Sw -nowhatnowproc
will prevent any edit from occurring.)
.QE
.P
Effectively, the
.Sw -nowhatnowproc
switch creates only a draft message.
As
.Cl "-whatnowproc /bin/true
causes the same behavior, the
.Sw -nowhatnowproc
switch was removed for being redundant.
Likely, the
.Sw -nowhatnowproc
switch was intended to be used by front-ends.



.U3 "Various
.BU
With the removal of MMDF maildrop format support,
.Pn packf
and
.Pn rcvpack
no longer needed their
.Sw -mbox
and
.Sw -mmdf
switches.
.Sw -mbox
is the sole behavior now.
.Ci 3916ab66ad5d183705ac12357621ea8661afd3c0
Further rework in both tools made the
.Sw -file
switch unnecessary.
.Ci ca1023716d4c2ab890696f3e41fa0d94267a940e

.BU
Mmh's tools will no longer clear the screen (\c
.Pn scan 's
and
.Pn mhl 's
.Sw -[no]clear
switches
.Ci e57b17343dcb3ff373ef4dd089fbe778f0c7c270
.Ci 943765e7ac5693ae177fd8d2b5a2440e53ce816e ).
Neither will
.Pn mhl
ring the bell (\c
.Sw -[no]bell
.Ci e11983f44e59d8de236affa5b0d0d3067c192e24 )
nor page the output itself (\c
.Sw -length
.Ci 5b9d883db0318ed2b84bb82dee880d7381f99188 ).
.\" XXX Ref
Generally, the pager to use is no longer specified with the
.Sw -[no]moreproc
command line switches for
.Pn mhl
and
.Pn show /\c
.Pn mhshow .
.Ci 39e87a75b5c2d3572ec72e717720b44af291e88a

.BU
In order to avoid prefix collisions among switch names, the
.Sw -version
switch was renamed to
.Sw -Version
(with capital `V').
.Ci 32b2354dbaf4bf934936eb5b102a4a3d2fdd209a
Every program has the
.Sw -version
switch but its first three letters collided with the
.Sw -verbose
switch, present in many programs.
The rename solved this problem once and for all.
Although this rename breaks a basic interface, having the
.Sw -V
abbreviation to display the version information is not too bad.

.BU
.Sw -[no]preserve
of
.Pn refile
was removed
.Ci 8edc5aaf86f9f77124664f6801bc6c6cdf258173
because what use was it anyway?
Quoting nmh's man page
.Mp refile (1):
.QS
Normally when a message is refiled, for each destination
folder it is assigned the number which is one above the current
highest message number in that folder. Use of the
.Sw -preserv
[sic!] switch will override this message renaming, and try
to preserve the number of the message. If a conflict for a
particular folder occurs when using the
.Sw -preserve
switch, then
.Pn refile
will use the next available message number which
is above the message number you wish to preserve.
.QE

.BU
The removal of the
.Sw -[no]reverse
switches of
.Pn scan
.Ci 8edc5aaf86f9f77124664f6801bc6c6cdf258173
is a bug fix, supported by the comments
``\-[no]reverse under #ifdef BERK (I really HATE this)''
by Rose and
``Lists messages in reverse order with the `\-reverse' switch.
This should be considered a bug.'' by Romine in the documentation.
.\" XXX Ref: welche datei genau.
The question remains why neither Rose nor Romine fixed this
bug in the eighties when they wrote these comments, and why no one has
done so since.


.ig

forw: [no]dashstuffing(mhl)

mhshow: [no]pause [no]serialonly

mhmail: resent queued
inc: snoop, (pop)

mhl: [no]faceproc folder sleep
	[no]dashstuffing(forw) digest list volume number issue number

prompter: [no]doteof

refile: [no]preserve [no]unlink [no]rmmproc

send: [no]forward [no]mime [no]msgid
	[no]push split [no]unique (sasl) width snoop [no]dashstuffing
	attach attachformat
whatnow: (noedit) attach

slocal: [no]suppressdups

spost: [no]filter [no]backup width [no]push idanno
	[no]check(whom) whom(whom)

whom: ???

..


.ig

.P
In the best case, all switches are unambiguous on the first character,
or on the three-letter prefix for the `no' variants.
Reducing switch prefix collisions, shortens the necessary prefix length
the user must type.
Having less switches helps best.

..


.\" XXX: whatnow prompt commands




.\" --------------------------------------------------------------
.H1 "Modernizing
.P
In the more than thirty years of MH's existence, its code base was
increasingly extended.
New features entered the project and became alternatives to the
existing behavior.
Relicts from several decades have gathered in the code base,
but obsolete features were seldom dropped.
This section describes the removal of old code
and the modernizing of the default setup.
It focuses on the functional aspect only;
the non-functional aspects of code style are discussed in Sec.
.Cf code-style .


.H2 "Code Relicts
.P
My position regarding the removal of obsolete functions of mmh
.\" XXX ``in order to remove old code,''
is much more revolutionary than the nmh community appreciates.
Working on an experimental version, I was quickly able to drop
functionality I considered ancient.
The need for consensus with peers would have slowed this process down.
Without the need to justify my decisions, I was able to rush forward.
In December 2011, Paul Vixie motivated the nmh developers to just
.\" XXX ugs
do the work:
.[
paul vixie edginess nmh-workers
.]
.QS
let's stop walking on egg shells with this code base. there's no need to
discuss whether to keep using vfork, just note in [sic!] passing, [...]
we don't need a separate branch for removing vmh
or ridding ourselves of #ifdef's or removing posix replacement functions
or depending on pure ansi/posix ``libc''.
.QP
these things should each be a day or two of work and the ``main branch''
should just be modern. [...]
let's push forward, aggressively.
.QE
.LP
I did so already in the months before.
I pushed forward.
.\" XXX semicolon ?
I simply dropped the cruft.
.P
The decision to drop a feature was based on literature research and
careful thinking, but whether I had ever come into contact with the
particular feature in my own computing life served as a rule of thumb.
I explained my reasons in the commit messages
in the version control system.
Hence, others can comprehend my view and argue for undoing the change
if I have missed an important aspect.
I was quick in dropping parts.
I would rather re-include wrongly dropped parts than proceed at a slower pace.
Mmh is experimental work; it requires tough decisions.
.\" XXX ``exp. work'' schon oft gesagt


.U3 "Forking
.P
Being a tool chest, MH creates many processes.
In earlier times
.Fu fork()
had been an expensive system call, because the process's image needed
to be completely duplicated at once.
This expensive work was especially unnecessary in the commonly occurring
case wherein the image is replaced by a call to
.Fu exec()
right after having forked the child process.
The
.Fu vfork()
system call was invented to speed up this particular case.
It completely omits the duplication of the image.
On old systems this resulted in significant speedups.
Therefore MH used
.Fu vfork()
whenever possible.
.P
Modern memory management units support copy-on-write semantics, which make
.Fu fork()
almost as fast as
.Fu vfork() .
The man page of
.Mp vfork (2)
in FreeBSD 8.0 states:
.QS
This system call will be eliminated when proper system sharing mechanisms
are implemented. Users should not depend on the memory sharing semantics
of vfork() as it will, in that case, be made synonymous to fork(2).
.QE
.LP
Vixie supports the removal with the note that ``the last
system on which fork was so slow that an mh user would notice it, was
Eunice. that was 1987''.
.[
nmh-workers vixie edginess
.]
I replaced all calls to
.Fu vfork()
with calls to
.Fu fork() .
.Ci 40821f5c1316e9205a08375e7075909cc9968e7d
.P
Related to the costs of
.Fu fork()
is the probability of its success.
In the eighties, on heavily loaded systems, calls to
.Fu fork()
were prone to failure.
Hence, many of the
.Fu fork()
calls in the code were wrapped into loops to retry the
.Fu fork()
several times, to increase the chances of eventual success.
On modern systems, a failing
.Fu fork()
call is unusual.
Hence, in the rare case when
.Fu fork()
fails, mmh programs simply abort.
.Ci 5fbf37ee68e018998ada61eeab73e035b26834b6


.U3 "Header Fields
.BU
The
.Hd Encrypted
header field was introduced by RFC\|822,
but already marked as legacy in RFC\|2822.
Today, OpenPGP provides the basis for standardized exchange of encrypted
messages [RFC\|4880, RFC\|3156].
Hence, the support for
.Hd Encrypted
header fields is removed in mmh.
.Ci 064527f7b57ab050e5af13e15ad99aeeab125857
.BU
The native support for
.Hd Face
header fields has been removed, as well.
.Ci 8e5be81f784682822f5e868c1bf3c8624682bd23
This feature is similar to the
.Hd X-Face
header field in its intent,
but takes a different approach to store the image.
Instead of encoding the image data directly into the header field,
it contains the hostname and UDP port where the image
data can be retrieved.
There is even a third Face system,
which is the successor of
.Hd X-Face ,
although it re-uses the
.Hd Face
header field.
It was invented in 2005 and supports colored PNG images.
None of the Face systems described here is popular today.
Hence, mmh has no direct support for them.
.BU
.Id content-md5
The
.Hd Content-MD5
header field was introduced by RFC\|1864.
It provides detection of data corruption during the transfer.
But it cannot ensure verbatim end-to-end delivery of the contents
[RFC\|1864].
The proper approach to verify content integrity in an
end-to-end relationship is the use of digital signatures.
.\" XXX (RFCs FIXME).
On the other hand, transfer protocols should detect corruption during
the transmission.
TCP includes a checksum field for this purpose.
These two approaches in combination render the
.Hd Content-MD5
header field superfluous.
Not a single one out of 4\|200 messages from two decades
in an nmh-workers mailing list archive contains a
.Hd Content-MD5
header field.
Neither did any of the 60\|000 messages in my personal mail storage.
Removing the support for this header field
removed the last place where MD5 computation was needed.
.Ci 31dc797eb5178970d68962ca8939da3fd9a8efda
Hence, the MD5 code could be removed as well.
Over 500 lines of code vanished with this one change.


.U3 "MMDF maildrop support
.P
This type of format is conceptually similar to the mbox format,
but uses a different message delimiter (`\fL\\1\\1\\1\\1\fP',
commonly written as `\fL^A^A^A^A\fP', instead of `\fLFrom\0\fP').
Mbox is the de-facto standard maildrop format on Unix,
whereas the MMDF maildrop format is now forgotten.
By dropping the MMDF maildrop format support,
mbox became the only packed mailbox format supported in mmh.
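.P
The difference between the two delimiters is easy to see in a small
shell sketch that guesses the format of a packed mailbox from its
first line.
The function \fLmaildrop_format\fP is hypothetical and not part of mmh;
it compares a hex dump of the first line so that the control characters
do not have to pass through shell variables.

```shell
# Hypothetical helper: guess the packed mailbox format of a file
# from its first line.
maildrop_format() {
	# Hex dump of the first line, without offsets and whitespace.
	first=$(head -n 1 "$1" | od -A n -t x1 | tr -d ' \n')
	case "$first" in
	010101010a)  echo mmdf ;;	# four ^A characters and a newline
	46726f6d20*) echo mbox ;;	# the line starts with "From "
	*)           echo unknown ;;
	esac
}

tmp=${TMPDIR:-/tmp}/maildrop.$$
printf 'From alice Wed Jul 11 15:53:53 2012\nSubject: hi\n\nbody\n' > "$tmp"
maildrop_format "$tmp"
printf '\001\001\001\001\nSubject: hi\n\nbody\n\001\001\001\001\n' > "$tmp"
maildrop_format "$tmp"
rm -f "$tmp"
```

With only mbox left, such a distinction is no longer needed anywhere in mmh.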
.P
The simplifications within the code were moderate.
Mainly, the reading and writing of MMDF mailbox files was removed.
But also, switches of
.Pn packf
and
.Pn rcvpack
could be removed.
.Ci 3916ab66ad5d183705ac12357621ea8661afd3c0
In the message parsing function
.Fn sbr/m_getfld.c ,
knowledge of MMDF packed mail boxes was removed.
.Ci 684ec30d81e1223a282764452f4902ed4ad1c754
Further code structure simplifications may be possible there,
because only one single packed mailbox format is left to be supported.
I have not worked on them yet because
.Fu m_getfld()
is heavily optimized and thus dangerous to touch.
The risk of damaging the intricate workings of the optimized code is
too high.


.U3 "Prompter's Control Keys
.P
The program
.Pn prompter
queries the user to fill in a message form.
When used by
.Pn comp
as
.Cl "comp -editor prompter" ,
the resulting behavior is similar to
.Pn mailx .
Apparently,
.Pn prompter
had not been touched lately.
Otherwise it is hard to explain why it
still offered the switches
.Sw -erase
.Ar chr
and
.Sw -kill
.Ar chr
to name the characters for command line editing.
The times when this had been necessary are long gone.
Today these things work out-of-the-box, and if not, are configured
with the standard tool
.Pn stty .
The switches are removed now
.Ci 0bd9750710cdbab80cfb4036dd87af20afe1552f .


.U3 "Hardcopy Terminal Support
.P
More of a funny anecdote is a check for being connected to a
hardcopy terminal.
It remained in the code until spring 2012, when I finally removed it
.Ci b7764c4a6b71d37918a97594d866258f154017ca .
.P
The check only prevented a pager from being placed between the printing
program (\c
.Pn mhl )
and the terminal.
In nmh, this could have been ensured statically with the
.Sw -nomoreproc
switch on the command line, too.
In mmh, setting the profile entry
.Pe Pager
or the environment variable
.Ev PAGER
to
.Pn cat
is sufficient.




.H2 "Attachments
.P
The mental model of email attachments is unrelated to MIME.
Although the MIME RFCs (2045 through 2049) define the technical
requirements for having attachments, they do not mention the word
attachment.
Instead of attachments, MIME talks about ``multi-part message bodies''
[RFC\|2045], a more general concept.
Multi-part messages are messages
``in which one or more different
sets of data are combined in a single body''
[RFC\|2046].
MIME keeps its descriptions generic;
it does not imply specific usage models.
One usage model became prevalent: attachments.
The idea is having a main text document with files of arbitrary kind
attached to it.
In MIME terms, this is a multi-part message having a text part first
and parts of arbitrary type following.
.P
MH's MIME support is a direct implementation of the RFCs.
The perception of the topic described in the RFCs is clearly visible
in MH's implementation.
.\" XXX rewrite ``no idea''.
As a result,
MH had all the MIME features but no idea of attachments.
But users do not need all the MIME features,
they want convenient attachment handling.


.U3 "Composing MIME Messages
.P
In order to improve the situation on the message composing side,
Jon Steinhart had added an attachment system to nmh in 2002.
.Ci 7480dbc14bc90f2d872d434205c0784704213252
In the file
.Fn docs/README-ATTACHMENTS ,
he described his motivation to do so as follows:
.QS
Although nmh contains the necessary functionality for MIME message
handing [sic!], the interface to this functionality is pretty obtuse.
There's no way that I'm ever going to convince my partner to write
.Pn mhbuild
composition files!
.QE
.LP
With this change, the mental model of attachments entered nmh.
In the same document:
.QS
These changes simplify the task of managing attachments on draft files.
They allow attachments to be added, listed, and deleted.
MIME messages are automatically created when drafts with attachments
are sent.
.QE
.LP
Unfortunately, the attachment system,
like any new facility in nmh,
was inactive by default.
.P
During my work in Argentina, I tried to improve the attachment system.
But, because of great opposition in the nmh community,
my patch died as a proposal on the mailing list, after long discussions.
.[
nmh-workers attachment proposal
.]
In January 2012, I extended the patch and applied it to mmh.
.Ci 8ff284ff9167eff8f5349481529332d59ed913b1
In mmh, the attachment system is active by default.
Instead of command line switches, the
.Pe Attachment-Header
profile entry is used to specify
the name of the attachment header field.
It is pre-defined to
.Hd Attach .
.P
To add an attachment to a draft, a header line needs to be added:
.VS
To: bob
Subject: The file you wanted
Attach: /path/to/the/file-bob-wanted
--------
Here it is.
VE
The header field can be added to the draft manually in the editor,
or by using the `attach' command at the WhatNow prompt, or
non-interactively with
.Pn anno :
.VS
anno -append -nodate -component Attach -text /path/to/attachment
VE
Drafts with attachment headers are converted to MIME automatically by
.Pn send .
The conversion to MIME is invisible to the user.
The draft stored in the draft folder is always in source form with
attachment headers.
If the MIMEification fails (e.g. because the file to attach
is not accessible) the original draft is not changed.
.P
The attachment system handles the forwarding of messages, too.
If the attachment header value starts with a plus character (`\fL+\fP'),
like in
.Cl "Attach: +bob 30 42" ,
the given messages in the specified folder will be attached.
This made it possible to simplify
.Pn forw .
.Ci f41f04cf4ceca7355232cf7413e59afafccc9550
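.P
How the two kinds of attachment header values differ can be sketched
in shell.
The function \fLlist_attachments\fP is hypothetical and not part of mmh;
it assumes the default header name
.Hd Attach ,
unfolded header lines, and a draft on standard input.

```shell
# Hypothetical helper: list the attachment header values of a draft
# and tell message references (+folder msgs) from plain file names.
list_attachments() {
	while IFS= read -r line; do
		case "$line" in
		""|--*)
			break ;;	# blank or dashed line: the header ends here
		Attach:*)
			value=${line#Attach:}
			# Strip the leading blanks after the colon.
			value=${value#"${value%%[! ]*}"}
			case "$value" in
			+*) printf 'messages: %s\n' "$value" ;;
			*)  printf 'file: %s\n' "$value" ;;
			esac
			;;
		esac
	done
}

list_attachments <<'EOF'
To: bob
Subject: The file you wanted
Attach: /path/to/the/file-bob-wanted
Attach: +bob 30 42
--------
Here it is.
EOF
```

The first header is reported as a file, the second as a reference to
messages 30 and 42 in the folder +bob.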
.P
Closely related to attachments is non-ASCII text content,
because it requires MIME too.
In nmh, the user needed to call `mime' at the WhatNow prompt
to have the draft converted to MIME.
This was necessary whenever the draft contained non-ASCII characters.
If the user did not call `mime', a broken message would be sent.
Therefore, the
.Pe automimeproc
profile entry could be specified to have the `mime' command invoked
automatically each time.
Unfortunately, this approach conflicted with the attachment system
because the draft would already be in MIME format at the time
when the attachment system wanted to MIMEify it.
To use nmh's attachment system, `mime' must not be called at the
WhatNow prompt and
.Pe automimeproc
must not be set in the profile.
But then the case of non-ASCII text without attachment headers was
not caught.
All in all, the solution was complex and irritating.
My patch from December 2010
.[
nmh-workers attachment proposal
.]
would have simplified the situation.
.P
Mmh's current solution goes even further.
Any necessary MIMEification is done automatically.
There is no `mime' command at the WhatNow prompt anymore.
The draft will be converted automatically to MIME when either an
attachment header or non-ASCII text is present.
Furthermore, the hash character (`\fL#\fP') is not special any more
at line beginnings in the draft message.
.\" XXX REF ?
Users need not concern themselves with the whole topic at all.
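.P
The decision rule can be written down as a short shell sketch.
It only mirrors the two conditions named above and is not the code used
by
.Pn send ;
the function \fLneeds_mime\fP is hypothetical, and the default header name
.Hd Attach
is assumed.

```shell
# Hypothetical check: does a draft need to be converted to MIME?
# True if it has an attachment header or contains non-ASCII bytes.
needs_mime() {
	draft=$1
	# Look for an attachment header in the header part of the draft
	# (everything up to the first blank or dashed line).
	if sed -e '/^$/q' -e '/^--/q' "$draft" | grep -q '^Attach:'; then
		return 0
	fi
	# Count the bytes outside the 7-bit ASCII range in the whole draft.
	nonascii=$(LC_ALL=C tr -d '\000-\177' < "$draft" | wc -c | tr -d ' ')
	[ "$nonascii" -gt 0 ]
}

tmp=${TMPDIR:-/tmp}/draft.$$
printf 'To: bob\nSubject: hello\n--------\nplain text\n' > "$tmp"
needs_mime "$tmp" || echo "no MIME needed"
# The body now holds UTF-8 encoded umlauts.
printf 'To: bob\nSubject: hello\n--------\nGr\303\274\303\237e\n' > "$tmp"
needs_mime "$tmp" && echo "MIME needed"
rm -f "$tmp"
```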
.P
Although the new approach no longer supports arbitrary MIME
compositions directly, the full power of
.Pn mhbuild
can still be accessed.
Provided that no attachment headers are included, the user can create
.Pn mhbuild
composition drafts like in nmh.
Then, at the WhatNow prompt, he needs to invoke
.Cl "edit mhbuild
to convert it to MIME.
Because the resulting draft contains neither non-ASCII characters
nor attachment headers, the attachment system will not touch it.
.P
The approach taken in mmh is tailored towards today's most common case:
a text part, possibly with attachments.
This case was simplified.


.U3 "MIME Type Guessing
.P
From the programmer's point of view, the use of
.Pn mhbuild
composition drafts had one notable advantage over attachment headers:
The user provides the appropriate MIME types for files to include.
The attachment system needs to find out the correct MIME type itself.
This is a difficult task, yet it spares the user irritating work.
Determining the correct MIME type of content is partly mechanical,
partly intelligent work.
Forcing the user to find out the correct MIME type
forces him to do partly mechanical work.
Letting the computer do the work can lead to bad choices for difficult
content.
For mmh, the latter option was chosen.
.P
Determining the MIME type by the suffix of the file name is a dumb
approach, yet it is simple to implement and provides good results
for the common cases.
Mmh implements this approach in the
.Pn print-mimetype
script.
.Ci 4b5944268ea0da7bb30598a27857304758ea9b44
Using it is the default choice.
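.P
The suffix lookup can be illustrated with a few lines of shell.
The following is only a minimal sketch, not the actual
.Pn print-mimetype
script; the function name
.Fu guess_mimetype()
and the short suffix table are assumptions made for this example.
.VS
# Sketch only: print a MIME type for the file name $1.
guess_mimetype() {
	# Map the file name suffix to a MIME type (illustrative table).
	case "$1" in
	*.txt) echo text/plain ;;
	*.html|*.htm) echo text/html ;;
	*.pdf) echo application/pdf ;;
	*.png) echo image/png ;;
	*.jpg|*.jpeg) echo image/jpeg ;;
	# Unknown suffix: use the generic fall-back type.
	*) echo application/octet-stream ;;
	esac
}
VE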
.P
A far better, though less portable, approach is the use of
.Pn file .
This standard tool tries to determine the type of files.
Unfortunately, its capabilities and accuracy vary from system to system.
Additionally, its output was intended for human readers,
not for use by programs, and its format varies considerably.
Nevertheless, modern versions of GNU
.Pn file ,
which is prevalent on the popular GNU/Linux systems,
provide MIME type output in machine-readable form.
Although this solution is highly system-dependent,
it solves the difficult problem well.
On systems where GNU
.Pn file ,
version 5.04 or higher, is available it should be used.
One needs to specify the following profile entry to do so:
.Ci 3baec236a39c5c89a9bda8dbd988d643a21decc6
.VS
Mime-Type-Query: file -b --mime
VE
.LP
Other versions of
.Pn file
might possibly be usable with wrapper scripts to reformat the output.
The diversity among
.Pn file
implementations is great; one needs to check the local variant.
.P
If no MIME type can be determined, text content gets sent as
`text/plain' and anything else under the generic fall-back type
`application/octet-stream'.
It is not possible in mmh to override the automatic MIME type guessing
for a specific file.
To do so, either the user would need to know in advance for which file
the automatic guessing fails, or the system would require interaction.
I consider both cases impractical.
The existing solution should be sufficient.
If not, the user may always fall back to
.Pn mhbuild
composition drafts and ignore the attachment system.


.U3 "Storing Attachments
.P
Extracting MIME parts of a message and storing them to disk is performed by
.Pn mhstore .
The program has two operation modes,
.Sw -auto
and
.Sw -noauto .
With the former one, each part is stored under the filename given in the
MIME part's meta information, if available.
This naming information is usually available for modern attachments.
If no filename is available, this MIME part is stored as if
.Sw -noauto
had been specified.
In the
.Sw -noauto
mode, the parts are processed according to rules, defined by
.Pe mhstore-store-*
profile entries.
These rules define generic filename templates for storing
or commands to post-process the contents in arbitrary ways.
If no matching rule is available the part is stored under a generic
filename, built from message number, MIME part number, and MIME type.
.P
The
.Sw -noauto
mode had been the default in nmh because it was considered safe,
in contrast to the
.Sw -auto
mode.
In mmh,
.Sw -auto
is not dangerous anymore.
Two changes were necessary:
.LI 1
Any directory path is removed from the proposed filename.
Thus, the files are always stored in the expected directory.
.Ci 41b6eadbcecf63c9a66aa5e582011987494abefb
.LI 2
Tar files are not extracted automatically any more.
Thus, the rest of the file system will not be touched.
.Ci 94c80042eae3383c812d9552089953f9846b1bb6
.LP
Now, the outcome of mmh's
.Cl "mhstore -auto
can be foreseen from the output of
.Cl "mhlist -verbose" .
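.P
The effect of the first change can be illustrated with a shell sketch.
Mmh performs the equivalent step inside
.Pn mhstore ;
the function name
.Fu sanitize_filename()
and the rejection of empty or dot-only names are assumptions of this example.
.VS
# Sketch only: print the proposed file name $1 without any directory part.
sanitize_filename() {
	# Drop everything up to and including the last slash.
	name=${1##*/}
	# Assumption: an empty name or a dot entry is not usable.
	case "$name" in
	""|.|..) return 1 ;;
	esac
	printf '%s\n' "$name"
}
VE
.LP
A proposed name such as
.Fn ../../etc/passwd
is thereby reduced to
.Fn passwd ,
so the file ends up in the storage directory.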
.P
The
.Sw -noauto
mode is seen to be more powerful but less convenient.
On the other hand,
.Sw -auto
is safe now and
storing attachments under their original name is intuitive.
Hence,
.Sw -auto
serves better as the default option.
.Ci 3410b680416c49a7617491af38bc1929855a331d
.P
Files are stored into the directory given by the
.Pe Nmh-Storage
profile entry, if set, or
into the current working directory, otherwise.
Storing to different directories is only possible with
.Pe mhstore-store-*
profile entries.
.P
Still, in both modes, existing files get overwritten silently.
This can be considered a bug.
Yet every alternative behavior has its drawbacks, too.
Refusing to replace files requires adding a
.Sw -force
option.
Users will likely need to invoke
.Pn mhstore
a second time with
.Sw -force .
Ultimately, only the user can decide in the specific case.
This requires interaction, which I prefer to avoid if possible.
Appending a unique suffix to the filename is another bad option.
For now, the behavior remains as it is.
.P
In mmh, only MIME parts of type message are special in
.Pn mhstore 's
.Sw -auto
mode.
Instead of storing message/rfc822 parts as files to disk,
they are stored as messages into the current mail folder.
The same applies to message/partial, although the parts are
automatically reassembled beforehand.
MIME parts of type message/external-body are not automatically retrieved
anymore.
Instead, information on how to retrieve them is output.
Not supporting this rare case saved nearly one thousand lines of code.
.Ci 55e1d8c654ee0f7c45b9361ce34617983b454c32
.\" XXX mention somewhere else too: (The profile entry `nmh-access-ftp'
.\"     and sbr/ruserpass.c for reading ~/.netrc are gone now.)
`application/octet-stream; type=tar' is not special anymore.
Automatically extracting such MIME parts had been the dangerous part
of the
.Sw -auto
mode.
.Ci 94c80042eae3383c812d9552089953f9846b1bb6



.U3 "Showing MIME Messages
.P
The program
.Pn mhshow
had been written to display MIME messages.
It implemented the conceptual view of the MIME RFCs.
Nmh's
.Pn mhshow
handled each MIME part independently, presenting them separately
to the user.
This does not match today's understanding of email attachments,
where displaying a message is seen to be a single, integrated operation.
Today, email messages are expected to consist of a main text part
plus possibly attachments.
They are no longer seen as arbitrary MIME hierarchies with
information on how to display the individual parts.
I adjusted
.Pn mhshow 's
behavior to the modern view on the topic.
.P
One should note that this section completely ignores the original
.Pn show
program, because it was not capable of displaying MIME messages
and is no longer part of mmh.
.\" XXX ref to other section
Although
.Pn mhshow
was renamed to
.Pn show
in mmh, this section uses the name
.Pn mhshow ,
in order to avoid confusion.
.P
In mmh, the basic idea is that
.Pn mhshow
should display a message in one single pager session.
Therefore,
.Pn mhshow
invokes a pager session for all its output,
whenever it prints to a terminal.
.Ci a4197ea6ffc5c1550e8b52d5a654bcaaaee04a4e
In consequence,
.Pn mhl
no longer invokes a pager.
.Ci 0e46503be3c855bddaeae3843e1b659279c35d70
With
.Pn mhshow
replacing the original
.Pn show ,
output from
.Pn mhl
does not go to the terminal directly, but through
.Pn mhshow .
Hence,
.Pn mhl
does not need to invoke a pager.
The one and only job of
.Pn mhl
is to format messages or parts of them.
The only place in mmh where a pager is invoked is
.Pn mhshow .
.P
.Pe mhshow-show-*
profile entries can be used to display MIME parts in a specific way.
For instance, PDF and Postscript files could be converted to plain text
to display them in the terminal.
In mmh, MIME parts will always be displayed serially.
The request to display the MIME type `multipart/parallel' in parallel
is ignored.
It is simply treated as `multipart/mixed'.
.Ci d0581ba306a7299113a346f9b4c46ce97bc4cef6
This could already be requested with the, now removed,
.Sw -serialonly
switch of
.Pn mhshow .
As MIME parts are always processed exclusively, i.e. serially,
the `%e' escape in
.Pe mhshow-show-*
profile entries became useless and was thus removed.
.Ci a20d405db09b7ccca74d3e8c57550883da49e1ae
.P
In the intended setup, only text content would be displayed.
Non-text content would be converted to text by appropriate
.Pe mhshow-show-*
profile entries beforehand, if possible and wanted.
All output would be displayed in a single pager session.
Other kinds of attachments are ignored.
With
.Pe mhshow-show-*
profile entries for them, they can be displayed serially along with
the message.
For parallel display, the attachments need to be stored to disk first.
.P
To display text content in foreign charsets, it needs to be converted
to the native charset.
For this purpose,
.Pe mhshow-charset-*
profile entries used to be needed.
In mmh, the conversion is performed automatically by piping the
text through the
.Pn iconv
command, if necessary.
.Ci 2433122c20baccb10b70b49c04c6b0497b5b3b60
Custom
.Pe mhshow-show-*
rules for textual content might need a
.Cl "iconv -f %c %f |
prefix to have the text converted to the native charset.
.P
Although the conversion of foreign charsets to the native one
has improved, it is not consistent enough.
Further work needs to be done and
the basic concepts in this field need to be re-thought.
Nevertheless, the default setup of mmh displays messages in foreign charsets
correctly without the need to configure anything.


.ig

.P
mhshow/mhstore: Removed support for retrieving message/external-body parts.
These tools will not download the contents automatically anymore. Instead,
they print the information needed to get the contents. If someone should
really receive one of those rare message/external-body messages, he can
do the job manually. We save nearly a thousand lines of code. That's worth
it!
(The profile entry `nmh-access-ftp' and sbr/ruserpass.c for reading
~/.netrc are gone now.)
.Ci 55e1d8c654ee0f7c45b9361ce34617983b454c32

..



.H2 "Signing and Encrypting
.P
Nmh offers no direct support for digital signatures and message encryption.
This functionality needed to be added through third-party software.
In mmh, the functionality should be included because it
is a part of modern email and likely wanted by users of mmh.
A fresh mmh installation should support signing and encrypting
out-of-the-box.
Therefore, Neil Rickert's
.Pn mhsign
and
.Pn mhpgp
scripts
.[
neil rickert mhsign mhpgp
.]
were included into mmh
.Ci f45cdc98117a84f071759462c7ae212f4bc5ab2e
.Ci 58cf09aa36e9f7f352a127158bbf1c5678bc6ed8 .
The scripts fit well because they are lightweight and
similar in style to the existing tools.
Additionally, no licensing difficulties appeared,
as they are in the public domain.
.P
.Pn mhsign
handles the signing and encrypting part.
It comprises about 250 lines of shell code and interfaces between
.Pn gnupg
and
the MH system.
It was meant to be invoked manually at the WhatNow prompt, but in mmh,
.Pn send
invokes
.Pn mhsign
automatically
.Ci c7b5e1df086bcc37ff40163ee67571f076cf6683 .
Special header fields were introduced to request this action.
If a draft contains the
.Hd Sign
header field,
.Pn send
will initiate the signing.
The signing key is either chosen automatically or specified by the
.Pe Pgpkey
profile entry.
.Pn send
always creates signatures using the PGP/MIME standard, \" REF XXX
but by manually invoking
.Pn mhsign ,
old-style non-MIME signatures can be created as well.
To encrypt an outgoing message, the draft needs to contain an
.Hd Enc
header field.
Public keys of all recipients are searched for in the gnupg keyring and
in a file called
.Fn pgpkeys ,
which contains exceptions and overrides.
Unless public keys are found for all recipients,
.Pn mhsign
will refuse to encrypt the message.
Currently, messages with hidden (BCC) recipients cannot be encrypted.
This work is pending because it requires a structurally more complex
approach.
.P
.Pn mhpgp
is the companion to
.Pn mhsign .
It verifies signatures and decrypts messages.
Encrypted messages can either be temporarily decrypted for display
or permanently decrypted and stored into the current folder.
Currently,
.Pn mhpgp
needs to be invoked manually.
The integration into
.Pn show
and
.Pn mhstore
to verify signatures and decrypt messages as needed
is planned but not realized yet.
.P
Both scripts were written for nmh, hence they needed to be adjusted
according to the differences between nmh and mmh.
For instance, they no longer use the backup prefix.
Furthermore, compatibility support for old PGP features was dropped.
.P
The integrated message signing and encrypting support is one of the
most recent features in mmh.
It has not yet had the time to mature.
User feedback and personal experience need to be accumulated to
direct the further development of the facility.
Although the feedback and experience are still missing,
it seems to be worthwhile to consider adding
.Sw -[no]sign
and
.Sw -[no]enc
switches to
.Pn send ,
to be able to override the corresponding header fields.
A profile entry:
.VS
send: -sign
VE
would then activate signing for all outgoing messages.
With the present approach, a
.Hd Sign
header component needs to be added to each draft template
to achieve the same result.
Adding the switches would ease the work greatly and keep the
template files clean.




.H2 "Draft and Trash Folder
.P

.U3 "Draft Folder
.Id draft-folder
.P
In the beginning, MH had the concept of a draft message.
This is the file
.Fn draft
in the MH directory, which is treated specially.
On composing a message, this draft file was used.
When starting to compose another message before the former one was sent,
the user had to decide among:
.LI 1
Using the old draft to finish and send it before starting with a new one.
.LI 2
Discarding the old draft and replacing it with a new one.
.LI 3
Preserving the old draft by refiling it to a folder.
.LP
It was only possible to work on multiple drafts in alternation.
To do so, the current draft needed to be refiled to a folder and
another one re-used for editing.
Working on multiple drafts at the same time was impossible.
The usual approach of switching to a different MH context did not
help.
.P
The draft folder facility exists to
allow true parallel editing of drafts, in a straightforward way.
It was introduced by Marshall T. Rose as early as 1984.
Similar to other new features, the draft folder was inactive by default.
Even in nmh, the highly useful draft folder was not available
out-of-the-box.
At least, Richard Coleman added the man page
.Mp mh-draft (5)
to better document the feature.
.P
Not using the draft folder facility has the single advantage of having
the draft file at a static location.
This is simple in simple cases but the concept does not scale for more
complex cases.
The concept of the draft message is too limited for the problem.
Therefore the draft folder was introduced.
It is the more powerful and more natural concept.
The draft folder is a folder like any other folder in MH.
Its messages can be listed like any other messages.
A draft message is no longer a special case.
Tools do not need special switches to work on the draft message.
Hence corner cases were removed.
.P
The trivial part of the work was activating the draft folder with a
default name.
I chose the name
.Fn +drafts
for obvious reasons.
In consequence, the command line switches
.Sw -draftfolder
and
.Sw -draftmessage
could be removed.
More difficult, but also more rewarding, was updating the tools to the
new concept.
For nearly three decades, the tools needed to support two draft handling
approaches.
By fully switching to the draft folder, the tools could be simplified
by dropping the awkward draft message handling code.
.Sw -draft
switches were removed because operating on a draft message is no longer
special.
It became indistinguishable from operating on any other message.
.Ci 337338b404931f06f0db2119c9e145e8ca5a9860
.P
There is no more need to query the user for draft handling
.Ci 2d48b455c303a807041c35e4248955f8bec59eeb .
It is always possible to add another new draft.
Refiling drafts is no different from refiling other messages.
All of these special cases are gone.
Yet, one draft-related switch remained.
.Pn comp
still has
.Sw -[no]use
for switching between two modes:
.LI 1
.Sw -use
to modify an existing draft.
.LI 2
.Sw -nouse
to compose a new draft, possibly taking some existing message as template.
.LP
In either case, the behavior of
.Pn comp
is deterministic.
.P
.Pn send
now operates on the current message in the draft folder by default.
As message and folder can both be overridden by specifying them on
the command line, it is possible to send any message in the mail storage
by simply specifying its number and folder.
In contrast to the other tools,
.Pn send
takes the draft folder as its default folder.
.P
Dropping the draft message concept in favor of the draft folder concept
replaced special cases with regular cases.
This simplified the source code of the tools, as well as the concepts.
In mmh, draft management does not break with the MH concepts
but applies them.
.Cl "scan +drafts" ,
for instance, is a truly natural request.
Most of the work was already performed by Rose in the eighties.
The original improvement of mmh is dropping the old draft message approach
and thus simplifying the tools, the documentation and the system as a whole.
Although my part in the draft handling improvement was small,
it was an important one.


.U3 "Trash Folder
.Id trash-folder
.P
Similar to the situation for drafts is the situation for removed messages.
Historically, a message was ``deleted'' by prepending a specific
\fIbackup prefix\fP, usually the comma character,
to the file name.
The specific file would then be ignored by MH because only files with
names consisting of digits only are treated as messages.
Although files remained in the file system,
the messages were no longer visible in MH.
To truly delete them, a maintenance job was needed.
Usually a cron job was installed to delete them after a grace time.
For instance:
.VS
find $HOME/Mail -type f -name ',*' -ctime +7 -delete
VE
In such a setup, the original message could be restored
within the grace time interval by stripping the
backup prefix from the file name.
But the user could not rely on this statement.
If the last message of a folder with six messages (\fL1-6\fP) was removed,
message
.Fn 6 ,
became file
.Fn ,6 .
If then a new message entered the same folder, it would be named with
the number one above the highest existing message number.
In this case the message would be named
.Fn 6
then.
If this new message were removed as well,
the backup of the former message would be overwritten.
Hence, the ability to restore removed messages did not only depend on
the sweeping cron job but also on the removing of further messages.
It is undesirable to have such obscure and complex mechanisms.
The user should be given a small set of clear assertions, such as
``Removed files are restorable within a seven-day grace time.''
With the addition ``... unless a message with the same name in the
same folder is removed before.'' the statement becomes complex.
A user will hardly be able to keep track of any removal to know
if the assertion still holds true for a specific file.
In practice, the real mechanism is unclear to the user.
The consequences of further removals are not obvious.
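.P
The scenario is easy to reproduce outside of MH.
The following shell sketch imitates the renaming in a scratch directory;
the function name is made up for this example, and
.Pn mktemp
with
.Sw -d
is assumed to be available.
.VS
# Sketch only: imitate two removals of a message numbered 6.
demo_backup_overwrite() {
	dir=$(mktemp -d) || return 1
	echo first >"$dir/6"
	mv "$dir/6" "$dir/,6"    # remove message 6
	echo second >"$dir/6"    # a new message gets number 6 again
	mv "$dir/6" "$dir/,6"    # removing it replaces the old backup
	cat "$dir/,6"            # show what is left to restore
	rm -r "$dir"
}
VE
.LP
The function prints `second': the backup of the first message is lost.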
.P
Furthermore, the backup files are scattered within the whole mail storage.
This complicates managing them.
It is possible with the help of
.Pn find ,
but everything would be more convenient
if the deleted messages were collected in one place.
.P
The profile entry
.Pe rmmproc
(previously named
.Pe Delete-Prog )
was introduced very early to improve the situation.
It could be set to any command, which would be executed to remove
the specified messages.
This would override the default action described above.
Refiling the to-be-removed files to a trash folder is the usual example.
Nmh's man page
.Mp rmm (1)
proposes to set the
.Pe rmmproc
to
.Cl "refile +d
to move messages to the trash folder,
.Fn +d ,
instead of renaming them with the backup prefix.
The man page proposes additionally the expunge command
.Cl "rm `mhpath +d all`
to empty the trash folder.
.P
Removing messages in such a way has advantages.
The mail storage is prevented from being cluttered with removed messages
because they are all collected in one place.
Existing and removed messages are thus separated more strictly.
No backup files are silently overwritten.
But most important is the ability to keep removed messages in the MH domain.
Messages in the trash folder can be listed like those in any other folder.
Deleted messages can be displayed like any other messages.
.Pn refile
can restore deleted messages.
All operations on deleted files are still covered by the MH tools.
The trash folder is just like any other folder in the mail storage.
.P
Similar to the draft folder case, I dropped the old backup prefix approach
in favor of the better suited trash folder system.
Hence,
.Pn rmm
calls
.Pn refile
to move the to-be-removed message to the trash folder,
.Fn +trash
by default.
To sweep it clean, the user can use
.Cl "rmm -unlink +trash a" ,
where the
.Sw -unlink
switch causes the files to be unlinked.
.Ci 8edc5aaf86f9f77124664f6801bc6c6cdf258173
.Ci ca0b3e830b86700d9e5e31b1784de2bdcaf58fc5
.P
Dropping the legacy approach and converting to the new approach completely
simplified the code base.
The relationship between
.Pn rmm
and
.Pn refile
was inverted.
In mmh,
.Pn rmm
invokes
.Pn refile ,
which used to be the other way round.
Yet, the relationship is simpler now.
Loops, as described in nmh's man page for
.Mp refile (1),
can no longer occur:
.QS
Since
.Pn refile
uses your
.Pe rmmproc
to delete the message, the
.Pe rmmproc
must NOT call
.Pn refile
without specifying
.Sw -normmproc
or you will create an infinite loop.
.QE
.LP
.Pn rmm
either unlinks a message with
.Fu unlink()
or invokes
.Pn refile
to move it to the trash folder.
.Pn refile
does not invoke any tools.
.P
Generalizing message removal so that it is covered
by the MH concepts made the whole system more powerful.





.H2 "Modern Defaults
.P
Nmh has a bunch of convenience-improving features inactive by default,
although one can expect every new user to want them active.
The reason they are inactive by default is the wish to stay compatible
with old versions.
But what is the definition of old versions?
Still, the highly useful draft folder facility has not been activated
by default although it was introduced over twenty-five years ago.
.[
rose romine real work
.]
The community seems not to care.
This is one of several examples that require new users to first build up
a profile before they can access the modern features of nmh.
Without an extensive profile, the setup is hardly usable
for modern emailing.
The point is not the customization of the setup,
but the need to activate generally useful facilities.
.P
Yet, the real problem lies less in enabling the features, as this is
straightforward as soon as one knows what he wants.
The real problem is that new users need deep insight into the project
to find out about inactive features nmh already provides.
To give an example, I needed one year of using nmh
before I became aware of the existence of the attachment system.
One could argue that this fact disqualifies my reading of the
documentation.
If I had installed nmh from source back then, I could agree.
Yet, I had used a prepackaged version and had expected that it would
just work.
I had already been convinced by the concepts of MH,
and I am a software developer;
still, I required a lot of time to discover the cool features.
How can we expect users to be even more advanced than me,
just to allow them to use MH in a convenient and modern way?
Unless they are strongly convinced of the concepts, they will fail.
I have seen friends of mine giving up disappointed
before they truly used the system,
although they had been motivated in the beginning.
They struggle hard enough to get used to the tool chest approach;
we developers should spare them further inconveniences.
.P
Maintaining compatibility for its own sake is bad,
because the code base collects more and more compatibility code.
Sticking to the compatibility code means remaining limited;
whereas adjusting to the changes renders the compatibility unnecessary.
Keeping unused alternatives in the code is a bad choice as they likely
gather bugs, by not being well tested.
Also, the increased code size and the greater number of conditions
increase the maintenance costs.
If any MH implementation were the back-end of widespread
email clients with large user bases, compatibility would be more
important.
Yet, it appears as if this is not the case.
Hence, compatibility is hardly important for technical reasons.
Its importance originates rather from personal reasons.
Nmh's user base is small and old.
Changing the interfaces would cause inconvenience to long-term users of MH.
It would force them to change their many years old MH configurations.
I do understand this aspect, but by sticking to the old users,
new users are kept away.
Yet, the future lies in new users.
In consequence, mmh invites new users by providing a convenient
and modern setup, readily usable out-of-the-box.
.P
In mmh, all modern features are active by default and many previous
approaches are removed or only accessible in manual ways.
New default features include:
.BU
The attachment system (\c
.Hd Attach ).
.Ci 8ff284ff9167eff8f5349481529332d59ed913b1
.BU
The draft folder facility (\c
.Fn +drafts ).
.Ci 337338b404931f06f0db2119c9e145e8ca5a9860
.BU
The unseen sequence (`u')
.Ci c2360569e1d8d3678e294eb7c1354cb8bf7501c1
and the sequence negation prefix (`!').
.Ci db74c2bd004b2dc9bf8086a6d8bf773ac051f3cc
.BU
Quoting the original message in the reply.
.Ci 67411b1f95d6ec987b4c732459e1ba8a8ac192c6
.BU
Forwarding messages using MIME.
.Ci 6e271608b7b9c23771523f88d23a4d3593010cf1
.LP
In consequence, a setup with a profile that defines only the path to the
mail storage, is already convenient to use.
Again, Paul Vixie's ``edginess'' call supports the direction I took:
``the `main branch' should just be modern''.
.[
paul vixie edginess nmh-workers
.]





.\" --------------------------------------------------------------
.H1 "Styling
.P
Kernighan and Pike have emphasized the importance of style in the
preface of their book:
.[ [
kernighan pike practice of programming
.], p. x]
.QS
Chapter 1 discusses programming style.
Good style is so important to good programming that we have chosen
to cover it first.
.QE
This section covers changes in mmh that were guided by the desire
to improve on style.
Many of them follow the rules given in the quoted book.
.[
kernighan pike practice of programming
.]




.H2 "Code Style
.Id code-style
.P
.U3 "Indentation Style
.P
Indentation styles are the sacred cow of programmers.
Kernighan and Pike
.[ [
kernighan pike practice of programming
.], p. 10]
wrote:
.QS
Programmers have always argued about the layout of programs,
but the specific style is much less important than its consistent
application.
Pick one style, preferably ours, use it consistently, and don't waste
time arguing.
.QE
.P
I agree that consistent application is most important,
but I believe that some styles have advantages over others.
One example is indentation with tab characters only.
Tab characters directly map to the nesting level \(en
one tab, one level.
Tab characters are flexible because developers can adjust them to
whatever width they like to have.
There is no more need to run
.Pn unexpand
or
.Pn entab
programs to ensure the correct mixture of leading tabs and spaces.
The simple rules are: (1) Leading whitespace must consist of tabs only.
(2) Any other whitespace should consist of spaces.
These two rules ensure the integrity of the visual appearance.
Although reformatting existing code should be avoided, I did it.
I did not waste time arguing; I just reformatted the code.
.Ci a485ed478abbd599d8c9aab48934e7a26733ecb1

.U3 "Comments
.P
Section 1.6 of
.[ [
kernighan pike practice of programming
.], p. 23]
demands: ``Don't belabor the obvious.''
Hence, I simply removed all the comments in the following code excerpt:
.VS
context_replace(curfolder, folder);  /* update current folder  */
seq_setcur(mp, mp->lowsel);  /* update current message */
seq_save(mp);  /* synchronize message sequences */
folder_free(mp);  /* free folder/message structure */
context_save();  /* save the context file */

[...]

int c;  /* current character */
char *cp;  /* miscellaneous character pointer */

[...]

/* NUL-terminate the field */
*cp = '\0';
VE
.Ci 426543622b377fc5d091455cba685e114b6df674
.P
The program code already explains itself well enough.


.U3 "Names
.P
Kernighan and Pike suggest:
``Use active names for functions''.
.[ [
kernighan pike practice of programming
.], p. 4]
One application of this rule was the rename of
.Fu check_charset()
to
.Fu is_native_charset() .
.Ci 8d77b48284c58c135a6b2787e721597346ab056d
The same change fixed a violation of ``Be accurate''
.[ [
kernighan pike practice of programming
.], p. 4]
as well.
The code did not match the expectation the function suggested,
as it, for whatever reason, only compared the first ten characters
of the charset name.
.P
More important than using active names is using descriptive names.
.VS
m_unknown(in);  /* the MAGIC invocation... */
VE
Renaming the obscure
.Fu m_unknown()
function was a delightful event, although it made the code less funny.
.Ci 611d68d19204d7cbf5bd585391249cb5bafca846
.P
Magic numbers are generally considered bad style.
Obviously, Kernighan and Pike agree:
``Give names to magic numbers''.
.[ [
kernighan pike practice of programming
.], p. 19]
One such change was naming the type of input \(en mbox or mail folder \(en
to be scanned:
.VS
#define SCN_MBOX (-1)
#define SCN_FOLD 0
VE
.Ci 7ffb36d28e517a6f3a10272056fc127592ab1c19
.P
The argument
.Ar outnum
of the function
.Fu scan()
in
.Fn uip/scansbr.c
defines the number of the message to be created.
If no message is to be created, the argument is misused to transport
program logic.
This led to obscure code.
I improved the clarity of the code by introducing two variables:
.VS
int incing = (outnum > 0);
int ismbox = (outnum != 0);
VE
They cover the magic values and are used for conditions.
The variable
.Ar outnum
is only used when it holds an ordinary message number.
.Ci b8b075c77be7794f3ae9ff0e8cedb12b48fd139f
The clarity improvement of the change showed detours in the program logic
of related code parts.
With the new, descriptively named variables in place, a more
straightforward implementation became apparent.
Before the code was clarified, the possibility to improve had not been seen.
.Ci aa60b0ab5e804f8befa890c0a6df0e3143ce0723



.H2 "Structural Rework
.P
Although the stylistic changes described up to here improve the
readability of the source code, all of them are changes ``in the small''.
Structural changes affect a much larger area.
They are more difficult to do but lead to larger improvements,
especially as they influence the outer shape of the tools as well.
.P
At the end of their chapter on style,
Kernighan and Pike ask: ``But why worry about style?''
.[ [
kernighan pike practice of programming
.], p. 28]
Following are two examples of structural rework that show
why style is important in the first place.


.U3 "Rework of \f(CWanno\fP
.P
Until 2002,
.Pn anno
had six functional command line switches,
.Sw -component
and
.Sw -text ,
which have an argument each,
and the two pairs of flags,
.Sw -[no]date
and
.Sw -[no]inplace .
Then Jon Steinhart introduced his attachment system.
In need of more advanced annotation handling, he extended
.Pn anno .
He added five more switches:
.Sw -draft ,
.Sw -list ,
.Sw -delete ,
.Sw -append ,
and
.Sw -number ,
the last one taking an argument.
.Ci 7480dbc14bc90f2d872d434205c0784704213252
Later,
.Sw -[no]preserve
was added.
.Ci d9b1d57351d104d7ec1a5621f090657dcce8cb7f
Then, the Synopsis section of the man page
.Mp anno (1)
read:
.VS
anno [+folder] [msgs] [-component field] [-inplace | -noinplace]
	[-date | -nodate] [-draft] [-append] [-list] [-delete]
	[-number [num|all]] [-preserve | -nopreserve] [-version]
	[-help] [-text body]
VE
.LP
The implementation followed the same structure.
Problems became visible when
.Cl "anno -list -number 42
worked on the current message instead of on message number 42,
and
.Cl "anno -list -number l:5
did not work on the last five messages but failed with the mysterious
error message: ``anno: missing argument to -list''.
Yet, the invocation matched the specification in the man page.
There, the correct use of
.Sw -number
was defined as being
.Cl "[-number [num|all]]
and the textual description for the combination with
.Sw -list
read:
.QS
The
.Sw -list
option produces a listing of the field bodies for
header fields with names matching the specified component,
one per line. The listing is numbered, starting at 1, if the
.Sw -number
option is also used.
.QE
.LP
The problem was manifold.
The code required a numeric argument to the
.Sw -number
switch.
If it was missing or non-numeric,
.Pn anno
aborted with an error message that had an off-by-one error,
printing the switch one before the failing one.
Semantically, the argument to the
.Sw -number
switch is only necessary in combination with
.Sw -delete ,
but not with
.Sw -list .
.P
Trying to fix these problems on the surface would not have truly solved
them, as they originate from a discrepancy between the
structure of the problem and the structure implemented in the program.
Such structural differences can not be cured on the surface.
They need to be solved by adjusting the structure of the implementation
to the structure of the problem.
.P
In 2002, the new switches
.Sw -list
and
.Sw -delete
were added in the same way as, for instance, the
.Sw -number
switch had been added.
Yet, they are of a structurally different type.
Semantically,
.Sw -list
and
.Sw -delete
introduce modes of operation.
Historically,
.Pn anno
had only one operation mode: adding header fields.
With the extension, it gained two more modes:
listing and deleting header fields.
The structure of the code changes did not pay respect to this
fundamental change to
.Pn anno 's
behavior.
Neither the implementation nor the documentation clearly
defined them as being exclusive modes of operation.
Having identified the problem, I solved it by putting structure into
.Pn anno
and its documentation.
.Ci d54c8db8bdf01e8381890f7729bc0ef4a055ea11
.P
The difference is visible in both the code and the documentation.
The following code excerpt:
.VS
int delete = -2;  /* delete header element if set */
int list = 0;  /* list header elements if set */
[...]
	case DELETESW:  /* delete annotations */
		delete = 0;
		continue;
	case LISTSW:  /* produce a listing */
		list = 1;
		continue;
VE
.LP
was replaced by:
.VS
static enum { MODE_ADD, MODE_DEL, MODE_LIST } mode = MODE_ADD;
[...]
	case DELETESW:  /* delete annotations */
		mode = MODE_DEL;
		continue;
	case LISTSW:  /* produce a listing */
		mode = MODE_LIST;
		continue;
VE
.LP
The replacement code not only reflects the problem's structure better,
it is also easier to understand.
The same applies to the documentation.
The man page was completely reorganized to propagate the same structure.
This is visible in the Synopsis section:
.VS
anno [+folder] [msgs] [-component field] [-text body]
	[-append] [-date | -nodate] [-preserve | -nopreserve]
	[-Version] [-help]

anno -delete [+folder] [msgs] [-component field] [-text
	body] [-number num | all ] [-preserve | -nopreserve]
	[-Version] [-help]

anno -list [+folder] [msgs] [-component field] [-number]
	[-Version] [-help]
VE
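.P
How exclusive modes resolve the ambiguity of
.Sw -number
can be sketched as follows.
This is an illustration under simplifying assumptions, not the code of
.Pn anno :
the function names are hypothetical, switches are matched exactly
although MH tools accept abbreviations, and the arguments of other
switches are not skipped.

```c
#include <string.h>

enum mode { MODE_ADD, MODE_DEL, MODE_LIST };

/*
 * Hypothetical helper: determine the mode of operation.
 * Returns 0 on success, -1 if -delete and -list are combined.
 */
int
find_mode(int argc, char **argv, enum mode *mode)
{
	enum mode m = MODE_ADD;	/* adding is the default mode */
	int i;

	for (i = 1; i < argc; i++) {
		enum mode found;

		if (strcmp(argv[i], "-delete") == 0)
			found = MODE_DEL;
		else if (strcmp(argv[i], "-list") == 0)
			found = MODE_LIST;
		else
			continue;
		if (m != MODE_ADD && m != found)
			return -1;	/* the modes are exclusive */
		m = found;
	}
	*mode = m;
	return 0;
}

/* Once the mode is known, the meaning of -number is unambiguous. */
int
number_takes_arg(enum mode mode)
{
	return mode == MODE_DEL;
}
```

.LP
With the mode determined first,
.Sw -number
is a plain flag in list mode and requires an argument only in delete mode,
which matches the reorganized Synopsis.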
.\" XXX think about explaining the -preserve rework?



.U3 "Path Conversion
.P
Four kinds of path names can appear in MH:
.LI 1
Absolute Unix directory paths, like
.Fn /etc/passwd .
.LI 2
Relative Unix directory paths, like
.Fn ./foo/bar .
.LI 3
Absolute MH folder paths, like
.Fn +friends/phil .
.LI 4
Relative MH folder paths, like
.Fn @subfolder .
.LP
The last type, relative MH folder paths, is hardly documented.
Nonetheless, they are useful for large mail storages.
The current mail folder is specified as `\c
.Fn @ ',
just like the current directory is specified as `\c
.Fn . '.
.P
To allow MH tools to understand all four notations,
they need to convert between them.
.\" XXX between?
In nmh, these path name conversion functions were located in the files
.Fn sbr/path.c
(``return a pathname'') and
.Fn sbr/m_maildir.c
(``get the path for the mail directory'').
The seven functions in the two files were documented with no more
than two comments, which described obvious information.
The function signatures were not self-explanatory either:
.VS
char *path(char *, int);
char *pluspath(char *);
char *m_mailpath(char *);
char *m_maildir(char *);
VE
.P
My investigation provides the following description:
.LI 1
The second parameter of
.Fu path()
defines the type of path given as first parameter.
Directory paths are converted to absolute directory paths.
Folder paths are converted to absolute folder paths.
Folder paths must not include a leading `\fL@\fP' character.
Leading plus characters are preserved.
The result is a pointer to newly allocated memory.
.LI 2
.Fu pluspath()
is a convenience wrapper for
.Fu path() ,
to convert folder paths only.
This function can not be used for directory paths.
An empty string parameter causes a buffer overflow.
.LI 3
.Fu m_mailpath()
converts directory paths to absolute directory paths.
The characters `\fL+\fP' or `\fL@\fP' at the beginning of the path name are
treated literally, i.e. as the first character of a relative directory path.
Hence, this function can not be used for folder paths.
In any case, the result is an absolute directory path.
The result is a pointer to newly allocated memory.
.LI 4
.Fu m_maildir()
returns the parameter unchanged if it is an absolute directory path
or begins with the entry `\fL.\fP' or `\fL..\fP'.
All other strings are prepended with the current working directory.
Hence, this function can not be used for folder paths.
The result is either an absolute directory path or a relative
directory path, starting with a dot.
In contrast to the other functions, the result is a pointer to
static memory.
.P
The situation was obscure, irritating, error-prone, and non-orthogonal.
No clear terminology was used to name the different kinds of path names.
The first argument of
.Fu m_mailpath() ,
for instance, was named
.Ar folder ,
though
.Fu m_mailpath()
can not be used for MH folders.
.P
I reworked the path name conversion completely, introducing clarity.
First of all, the terminology needed to be defined.
A path name is either in the Unix domain, then it is called
\fIdirectory path\fP, `dirpath' for short, or it is in the MH domain,
then it is called \fIfolder path\fP, `folpath' for short.
The two terms need to be used with strict distinction.
Having a clear terminology is often an indicator of having understood
the problem itself.
Second, I exploited the concept of path type indicators.
By requiring every path name to start with a clear type identifier,
conversion between the types can be fully automated.
Thus the tools can accept paths of any type from the user.
Therefore, it was necessary to require relative directory paths to be
prefixed with a dot character.
In consequence, the dot character could no longer be an alias for the
current message.
.Ci cff0e16925e7edbd25b8b9d6d4fbdf03e0e60c01
Third, I created three new functions to replace the previous mess:
.LI 1
.Fu expandfol()
converts folder paths to absolute folder paths,
without the leading plus character.
Directory paths are simply passed through.
This function is to be used for folder paths only, thus the name.
The result is a pointer to static memory.
.LI 2
.Fu expanddir()
converts directory paths to absolute directory paths.
Folder paths are treated as relative directory paths.
This function is to be used for directory paths only, thus the name.
The result is a pointer to static memory.
.LI 3
.Fu toabsdir()
converts any type of path to an absolute directory path.
This is the function of choice for path conversion.
Absolute directory paths are the most general representation of a
path name.
The result is a pointer to static memory.
.P
.\" XXX superfluous?
The new functions have names that indicate their use.
Two of the functions convert relative to absolute path names of the
same type.
The third function converts any path name type to the most general one,
the absolute directory path.
All of the functions return pointers to static memory.
All three functions are implemented in
.Fn sbr/path.c .
.Fn sbr/m_maildir.c
is removed.
.Ci d39e2c447b0d163a5a63f480b23d06edb7a73aa0
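.P
The idea of path type indicators can be illustrated with a short sketch.
It classifies a path name by its first character only.
The enum, the function name, and the handling of names without
an indicator are assumptions made for this example;
they are not taken from
.Fn sbr/path.c .

```c
enum pathtype {
	PT_ABSDIR,	/* absolute directory path, e.g. /etc/passwd */
	PT_RELDIR,	/* relative directory path, e.g. ./foo/bar */
	PT_ABSFOL,	/* absolute folder path, e.g. +friends/phil */
	PT_RELFOL,	/* relative folder path, e.g. @subfolder */
	PT_UNKNOWN	/* no type indicator (handling is an assumption) */
};

/* Hypothetical helper: classify a path name by its type indicator. */
enum pathtype
classify_path(const char *path)
{
	switch (path[0]) {
	case '/':
		return PT_ABSDIR;
	case '.':
		return PT_RELDIR;
	case '+':
		return PT_ABSFOL;
	case '@':
		return PT_RELFOL;
	default:
		return PT_UNKNOWN;
	}
}
```

.LP
Because one character decides the type, a conversion routine can
dispatch on the result without guessing what the user meant.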
.P
Along with the path conversion rework, I also replaced
.Fu getfolder(FDEF)
with
.Fu getdeffol()
and
.Fu getfolder(FCUR)
with
.Fu getcurfol() ,
which is only a convenience wrapper for
.Fu expandfol("@") .
This code was moved from
.Fn sbr/getfolder.c
to
.Fn sbr/path.c .
.Ci d39e2c447b0d163a5a63f480b23d06edb7a73aa0
.P
The related function
.Fu etcpath()
was moved to
.Fn sbr/path.c ,
too
.Ci b4c29794c12099556151d93a860ee51badae2e35 .
Previously, it had been located in
.Fn config/config.c ,
for whatever reasons.
.P
.Fn sbr/path.c
now contains all path handling code.
.\" XXX drop the next line?
Only 173 lines of code were needed to replace the previous 252 lines.
The readability of the code is highly improved.
Additionally, each of the six exported functions and the one static function
is introduced by an explanatory comment.




.H2 "Profile Reading
.P
The MH profile contains the configuration for the user-specific MH setup.
MH tools read the profile right after starting up,
as it contains the location of the user's mail storage
and similar settings that influence the whole setup.
Furthermore, the profile contains the default switches for the tools,
hence, it must be read before the command line switches are processed.
.P
For historic reasons, some MH tools did not read the profile and context.
Among them were
.Pn post /\c
.Pn spost ,
.Pn mhmail ,
and
.Pn slocal .
The reasons why these tools ignored the profile were not clearly stated.
During the discussion on the nmh-workers mailing list,
David Levine posted an explanation, quoting John Romine:
.[
nmh-workers levine post profile
.]
.QS
I asked John Romine and here's what he had to say, which
agrees and provides an example that convinces me:
.QS
My take on this is that
.Pn post
should not be called by users directly, and it doesn't read the
.Fn .mh_profile
(only front-end UI programs read the profile).
.QP
For example, there can be contexts where
.Pn post
is called by a helper program (like `\c
.Pn mhmail ')
which may be run by a non-MH user.
We don't want this to prompt the user to create an MH profile, etc.
.QP
My suggestion would be to have
.Pn send
pass a (hidden) `\c
.Sw -fileproc
.Ar proc '
option to
.Pn post
if needed.
You could also
use an environment variable (I think
.Pn send /\c
.Pn whatnow
do this).
.QE
I think that's the way to go.
My personal preference is to use a command line option,
not an environment variable.
.QE
.P
To solve the problem of
.Pn post
not honoring the
.Pe fileproc
profile entry,
the community roughly agreed that a switch
.Sw -fileproc
should be added to
.Pn post
to be able to pass a different fileproc.
I strongly disagree with this approach because it does not solve
the problem; it only removes a single symptom.
The problem is that
.Pn post
does not behave as expected.
But all programs should behave as expected.
Clear and simple concepts are a precondition for this.
Hence, the real solution is having all MH tools read the profile.
.P
The problem has a further aspect.
It mainly originates in
.Pn mhmail .
.Pn mhmail
was intended to be a replacement for
.Pn mailx
on systems with MH installations.
.Pn mhmail
should have been usable just like
.Pn mailx ,
but sending the message via MH's
.Pn post
instead of
.Pn sendmail .
Using
.Pn mhmail
should not be influenced by the question whether the user had
MH set up for himself or not.
.Pn mhmail
did not read the profile because doing so prompts the user to set up MH
if that has not been done yet.
As
.Pn mhmail
used
.Pn post ,
.Pn post
could not read the profile either.
This is the reason why
.Pn post
does not read the profile.
This is the reason for the actual problem.
It was not much of a problem because
.Pn post
was not intended to be used by users directly.
.Pn send
is the interactive front-end to
.Pn post .
.Pn send
read the profile and passed all relevant values on the command line to
.Pn post
\(en an awkward solution.
.P
The important insight is that
.Pn mhmail
is no true MH tool.
The concepts broke because this outlandish tool was treated as any other
MH tool.
Instead, it should have been treated according to its foreign style.
The solution is not to prevent the tools from reading the profile but
to instruct them to read a different profile.
.Pn mhmail
could have set up a well-defined profile and caused all MH tools
in the session to use it by exporting an environment variable.
With this approach, no special cases would have been introduced,
no surprises would have been caused.
By writing a clean-profile-wrapper, the concept could have been
generalized orthogonally to the whole MH tool chest.
Then Rose's motivation behind the decision that
.Pn post
ignores the profile, as quoted by Jeffrey Honig,
would have become possible:
.[
nmh-workers post profile
.]
.QS
when you run mh commands in a script, you want all the defaults to be
what the man page says.
when you run a command by hand, then you want your own defaults...
.QE
.LP
Yet, I consider this explanation shortsighted.
We should rather regard these two cases as just two different MH setups,
based on two different profiles.
Mapping such problems onto the concept of switching between different
profiles solves them once and for all.
.P
In mmh, the wish to have
.Pn mhmail
as a replacement for
.Pn mailx
is considered obsolete.
Mmh's
.Pn mhmail
no longer covers this use case.
Currently,
.Pn mhmail
is in a transition state.
.Ci 32d4f9daaa70519be3072479232ff7be0500d009
It may become a front-end to
.Pn comp ,
which provides an interface more convenient in some cases.
In this case,
.Pn mhmail
will become an ordinary MH tool, reading the profile.
If, however, this idea does not prove convincing, then
.Pn mhmail
will be removed.
.P
Every program in the mmh tool chest reads the profile.
The only exception is
.Pn slocal ,
which is not considered part of the mmh tool chest.
This MDA is only distributed with mmh, currently.
Mmh has no
.Pn post
program, but
.Pn spost ,
which now reads the profile.
.Ci 3e017a7abbdf69bf0dff7a4073275961eda1ded8
With this change,
.Pn send
and
.Pn spost
can be considered to be merged.
.Pn spost
is only invoked directly by the to-be-changed
.Pn mhmail
implementation and by
.Pn rcvdist ,
which will require rework.
.P
The
.Fu context_foil()
function to pretend to have read an empty profile was removed.
.Ci 68af8da96bea87a5541988870130b6209ce396f6
All mmh tools read the profile.



.H2 "Standard Libraries
.P
MH is one decade older than the POSIX and ANSI C standards.
Hence, MH included its own implementations of functions
that are standardized and thus widely available today,
but were not back then.
Today, twenty years after POSIX and ANSI C were published,
developers can expect systems to comply with these standards.
In consequence, MH-specific replacements for standard functions
can and should be dropped.
Kernighan and Pike advise: ``Use standard libraries.''
.[ [
kernighan pike practice of programming
.], p. 196]
Actually, MH had followed this advice in the past,
but it had not adjusted to the changes in this field.
The
.Fu snprintf()
function, for instance, was standardized with C99 and is available
almost everywhere because of its high usefulness.
The project's own implementation of
.Fu snprintf()
was dropped in March 2012 in favor of the one in the
standard library.
.Ci 0052f1024deb0a0a2fc2e5bacf93d45a5a9c9b32
Such decisions limit the portability of mmh
if systems do not support these standardized and widespread functions.
This compromise is made because mmh focuses on the future.
.P
.\" XXX shorten and merge with the next paragraph
I am still in my twenties and my C and Unix experience comprises
only half a dozen years.
Hence, I need to learn about the history in retrospect.
I have not used those ancient constructs myself.
I have not suffered from their incompatibilities.
I have not longed for standardization.
All my programming experience is from a time when ANSI C and POSIX
were well established already.
I have only read a lot of books about the (good) old times.
This puts me in a difficult position when working with old code.
I need to freshly acquire knowledge about old code constructs and ancient
programming styles, whereas older programmers know these things by
heart from their own experience.
.P
Being aware of the situation, I rather let people with more historic
experience replace ancient code constructs with standardized ones.
Lyndon Nerenberg covered large parts of this task for the nmh project.
He converted project-specific functions to POSIX replacements,
also removing the conditional compilation of now standardized features.
Ken Hornstein and David Levine had their part in the work, too.
Often, I only needed to pull over changes from nmh into mmh.
These changes include many commits; these are among them:
.Ci 768b5edd9623b7238e12ec8dfc409b82a1ed9e2d
.Ci 0052f1024deb0a0a2fc2e5bacf93d45a5a9c9b32 .
.P
During my own work, I tidied up the \fIMH standard library\fP,
.Fn libmh.a ,
which is located in the
.Fn sbr
(``subroutines'') directory in the source tree.
The MH library includes functions that mmh tools usually need.
Among them are MH-specific functions for profile, context, sequence,
and folder handling, but also
MH-independent functions, such as auxiliary string functions,
portability interfaces and error-checking wrappers for critical
functions of the standard library.
.P
I have replaced the
.Fu atooi()
function with calls to
.Fu strtoul()
with the third parameter, the base, set to eight.
.Fu strtoul()
is part of C89 and thus considered safe to use.
.Ci c490c51b3c0f8871b6953bd0c74551404f840a74
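.P
A minimal sketch of such a call follows, assuming the strings to be
converted are octal numbers such as file modes.
The wrapper
.Fu parse_octal()
and its error checks are hypothetical; only the use of
.Fu strtoul()
with base eight is taken from the change described above.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical wrapper: convert an octal string such as "0644".
 * Returns 0 and stores the value in *out on success, -1 on error.
 */
int
parse_octal(const char *s, unsigned long *out)
{
	char *end;

	errno = 0;
	*out = strtoul(s, &end, 8);	/* third parameter: base eight */
	if (end == s || *end != '\0' || errno == ERANGE)
		return -1;	/* no digits, trailing garbage, or overflow */
	return 0;
}
```

.LP
The end pointer makes it possible to reject input like ``19'',
in which only the first character is a valid octal digit.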
.P
I did remove project-included fallback implementations of
.Fu memmove()
and
.Fu strerror() ,
although Peter Maydell had re-included them into nmh in 2008
to support SunOS 4.
Nevertheless, these functions are part of ANSI C.
Systems that do not even provide full ANSI C support should not
put a load on mmh.
.Ci b067ff5c465a5d243ce5a19e562085a9a1a97215
.P
The
.Fu copy()
function copies the string in parameter one to the location in
parameter two.
In contrast to
.Fu strcpy() ,
it returns a pointer to the terminating null-byte in the destination area.
The code was adjusted to replace
.Fu copy()
with
.Fu strcpy() ,
except within
.Fu concat() ,
where
.Fu copy()
was more convenient.
Therefore, the definition of
.Fu copy()
was moved into the source file of
.Fu concat()
and its visibility is now limited to it.
.Ci 552fd7253e5ee9e554c5c7a8248a6322aa4363bb
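.P
Why the return value is convenient for concatenation can be shown
with a re-implementation written from the description above.
It is a sketch, not the original source: the const qualifier and the
caller
.Fu join3()
are my additions, and the caller must provide a large enough buffer.

```c
/*
 * Copy the string from to the location to.
 * Return a pointer to the terminating null byte of the copy.
 */
char *
copy(const char *from, char *to)
{
	while ((*to = *from) != '\0') {
		to++;
		from++;
	}
	return to;
}

/* Hypothetical caller: join three strings without rescanning buf. */
void
join3(char *buf, const char *a, const char *b, const char *c)
{
	char *cp = buf;

	cp = copy(a, cp);	/* cp now points to the end of a's copy */
	cp = copy(b, cp);
	copy(c, cp);
}
```

.LP
Each call continues where the previous one stopped, so the destination
is never scanned for its end.
POSIX.1-2008 provides
.Fu stpcpy()
with the same return value convention but with the destination as
first parameter.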
.P
The function
.Fu r1bindex()
had been a generalized version of
.Fu basename()
with minor differences.
As all calls to
.Fu r1bindex()
had the slash (`/') as delimiter anyway,
replacing
.Fu r1bindex()
with the more specific and better-named function
.Fu basename()
became desirable.
Unfortunately, many of the 54 calls to
.Fu r1bindex()
depended on a special behavior,
which differed from the POSIX specification for
.Fu basename() .
Hence,
.Fu r1bindex()
was kept but renamed to
.Fu mhbasename() ,
fixing the delimiter to the slash.
.Ci 240013872c392fe644bd4f79382d9f5314b4ea60
For possible uses of
.Fu r1bindex()
with a different delimiter,
the ANSI C function
.Fu strrchr()
provides the core functionality.
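.P
What ``core functionality'' means here can be sketched in a few lines.
The name
.Fu lastcomponent()
is hypothetical, and the sketch does not attempt to reproduce the
special behavior of
.Fu r1bindex()
mentioned above.

```c
#include <string.h>

/*
 * Hypothetical helper: return a pointer to the part of path that
 * follows the last occurrence of delim, or path itself if delim
 * does not occur.
 */
const char *
lastcomponent(const char *path, int delim)
{
	const char *p = strrchr(path, delim);

	return p ? p + 1 : path;
}
```

.LP
Unlike the POSIX
.Fu basename() ,
this sketch does not strip trailing delimiters:
for `/usr/lib/' it returns the empty string.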
.P
The
.Fu ssequal()
function \(en apparently for ``substring equal'' \(en
was renamed to
.Fu isprefix() ,
because this is what it actually checks.
.Ci c20b4fa14515c7ab388ce35411d89a7a92300711
Its source file had included the following comments, no joke.
.VS
/*
 * THIS CODE DOES NOT WORK AS ADVERTISED.
 * It is actually checking if s1 is a PREFIX of s2.
 * All calls to this function need to be checked to see
 * if that needs to be changed. Prefix checking is cheaper, so
 * should be kept if it's sufficient.
 */

/*
 * Check if s1 is a substring of s2.
 * If yes, then return 1, else return 0.
 */
VE
Two months later, it was completely removed by replacing it with
.Fu strncmp() .
.Ci b0b1dd37ff515578cf7cba51625189eb34a196cb
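.P
The prefix check that the function actually performed can be expressed
with
.Fu strncmp()
in one line.
The following is a sketch of the idea, not the removed function and
not the replacement code in mmh.

```c
#include <string.h>

/* Return 1 if s1 is a prefix of s2, else return 0. */
int
isprefix(const char *s1, const char *s2)
{
	return strncmp(s1, s2, strlen(s1)) == 0;
}
```

.LP
A string that occurs only in the middle of the other one, like
``pl'' in ``repl'', is a substring but not a prefix, and is rejected.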





.H2 "User Data Locations
.P
In nmh, a personal setup consists of the MH profile and the MH directory.
The profile is a file named
.Fn \&.mh_profile
in the user's home directory.
It contains the static configuration.
It also contains the location of the MH directory in the profile entry
.Pe Path .
The MH directory contains the mail storage and is the first
place to search for personal forms, scan formats, and similar
configuration files.
The location of the MH directory can be chosen freely by the user.
The default and usual name is a directory named
.Fn Mail
in the home directory.
.P
The way MH data is split between profile and MH directory is a legacy.
It is only sensible in a situation where the profile is the only
configuration file.
Why else should the mail storage and the configuration files be intermixed?
They are different kinds of data:
The data to be operated on and the configuration to change how
tools operate.
.\" XXX bad ... inapropriate?
Splitting the configuration between the profile and the MH directory
is bad.
Merging the mail storage and the configuration in one directory is bad
as well.
As the mail storage and the configuration were not separated sensibly
in the first place, I did it now.
.P
Personal mmh data is grouped by type, resulting in two distinct parts:
the mail storage and the configuration.
In mmh, the mail storage directory still contains all the messages,
but, with the exception of public sequence files, nothing else.
In contrast to nmh, the auxiliary configuration files are no longer
located there.
Therefore, the directory is no longer called the user's \fIMH directory\fP
but his \fImail storage\fP.
Its location is still user-chosen, with the default name
.Fn Mail ,
in the user's home directory.
In mmh, the configuration is grouped together in
the hidden directory
.Fn \&.mmh
in the user's home directory.
This \fImmh directory\fP contains the context file, personal forms,
scan formats, and the like, but also the user's profile, now named
.Fn profile .
The location of the profile is no longer fixed to
.Fn $HOME/.mh_profile
but to
.Fn $HOME/.mmh/profile .
Having both the file
.Fn $HOME/.mh_profile
and the configuration directory
.Fn $HOME/.mmh
appeared to be inconsistent.
The approach chosen for mmh is consistent, simple, and familiar to
Unix users.
.Ci 7030d7edb099bff36ded7548bb5380f7acab4f9b
.P
MH allows users to have multiple MH setups.
Therefore, it is necessary to select a different profile.
The profile is the single entry point to access the rest of a
personal MH setup.
In nmh, the environment variable
.Ev MH
could be used to specify a different profile.
To operate in the same MH setup with a separate context,
the
.Ev MHCONTEXT
environment variable could be used.
This allows having separate current folders and current messages in
each terminal, for instance.
In mmh, three environment variables are used.
.Ev MMH
overrides the default location of the mmh directory (\c
.Fn .mmh ).
.Ev MMHP
and
.Ev MMHC
override the paths to the profile and context files, respectively.
This approach allows the set of personal configuration files to be chosen
independently from the profile, context, and mail storage.
.Ci 7030d7edb099bff36ded7548bb5380f7acab4f9b
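.P
How such overrides might be resolved for the profile is sketched below.
Only the variable names
.Ev MMH
and
.Ev MMHP
and the default location
.Fn $HOME/.mmh/profile
come from the text.
The function name, the order of precedence, and the treatment of
empty or relative values are assumptions; the code in mmh may differ.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: write the path of the profile to buf.
 * Returns 0 on success, -1 if HOME is unset or buf is too small.
 */
int
profile_path(char *buf, size_t size)
{
	const char *mmhp = getenv("MMHP");
	const char *mmh = getenv("MMH");
	const char *home = getenv("HOME");
	int n;

	if (mmhp && *mmhp)		/* explicit profile location */
		n = snprintf(buf, size, "%s", mmhp);
	else if (mmh && *mmh)		/* alternative mmh directory */
		n = snprintf(buf, size, "%s/profile", mmh);
	else if (home)			/* default: $HOME/.mmh/profile */
		n = snprintf(buf, size, "%s/.mmh/profile", home);
	else
		return -1;
	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

.LP
In this sketch, the most specific variable wins:
a profile path given explicitly takes precedence over an alternative
mmh directory, which in turn takes precedence over the default.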
.P
The separation of the files by type is sensible and convenient.
The new approach has no functional disadvantages,
as every setup I can imagine can be implemented with both approaches,
possibly even easier with the new approach.
The main achievement of the change is the clear and sensible split
between mail storage and configuration.





.H2 "Modularization
.P
The source code of the mmh tools is located in the
.Fn uip
(``user interface programs'') directory.
Each tool has a source file with the name of the command.
For example,
.Pn rmm
is built from
.Fn uip/rmm.c .
Some source files are used for multiple programs.
For example
.Fn uip/scansbr.c
is used for both
.Pn scan
and
.Pn inc .
In nmh, 49 tools were built from 76 source files.
This is a ratio of 1.6 source files per program.
32 programs depended on multiple source files;
17 programs depended on one source file only.
In mmh, 39 tools are built from 51 source files.
This is a ratio of 1.3 source files per program.
18 programs depend on multiple source files;
21 programs depend on one source file only.
(These numbers and the ones in the following text ignore the MH library
as well as shell scripts and multiple names for the same program.)
.\" XXX graph
.P
Splitting the source code of a large program into multiple files can
increase the readability of its source code.
.\" XXX however?
Most of the mmh tools are simple and straightforward programs.
With the exception of the MIME handling tools,
.Pn pick
is the largest tool.
It contains 1\|037 lines of source code, excluding the MH library.
Only the MIME handling tools (\c
.Pn mhbuild ,
.Pn mhstore ,
.Pn show ,
etc.)
are larger.
Splitting programs with less than 1\|000 lines of code into multiple
source files seldom leads to better readability.
For such tools, splitting makes sense
when parts of the code are reused in other programs,
and the reused code fragment is (1) not general enough
for including it in the MH library
or (2) has dependencies on a library that only few programs need.
.Fn uip/packsbr.c ,
for instance, provides the core program logic for the
.Pn packf
and
.Pn rcvpack
programs.
.Fn uip/packf.c
and
.Fn uip/rcvpack.c
mainly wrap the core function appropriately.
No other tools use the folder packing functions.
As another example,
.Fn uip/termsbr.c
provides termcap support, which requires linking with a termcap or
curses library.
Including
.Fn uip/termsbr.c
into the MH library would require every program to be linked with
termcap or curses, although only few of the programs require it.
.P
The task of MIME handling is complex enough that splitting its code
into multiple source files improves the readability.
The program
.Pn mhstore ,
for instance, is compiled out of seven source files with 2\|500
lines of code in summary.
The main code file
.Fn uip/mhstore.c
consists of 800 lines; the other 1\|700 lines of code are reused in
other MIME handling tools.
It seems to be worthwhile to bundle the generic MIME handling code into
a MH-MIME library, as a companion to the MH standard library.
This is left open for the future.
.P
The work already accomplished focussed on the non-MIME tools.
The amount of code compiled into each program was reduced.
This eases the understanding of the code base.
In nmh,
.Pn comp
was built from six source files:
.Fn comp.c ,
.Fn whatnowproc.c ,
.Fn whatnowsbr.c ,
.Fn sendsbr.c ,
.Fn annosbr.c ,
and
.Fn distsbr.c .
In mmh, it builds from only two:
.Fn comp.c
and
.Fn whatnowproc.c .
In nmh's
.Pn comp ,
the core function of
.Pn whatnow ,
.Pn send ,
and
.Pn anno
were compiled into
.Pn comp .
This saved the need to execute these programs with
.Fu fork()
and
.Fu exec() ,
two expensive system calls.
Whereas this approach improved the time performance,
it interwove the source code.
Core functionalities were not encapsulated into programs but into
functions, which were then wrapped by programs.
For example,
.Fn uip/annosbr.c
included the function
.Fu annotate() .
Each program that wanted to annotate messages included the source file
.Fn uip/annosbr.c
and called
.Fu annotate() .
Because the function
.Fu annotate()
was used like the tool
.Pn anno ,
it had seven parameters, reflecting the command line switches of the tool.
When another pair of command line switches was added to
.Pn anno ,
a rather ugly hack was implemented to avoid adding another parameter
to the function.
.Ci d9b1d57351d104d7ec1a5621f090657dcce8cb7f
.P
Separation simplifies the understanding of program code
because the area influenced by any particular statement is smaller.
The separation on the program level is stricter than the separation
on the function level.
In mmh, the relevant code of
.Pn comp
comprises the two files
.Fn uip/comp.c
and
.Fn uip/whatnowproc.c ,
together 210 lines of code.
In nmh,
.Pn comp
comprises six files with 2\|450 lines.
Not all of the code in these six files was actually used by
.Pn comp ,
but the code reader needed to read all of the code first to know which
parts were used.
.P
As I have read a lot in the code base during the last two years,
I learned about the easy and the difficult parts.
Code is easy to understand if the influenced code area is small
and its boundaries are strictly defined.
Furthermore, the code needs to solve the problem in a straightforward way.
.P
.\" XXX move this paragraph somewhere else?
Reading
.Pn rmm 's
source code in
.Fn uip/rmm.c
is my recommendation for a beginner's entry point into the code base of nmh.
The reasons are that the task of
.Pn rmm
is straightforward and it consists of one small source code file only,
yet its source includes code constructs typical for MH tools.
With the introduction of the trash folder in mmh,
.Pn rmm
became a bit more complex, because it invokes
.Pn refile .
Still, it is a good example for a simple tool with clear sources.
.P
Understanding
.Pn comp
.\" XXX ask Kate: more vs. as much
requires reading 210 lines of code in mmh, but more than ten times as many in nmh.
Due to the aforementioned hack in
.Pn anno
to save the additional parameter, information passed through the program's
source base in obscure ways.
Thus, understanding
.Pn comp
required understanding the inner workings of
.Fn uip/annosbr.c
first.
To be sure to fully understand a program, its whole source code needs
to be examined.
Not doing so is a leap of faith, assuming that the developers
have avoided obscure programming techniques.
By separating the tools on the program-level, the boundaries are
clearly visible and technically enforced.
The interfaces are calls to
.Fu exec()
rather than arbitrary function calls.
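.P
Such an interface can be sketched with the POSIX calls
.Fu fork() ,
.Fu execvp() ,
and
.Fu waitpid() .
The function
.Fu run_tool()
is hypothetical; the invocation code in mmh may differ.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run the program argv[0] with the arguments
 * in argv and wait for it.
 * Returns the exit status of the program, 127 if it could not be
 * executed, or -1 on other errors.
 */
int
run_tool(char *const argv[])
{
	pid_t pid = fork();
	int status;

	if (pid == -1)
		return -1;
	if (pid == 0) {
		/* child: replace the process image with the tool */
		execvp(argv[0], argv);
		_exit(127);	/* only reached if execvp() failed */
	}
	if (waitpid(pid, &status, 0) == -1)
		return -1;
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

.LP
A caller that wants to annotate a message would pass an argument vector
that starts with
.Pn anno
and continues with the same switches a user would type, for instance
.Sw -component
and
.Sw -text .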
.P
But the real problem lies elsewhere:
Nmh violates the golden ``one tool, one job'' rule of the Unix philosophy.
.\" XXX ref
Understanding
.Pn comp
requires understanding
.Fn uip/annosbr.c
and
.Fn uip/sendsbr.c
because
.Pn comp
does annotate and send messages.
In nmh, there surely exists the tool
.Pn send ,
which does mainly send messages.
But
.Pn comp
and
.Pn repl
and
.Pn forw
and
.Pn dist
and
.Pn whatnow
and
.Pn viamail ,
they all (!) have the same message sending function included, as well.
As a result,
.Pn comp
sends messages without using
.Pn send .
The situation is the same as if
.Pn grep
were to do its own paging without
.Pn more
just because both programs are part of the same code base.
.P
The clear separation on the surface \(en the tool chest approach \(en
is violated on the level below.
This violation is for the sake of time performance.
On systems where
.Fu fork()
and
.Fu exec()
are expensive, the quicker response might be noticeable.
In the old times, sacrificing readability and conceptual beauty for
speed might even have been a must to prevent MH from being unusably slow.
Whatever the reasons had been, today they are gone.
No longer should we sacrifice readability or conceptual beauty.
No longer should we violate the Unix philosophy's ``one tool, one job''
guideline.
.\" XXX ref
No longer should we keep speed improvements that became unnecessary.
.P
Therefore, mmh's
.Pn comp
no longer sends messages.
In mmh, different jobs are divided among separate programs that
invoke each other as needed.
In consequence,
.Pn comp
invokes
.Pn whatnow
which thereafter invokes
.Pn send .
.Ci 3df5ab3c116e6d4a2fb4bb5cc9dfc5f781825815
.Ci c73c00bfccd22ec77e9593f47462aeca4a8cd9c0
The clear separation on the surface is maintained on the level below.
Human users and the tools use the same interface \(en
annotations, for example, are made by invoking
.Pn anno ,
no matter if requested by programs or by human beings.
.Ci 469a4163c2a1a43731d412eaa5d9cae7d670c48b
.Ci aed384169af5204b8002d06e7a22f89197963d2d
.Ci 3caf9e298a8861729ca8b8a84f57022b6f3ea742
The decreasing number of tools built from multiple source files,
and thus of
.Fn uip/*sbr.c
files, confirms the improvement.
.Ci 9e6d91313f01c96b4058d6bf419a8ca9a207bc33
.Ci 81744a46ac9f845d6c2b9908074d269275178d2e
.Ci f0f858069d21111f0dbea510044593f89c9b0829
.Ci 0503a6e9be34f24858b55b555a5c948182b9f24b
.Ci 27826f9353e0f0b04590b7d0f8f83e60462b90f0
.Ci d1da1f94ce62160aebb30df4063ccbc53768656b
.Ci c42222869e318fff5dec395eca3e776db3075455
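.P
As an illustration of such an
.Fu exec()
boundary, the following minimal C sketch starts another tool as a
separate process and collects its exit status.
The helper
.Fu run_tool()
is a hypothetical name chosen for this example; it is not taken from
the mmh sources, which may organize this differently.
.nf
```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
** Run another tool as a separate process and wait for it.
** Returns the tool's exit status, or -1 if the tool could not
** be started or did not exit normally.
** run_tool() is a hypothetical helper, not a function from mmh.
*/
int
run_tool(char *const argv[])
{
	pid_t pid;
	int status;

	pid = fork();
	if (pid == -1) {
		return -1;
	}
	if (pid == 0) {
		/* child: replace this process with the tool */
		execvp(argv[0], argv);
		_exit(127);  /* only reached if execvp() failed */
	}
	/* parent: wait for the tool to finish */
	if (waitpid(pid, &status, 0) == -1) {
		return -1;
	}
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```
.fi
A caller passes the same kind of argument vector that a user would type
at the shell, which is why programs and human beings end up using
one and the same interface.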
.P
.\" XXX move this paragraph up somewhere
This change comes with one disadvantage that has to be accepted:
The compiler can no longer check the integrity of the interfaces.
When the command line interface of a tool changes, it is
the developer's job to adjust the invocations of that tool as well.
As this is a manual task and regression tests, which could detect such
problems, are not available yet, it is prone to errors.
These errors will not be detected at compile time but at run time.
Installing regression tests is a pending task.
In the best case, a uniform way of invoking tools from other tools
can be developed to allow automated testing at compile time.
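.P
To make the problem concrete, here is a minimal C sketch of what a
run-time check of tool invocations might look like.
An invocation is only a vector of strings, so a stale tool name
compiles cleanly and fails only when it is executed.
The names
.Fu run_tool()
and
.Fu count_failures()
are hypothetical and serve only this example;
they do not describe an existing test facility in mmh.
.nf
```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* hypothetical helper: run a tool, return its exit status or -1 */
static int
run_tool(char *const argv[])
{
	pid_t pid;
	int status;

	pid = fork();
	if (pid == -1) {
		return -1;
	}
	if (pid == 0) {
		execvp(argv[0], argv);
		_exit(127);  /* only reached if execvp() failed */
	}
	if (waitpid(pid, &status, 0) == -1) {
		return -1;
	}
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}

/*
** Run every invocation in the NULL-terminated table and return
** the number of invocations that did not exit with status 0.
** The compiler accepts any strings here; only running them
** reveals a renamed tool or a changed option.
*/
int
count_failures(char *const *table[])
{
	int failed = 0;
	size_t i;

	for (i = 0; table[i] != NULL; i++) {
		if (run_tool(table[i]) != 0) {
			failed++;
		}
	}
	return failed;
}
```
.fi
A regression test could keep such a table of the invocations used
inside the tools and report a non-zero count as a failure.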


.ig 
XXX consider writing about mhl vs. mhlproc

sbr/showfile.c

    23          /*
    24          ** If you have your lproc listed as "mhl",
    25          ** then really invoked the mhlproc instead
    26          ** (which is usually mhl anyway).
    27          */

Sat Nov 24 19:09:14 1984  /mtr (agent: Marshall Rose) <uci@udel-dewey>

        sbr/showfile.c: if lproc is "mhl", use mhlproc for consistency
        (Actually, user should use "lproc: show", "showproc: mhl".)
..