view ch03.roff @ 33:3e70450336a4

refer: Added refer; the bib contains various entries already.
author markus schnalke <meillo@marmaro.de>
date Tue, 15 May 2012 19:19:17 +0200
parents 1e4afeb34da7
children d3a02f5e63b3

.H0 "Work Report
.P
foo
.P
bar

.H1 "Removal of Code Relicts
.P
The code base of mmh originates from the late 70s, was extensively
worked on in the mid 80s, and was partly reorganized and extended
in the 90s. Relicts of all those times have gathered in the code base.
My goal was to remove any ancient code parts. One part of the task was
converting obsolete code constructs to standard constructs, the other part
was dropping obsolete functions.
.P
As I'm not even thirty years old and have no more than seven years of
Unix experience, I needed to learn about the history in retrospect.
Older people have likely used those ancient constructs themselves,
have suffered from their incompatibilities, and have longed for
standardization. Unfortunately, I have only read that others had done so.
This put me in a much more difficult position when working on the old
code. I needed to research what others would have known by heart from
experience. All my programming experience comes from a time after ANSI C
and after POSIX. Although I knew about the times before, I implicitly
took the current state for granted most of the time.
.P
Being aware of
these facts, I preferred to let people with more historic experience solve the
task of converting the ancient code constructs to standardized ones.
Luckily, Lyndon Nerenberg focused on this task at the nmh project.
He converted large parts of the code to POSIX constructs, removing
the conditional compilation for now standardized features.
I'm thankful for this task being solved. I only pulled the changes into
mmh.
.P
The other task \(en dropping ancient functionality to remove old code \(en
I did myself, though. My position to strip mmh to the bare minimum of
frequently used features is much more radical than the nmh community
likes. Without the need to justify my decisions, I was able to quickly
remove functionality I considered ancient.
The need to discuss my decisions with
peers likely would have slowed this process down. Of course, I researched
whether a particular feature really should be dropped. Never having had any
contact with a feature within my computer life was a first indicator to
drop it, but I also asked others and searched the literature for modern
usage of the feature. If it appeared to be truly ancient, I dropped it.
The reason for dropping is always part of the commit message in the
version control system. Thus, it is easy for others to check their
view on the topic with mine and possibly to argue for reinclusion.

.U2 "MMDF maildrop support
.P
I dropped all support for the MMDF maildrop format. This type of format
is conceptually similar to the mbox format, but uses four bytes with
value 1 (\fL^A^A^A^A\fP) as message delimiter,
instead of the string ``\fLFrom\ \fP''.
Due to the similarity and mbox being the de-facto standard maildrop
format on Unix, but also due to the larger influence of Sendmail than MMDF,
the MMDF maildrop format has vanished.
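.P
The difference between the two formats is small.
As an illustration with a made-up message, an mbox maildrop
starts each message with a line beginning with ``\fLFrom\ \fP'':
.DS
From alice@example.org Tue May 15 19:19:17 2012
From: alice@example.org
Subject: hello

Body text.
.DE
In an MMDF maildrop, the same message is instead enclosed by
delimiter lines, where \fL^A\fP stands for one byte with value 1:
.DS
^A^A^A^A
From: alice@example.org
Subject: hello

Body text.
^A^A^A^A
.DE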
.P
The simplifications within the code were only moderate. Switches could
be removed from tools like
.L packf ,
which generate packed mailboxes. Only one packed mailbox format remained:
mbox.
The most important changes affect the equally named mail parsing routine in
.L sbr/m_getfld.c .
The direct MMDF code has been removed, but as now only one packed mailbox
format is left, code structure simplifications are likely possible.
The reason why they are still outstanding is the heavily optimized code
of
.Fu m_getfld() .
Changes beyond a small local scope \(en
which restructuring inherently is \(en carry a high risk of damaging
the intricate workings of the optimized code. This problem is known
to the developers of nmh, too. They also avoid touching this minefield
if possible.

.U2 "UUCP Bang Paths
.P
More questionable than the former topic is the removal of support for the
UUCP bang path address style. However, the user may translate the bang
paths to Internet addresses on retrieval and the other way round on posting
messages. The former can be done by an MDA like procmail; the latter
by a sendmail wrapper. This would ensure that any address handling would
work as expected. However, it might just work well without any
such modifications, as mmh does not touch addresses much, in general.
But I cannot guarantee this, as I have never used an environment with
bang paths.
Also, the behavior might break at any point in further development.

.U2 "Hardcopy terminal support
.P
More of a funny anecdote is that a check for printing to a
hardcopy terminal remained in the code until spring 2012, when I finally
removed it.
I would surely be very happy to see such a terminal in action, maybe
even to work on one, but I fear my chances are nil.
.P
The check only prevented a pager from being placed between the outputting
program (\c
.Pn mhl )
and the terminal. The same could have been achieved statically with
the
.Sw \-nomoreproc
switch on the command line, too.

.U2 "Removed support for header fields
.P
The `Encrypted' header had been introduced by RFC\^822, but was already
marked as legacy in RFC\^2822. It was superseded by FIXME.
Mmh no longer supports this header.
.P
Native support for `Face' headers
has been removed as well.
The feature is similar to the `X-Face' header in its intent,
but takes a different approach to storing the image.
Instead of encoding the image data directly into the header,
the header contains the hostname and UDP port where the image
data can be retrieved.
Neither `X-Face' nor the `Face' system described here
\**
.FS
There is also a newer but different system, invented in 2005,
using `Face' headers.
It is the successor of `X-Face' providing colored PNG images.
.FE
became widely used.
It's still possible to use a Face system,
although mmh does not provide support for any of the different systems
anymore. It's fairly easy to write a small shell script to
extract the embedded or fetch the external Face data and display the image.
One's own Face headers can be added to the draft template files.
.P
`Content-MD5' headers were introduced by RFC\^1864. They only allow
detecting data corruption during the transfer. By no means can
they ensure verbatim end-to-end delivery of the contents. This is clearly
stated in the RFC. The proper approach to provide verifiability
of content in an end-to-end relationship is the use of digital cryptography
(RFCs FIXME). On the other hand, transfer protocols should ensure the
integrity of the transmission. In combination, these two approaches
make the `Content-MD5' header field useless. In consequence, I removed
the support for it. With this removal, MD5 computation is not needed
anywhere in mmh. Hence, over 500 lines of code were removed by this one
change. Even if the `Content-MD5' header field is useful sometimes,
I value its usefulness less than the improvement in maintainability caused
by the removal.

.U2 "Prompter's Control Keys
.P
The program
.Pn prompter
queries the user to fill in a message form. When used by
.Pn comp
as:
.DS
comp \-editor prompter
.DE
the resulting behavior is similar to
.Pn mailx .
Apparently,
.Pn prompter
hadn't been touched lately. Otherwise it's hardly explainable why it
still offered the switches
.Sn \-erase \fUchr\fP
and
.Sn \-kill \fUchr\fP
to name the characters for command line editing.
The times when this was necessary are long gone.
Today these things work out of the box, and if not, are configured
with the standard tool
.Pn stty .

.U2 "Vfork and Retry Loops
.P
MH creates many processes, which is a consequence of the toolchest approach.
In earlier times
.Fu fork()
was an expensive system call, as the process's whole image needed
to be duplicated. One common case is replacing the image with
.Fu exec()
right after having forked the child process.
To speed up this case, the
.Fu vfork()
system call was invented at Berkeley. It completely omits copying the
image. If the image gets replaced right afterwards, then unnecessary
work is omitted. On old systems this results in large speedups.
MH uses
.Fu vfork()
whenever possible.
.P
Memory management units that support copy-on-write semantics make
.Fu fork()
almost as fast as
.Fu vfork()
in the cases when they can be exchanged.
With
.Fu vfork()
being more error-prone and hardly faster, it's preferable to simply
use
.Fu fork()
instead.
.P
Related to the costs of
.Fu fork()
is the probability of its success.
On modern systems, the system call will almost always succeed.
In the 80s on heavily loaded systems, as they were common at
universities, this was different. Thus, many of the
.Fu fork()
calls were wrapped into loops to retry the fork several times in
short intervals in case of failure.
In mmh, the program aborts at once if the fork fails.
The user can then reexecute the command. This is expected to be a
very rare case on modern systems, especially personal ones, which are
common today.
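.P
The difference between the two approaches can be sketched in C.
This is a minimal illustration, not the actual code of MH or mmh;
the function names
.Fu fork_retry()
and
.Fu spawn_once()
and the one-second retry interval are assumptions made for the example.
.DS
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Old style: retry a failed fork() up to `tries' times. */
pid_t
fork_retry(int tries)
{
	pid_t pid;

	while ((pid = fork()) == -1 && --tries > 0) {
		sleep(1);
	}
	return pid;
}

/* New style: abort at once if fork() fails. */
/* Returns the exit status of the command, or -1. */
int
spawn_once(char *const argv[])
{
	int status;
	pid_t pid = fork();

	switch (pid) {
	case -1:
		perror("fork");
		exit(EXIT_FAILURE);
	case 0:
		/* child: replace the image right away */
		execvp(argv[0], argv);
		perror(argv[0]);
		_exit(127);
	}
	if (waitpid(pid, &status, 0) == -1) {
		return -1;
	}
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
.DE
The second variant is shorter and has only one failure path,
which is reported to the user directly.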


.H1 "Removal of Tools
.P
MH had been considered an all-in-one system for mail handling.
The community around nmh has a similar understanding.
In fundamental contrast, I believe that mmh should be a MUA and
nothing more. I believe that all-in-one mail systems are not the way
to go. There are excellent specialized MTAs, like Postfix;
there are specialized MDAs, like Procmail; there are specialized
MRAs, like Fetchmail. I believe it's best to use them instead of
providing the same functions ourselves. Doing something well requires
focusing on this particular aspect or on a small set of aspects. The more
it is possible to focus, the better the result in this particular
area will be. The limiting resource in Free Software community development
usually is manpower. If the little development power available is
spread over multiple development areas, it will hardly be possible to
compete with the specialists in the various fields. This effect is even
stronger given the small community \(en developers and users \(en
that MH-based mail systems have. In consequence, I believe that the
available resources should be concentrated at the point where MH is
most unique. This is clearly the MUA part.
.P
Several of nmh's tools were removed from mmh because they did not
contribute to the main focus, the MUA's task.
.P
.Pn conflict
was removed because it is a mail system maintenance tool.
Besides, it also checks the
.Fn /etc/passwd
and
.Fn /etc/group
files.
The tool might be useful, but it should not be shipped with mmh.
.P
.Pn rcvtty
was removed because its use case of writing to the user's terminal
on receipt of mail is hardly wanted today. If users like to be
informed of new mail, then using the shell's
.Ev MAILPATH
variable or different (graphical) notifications are likely more
appealing. Writing directly to other terminals is hardly ever wanted
today. If one wants to have it this way nonetheless, the standard tool
.Pn write
can be used in a way similar to:
.DS
scan -file - | write `id -un`
.DE
.P
When the new attachment system was introduced,
.Pn viamail
was removed because then
.Pn forw
could cover the task itself.
The wrapper program
.Pn sendfiles
was rewritten as a shell script to use
.Pn forw .
.P
.Pn msgchk
was removed as it became hardly useful when POP support was removed.
It is questionable if
.Pn msgchk
provides more information than:
.DS
ls -l /var/mail/meillo
.DE
It does distinguish between old and new mail, but that's not very
useful and can be found out with
.Pn stat (1)
too. A very small shell script could take care of formatting the output.
As mmh's inc only incorporates mail from the user's local maildrop
and thus no long data transfers are involved,
there's no need to check for new mail before incorporating it.
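.P
Such a script could look like the following sketch.
The function name
.Fn mailstat
and the wording of the messages are made up for this example.
The maildrop location is taken from the
.Ev MAIL
variable if it is set; the fallback path is an assumption.
.DS
# mailstat: report whether the given maildrop contains mail
mailstat() {
	if [ -s "$1" ]; then
		echo "You have mail."
	else
		echo "No mail."
	fi
}
mailstat "${MAIL:-/var/mail/`id -un`}"
.DE
It only tests whether the maildrop file is non-empty and does
not distinguish between old and new mail.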
.P
.Pn msh
was removed because the tool was in conflict with the original
philosophy of MH. It provided an interactive shell to access the
features of MH. One major feature of MH is being a toolchest.
.Pn msh
was not just another shell, tailored to the needs of mail
handling, but one large program that had the MH tools built in.
Its main use was for accessing Bulletin Boards, which have ceased to
be popular. Removing
.Pn msh ,
together with the truly obsolete programs
.Pn vmh
and
.Pn wmh ,
saved more than 7\|000 lines of C code \(en a major achievement.


.H1 "Draft and Trash Folders
.U2 "Draft Folder
.P
Historically, MH provided exactly one draft message, named
.Fn draft
and
located in the MH directory. When starting to compose another message
before the former one was sent, the user was asked whether to use,
refile or replace the old draft. Working on multiple drafts at the same time
was impossible. One could only work on them alternately by refiling the
previous one to some directory and fetching some other one for reediting.
This manual draft management needed to be done each time the user wanted
to switch from editing one draft to editing another.
.P
To allow true parallel editing of drafts in a straightforward way, the
draft folder facility exists. It was introduced as early as July 1984
by Marshall T. Rose. The facility was deactivated by default.
Even in nmh, the draft folder facility remained deactivated by default.
At least, Richard Coleman added the man page
.Mp mh-draft(5)
to document
the feature well.
.P
The only advantage of not using the draft folder facility is the static
name of the draft file. This could be an issue for MH frontends like mh-e.
But as they likely want to support working on multiple drafts in parallel,
the issue concerns only compatibility. The aim of nmh to stay compatible
prevented the default activation of the draft folder facility.
.P
On the other hand, a draft folder is a much more natural concept than
a draft message. MH's mail storage consists of folders and messages,
the messages named with ascending numbers. A draft message breaks with this
concept by introducing a message in a file named
.Fn draft .
This draft
message is special. It cannot simply be listed with the available tools,
but instead requires special switches. That is, corner cases were
introduced. A draft folder, in contrast, does not introduce such
corner-cases. The available tools can operate on the messages within that
folder like on any messages within any mail folders. The only difference
is the fact that the default folder for
.Pn send
is the draft folder,
instead of the current folder, like for all other tools.
.P
The trivial part of the change was activating the draft folder facility
by default and setting a default name for this folder. Obviously, I chose
the name
.Fn +drafts .
This made the
.Sw \-draftfolder
and
.Sw \-draftmessage
switches useless, and I could remove them.
The more difficult part, but also the one that showed the real improvement,
was updating the tools to the new concept.
.Sw \-draft
switches could
be dropped, as operating on a draft message became indistinguishable from
operating on any other message for the tools.
.Pn comp
still has its
.Sw \-use
switch for switching between its two modes: (1) Compose a new
draft, possibly by taking some existing message as a form. (2) Modify
an existing draft. In either case, the behavior of
.Pn comp
is deterministic. There is no more need to query the user. I consider this
a major improvement. By making
.Pn send
simply operate on the current
message in the draft folder by default, with message and folder both
overridable by specifying them on the command line, it is now possible
to send a draft anywhere within the storage by simply specifying its folder
and name.
.P
All these changes converted special cases to regular cases, thus
simplifying the tools and increasing the flexibility.

.U2 "Trash Folder
.P
Similar to the situation for drafts is the situation for removed messages.
Historically, a message was deleted by renaming. A specific
\fIbackup prefix\fP, often comma (\c
.Fn , )
or hash (\c
.Fn # ),
was prepended to the file name. Thus, MH wouldn't recognize the file
as a message anymore, as only files whose names consist of digits only
are treated as messages. The removed messages remained as files in the
same directory and needed some maintenance job to truly delete them after
some grace time, usually by running a command similar to
.DS
find /home/user/Mail \-ctime +7 \-name ',*' | xargs rm
.DE
in a cron job. Within the grace time interval
the original message could be restored by stripping the
backup prefix from the file name. If, however, the last message of
a folder is removed \(en say message
.Fn 6
becomes file
.Fn ,6
\(en and a new message enters the same folder, thus being given the same
number again \(en in our case
.Fn 6
\(en, and if that one
is removed too, then the backup of the former message gets overwritten.
Thus, the ability to restore removed messages does not only depend on
the ``sweeping cron job'' but also on the removal of further messages.
This is undesirable, because the real mechanism is hidden from the user
and the consequences of further removals are not always obvious.
Furthermore, the backup files are scattered within the whole mail
storage, instead of being collected in one place.
.P
To improve the situation, the profile entry
.Pe rmmproc
(previously named
.Pe Delete-Prog )
was introduced very early.
It could be set to any command, which would take care of the mail removal
instead of taking the default action described above.
Refiling the to-be-removed files to some wastebin folder was a common
example. Nmh's man page
.Mp rmm(1)
proposes
.Cl "refile +d
to move messages to the wastebin and
.Cl "rm `mhpath +d all`
to empty the wastebin.
Managing the message removal this way is a sane approach. It keeps
the removed messages in one place, makes it easy to remove the backup
files, and, most important, enables the user to use the tools of MH
itself to operate on the removed messages. One can
.Pn scan
them,
.Pn show
them, and restore them with
.Pn refile .
There's no more
need to use
.Pn mhpath
to switch over from MH tools to Unix tools \(en MH can do it all itself.
.P
This approach matches perfectly with the concepts of MH, thus making
it powerful. Hence, I made it the default. And even more, I also
removed the old backup prefix approach, as it is clearly less powerful.
Keeping unused alternatives in the code is a bad choice, as they likely
gather bugs by not being constantly tested. Also, the increased code
size and additional conditions increase the maintenance costs. By strictly
converting to the trash folder approach, I simplified the code base.
.Pn rmm
calls
.Pn refile
internally to move the to-be-removed
message to the trash folder (\c
.Fn +trash
by default). Messages
there can be operated on like on any other message in the storage.
To sweep it clean, one can use
.Cl "rmm \-unlink +trash a" ,
where the
.Sw \-unlink
switch causes the files to be truly unlinked instead
of moved to the trash folder.
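.P
A typical session with the trash folder might look like the following sketch.
The message numbers and the folder
.Fn +inbox
are made up for the example, and the
.Sw \-src
switch of
.Pn refile
is assumed to work as it does in nmh.
.DS
rmm 6                             # move message 6 to +trash
scan +trash                       # list the removed messages
refile \-src +trash last +inbox    # restore the one removed last
rmm \-unlink +trash a              # finally delete all of them
.DE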


.H1 "MH Directory Split
.P
In MH and nmh, a personal setup consisted of two parts:
the MH profile, named
.Fn \&.mh_profile
and located directly in the user's home directory,
and the MH directory, where all the user's mail messages and also the personal
forms, scan formats, and other configuration files are stored. The location
of this directory could be chosen by the user. The default was to name it
.Fn Mail
and have it directly in the home directory.
.P
I've never liked the data storage and the configuration to be intermixed.
They are different kinds of data. One part is the messages,
which are the data to operate on. The other part is the personal
configuration files, which are able to change the behavior of the operations.
The actual operations are defined in the profile, however.
.P
When storing data, one should try to group data by its type.
There's sense in the Unix file system hierarchy, where configuration
files are stored separately (\c
.Fn /etc )
from the programs (\c
.Fn /bin
and
.Fn /usr/bin )
and from their sources (\c
.Fn /usr/src ).
Such separation eases the backup management, for instance.
.P
In mmh, I've reorganized the file locations.
Still there are two places:
There's the mail storage directory, which, like in MH, contains all the
messages, but, unlike in MH, nothing else.
Its location still is user-chosen, with the default name
.Fn Mail ,
in the user's home directory. This is very similar to the case in nmh.
The configuration files, however, are grouped together in the new directory
.Fn \&.mmh
in the user's home directory.
The user's profile now is a file, named
.Fn profile ,
in this mmh directory.
Consistently, the context file and all the personal forms, scan formats,
and the like, are also there.
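.P
As an illustration, a fresh mmh setup might then look like this.
The file name
.Fn context
and the example message in
.Fn +inbox
are assumptions made for the example.
.DS
$HOME/.mmh/profile      the user's profile
$HOME/.mmh/context      the context file
$HOME/Mail/inbox/1      a message in the mail storage
.DE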
.P
The naming changed with the relocation.
The directory where everything, except the profile, had been stored (\c
.Fn $HOME/Mail ),
used to be called \fIMH directory\fP. Now, this directory is called the
user's \fImail storage\fP. The name \fImmh directory\fP is now given to
the new directory
(\c
.Fn $HOME/.mmh ),
containing all the personal configuration files.
.P
The separation of the files by type of content is logical and convenient.
There are no functional differences as any possible setup known to me
can be implemented with both approaches, although likely a bit easier
with the new approach. The main goal of the change had been to provide
sensible storage locations for any type of personal mmh file.
.P
In order to have multiple MH setups, a user can use the
environment variable
.Ev MH
to point to a different profile file.
The MH directory (mail storage plus personal configuration files) is
defined by the
.Pe Path
profile entry.
The context file could be defined by the
.Pe context
profile entry or by the
.Ev MHCONTEXT
environment variable.
The latter is useful to have a distinct context (e.g. current folders)
in each terminal window, for instance.
In mmh, there are three environment variables now.
.Ev MMH
may be used to change the location of the mmh directory.
.Ev MMHP
and
.Ev MMHC
change the profile and context files, respectively.
Besides providing a more consistent feel (which simply is the result
of being designed anew), the set of personal configuration files can
now be chosen independently from the profile (including mail storage
location) and context. Whether or not this is relevant in practice, it
is an improvement. However, the main achievement is the
split between mail storage and personal configuration files.
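.P
For instance, a distinct context per terminal window could be set up
with a line like the following in the shell's startup file.
The file name pattern is made up, and that
.Ev MMHC
accepts an absolute path is an assumption here.
.DS
MMHC=$HOME/.mmh/context.$$; export MMHC
.DE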


.H1 "Path Notations
.P
foo

.H1 "Attachments
.P
foo

.H1 "mhshow to show Transition
.P
Since the very beginning, already in the first concept paper,
.Pn show
had been MH's mail display program.
.Pn show
determined the pathnames of the relevant messages and then invoked
.Pn mhl
to let it render the content.
With the advent of MIME, this approach wasn't sufficient anymore.
MIME messages can consist of multiple parts, some of which aren't
directly displayable, and text content can be encoded in
foreign charsets.
.Pn show 's
simple approach and
.Pn mhl 's
limited display facilities couldn't cope with the task any longer.
Instead of extending these tools, new ones were written from scratch
and then added to the MH toolchest. Doing so is encouraged by the
toolchest approach. The new tools could be added without interfering
with the existing ones. This is great. It allowed MH to be the
first MUA to implement MIME.
.P
The new MIME features were added in the form of the single program
.Pn mhn .
The command
.DS
mhn \-show 42
.DE
would show the MIME message numbered 42.
With the 1.0 release of nmh in February 1999, Richard Coleman finished
the split of
.Pn mhn
into a set of specialized programs, which together covered the
aspects of MIME. One of these resulting tools was
.Pn mhshow .


.H1 "Blind Carbon Copies
.P
foo

.H1 "Good Defaults
.P
foo

.H1 "Modularization
.P
foo

.H1 "Code style
.P
foo