view ch03.roff @ 58:814c33b96d89

Restructured the content in ch03.
author markus schnalke <meillo@marmaro.de>
date Fri, 01 Jun 2012 12:33:49 +0200
parents 49cf68506b5d
children 6a92e0208de0
1 .H0 "Discussion
2 .P
3 This main chapter discusses the practical work done in the mmh project.
4 It is structured along the goals to achieve. The concrete work done
5 is described in the examples of how the general goals were achieved.
10 .H1 "Streamlining
13 .H2 "Removal of non-MUA Tools
14 .P
15 MH had been considered an all-in-one system for mail handling.
16 The community around nmh has a similar understanding.
17 In fundamental contrast, I believe that mmh should be a MUA but
18 nothing more. I believe that all-in-one mail systems are not the way
19 to go. There are excellent specialized MTAs, like Postfix;
20 there are specialized MDAs, like Procmail; there are specialized
21 MRAs, like Fetchmail. I believe it's best to use them instead of
22 providing the same function ourselves. Doing something well requires
23 focusing on one particular aspect or a small set of aspects. The more
24 it is possible to focus, the better the result in this particular
25 area will be. The limiting resource in Free Software community development
26 usually is human power. If the scarce development power is split
27 across multiple development areas, it will hardly be possible to
28 compete with the specialists in the various fields. This problem is
29 aggravated by the small community \(en developers and users \(en
30 that MH-based mail systems have. In consequence, I believe that the
31 available resources should be concentrated at the point where MH is
32 most unique. This is clearly the MUA part.
33 .P
34 Several of nmh's tools were removed from mmh because they didn't
35 contribute to the MUA's task, which is the main focus.
36 .P
37 .Pn conflict
38 was removed because it is a mail system maintenance tool.
39 Besides, it also checks the
40 .Fn /etc/passwd
41 and
42 .Fn /etc/group
43 files.
44 The tool might be useful, but it should not be shipped with mmh.
45 .P
46 .Pn rcvtty
47 was removed because its use case of writing to the user's terminal
48 on receipt of mail is hardly wanted today. If users like to be
49 informed of new mail, then using the shell's
50 .Ev MAILPATH
51 variable or different (graphical) notifications are likely more
52 appealing. If one nonetheless wants messages written directly to the
53 terminal, the standard tool
54 .Pn write
55 can be used in a way similar to:
56 .DS
57 scan -file - | write `id -un`
58 .DE
59 .P
60 When the new attachment system was introduced,
61 .Pn viamail
62 was removed because then
63 .Pn forw
64 could cover the task itself.
65 The wrapper program
66 .Pn sendfiles
67 was rewritten as a shell script to use
68 .Pn forw .
69 .P
70 .Pn msgchk
71 was removed as it became hardly useful when POP support was removed.
72 It is questionable if
73 .Pn msgchk
74 provides more information than:
75 .DS
76 ls -l /var/mail/meillo
77 .DE
78 It does distinguish between old and new mail, but that's not very
79 useful and can be found out with
80 .Pn stat (1)
81 too. A very small shell script could take care of the output format.
82 As mmh's inc only incorporates mail from the user's local maildrop
83 and thus no long data transfers are involved,
84 there's no need to check for new mail before incorporating it.
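.P
As an illustration of how small such a script can be, here is a
minimal sketch. The function name and the wording of the messages are
made up for this example, and the default maildrop path is an assumption.
It only tests whether the maildrop file is non-empty and does not try
to tell old mail from new.

```shell
# Hypothetical msgchk stand-in: report whether a maildrop holds mail.
# Usage: mailcheck [maildrop]   (the default path is an assumption)
mailcheck() {
    drop=${1:-/var/mail/$(id -un)}
    if [ -s "$drop" ]; then
        # wc -c counts the bytes waiting in the maildrop
        size=$(wc -c <"$drop" | tr -d ' ')
        echo "You have mail ($size bytes) in $drop"
    else
        echo "No mail in $drop"
    fi
}
```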
85 .P
86 .Pn msh
87 was removed because the tool was in conflict with the original
88 philosophy of MH. It provided an interactive shell to access the
89 features of MH. One major feature of MH is being a tool chest.
90 .Pn msh
91 wouldn't be just another shell, tailored to the needs of mail
92 handling, but one large program with the MH tools built in.
93 Its main use was for accessing Bulletin Boards, which have ceased to
94 be popular. Removing
95 .Pn msh ,
96 together with the truly obsolete programs
97 .Pn vmh
98 and
99 .Pn wmh ,
100 saved more than 7\|000 lines of C code \(en a major achievement.
102 .U2 "Removal of the MTS
103 .P
106 .H2 "mhshow show Merge
107 .P
108 Since the very beginning, already in the first concept paper,
109 .Pn show
110 had been MH's mail display program.
111 .Pn show
112 determined the pathnames of the relevant messages and then invoked
113 .Pn mhl
114 to render the content.
115 With the advent of MIME, this approach wasn't sufficient anymore.
116 MIME messages can consist of multiple parts, some of which aren't
117 directly displayable, and text content can be encoded in
118 foreign charsets.
119 .Pn show 's
120 simple approach and
121 .Pn mhl 's
122 limited display facilities couldn't cope with the task any longer.
123 Instead of extending these tools, new ones were written from scratch
124 and then added to the MH tool chest. Doing so is encouraged by the
125 tool chest approach. The new tools could be added without interfering
126 with the existing ones. This is great. It allowed MH to be the
127 first MUA to implement MIME.
128 .P
129 The new MIME features were added in the form of a single program
130 .Pn mhn .
131 The command
132 .DS
133 mhn \-show 42
134 .DE
135 would show the MIME message numbered 42.
136 With the 1.0 release of nmh in February 1999, Richard Coleman finished
137 the split of
138 .Pn mhn
139 into a set of specialized programs, which together covered the
140 aspects of MIME. One of these resulting tools was
141 .Pn mhshow .
144 .H2 "Removal of Configure Options
145 .P
147 .H2 "Removal of switches
148 .P
153 .H1 "Modernizing
156 .H2 "Removal of Code Relics
157 .P
158 The code base of mmh originates from the late Seventies,
159 had been extensively
160 worked on in the mid Eighties, and had been partly reorganized and extended
161 in the Nineties. Relics of all those times had gathered in the code base.
162 My goal was to remove any ancient code parts. One part of the task was
163 converting obsolete code constructs to standard constructs, the other part
164 was dropping obsolete functions.
165 .P
166 As I'm not even thirty years old and have no more than seven years of
167 Unix experience, I needed to learn about the history in retrospect.
168 Older people likely have used those ancient constructs themselves
169 and have suffered from their incompatibilities and have longed for
170 standardization. Unfortunately, I have only read that others had done so.
171 This put me in a much more difficult position when working on the old
172 code. I needed to research what others would have known by heart from
173 experience. All my programming experience comes from a time after ANSI C
174 and after POSIX. Although I knew about the times before, I took the
175 current state implicitly for granted most of the time.
176 .P
177 Being aware of
178 these facts, I preferred to let people with more historic experience solve the
179 task of converting the ancient code constructs to standardized ones.
180 Luckily, Lyndon Nerenberg focused on this task at the nmh project.
181 He converted large parts of the code to POSIX constructs, removing
182 the conditional compilation for now-standardized features.
183 I'm thankful for this task being solved. I only pulled the changes into
184 mmh.
185 .P
186 The other task \(en dropping ancient functionality to remove old code \(en
187 I did myself, though. My position to strip mmh to the bare minimum of
188 frequently used features is much more radical than the nmh community
189 likes. Without the need to justify my decisions, I was able to quickly
190 remove functionality I considered ancient.
191 The need to discuss my decisions with
192 peers likely would have slowed this process down. Of course, I researched
193 if a particular feature really should be dropped. Never having had any
194 contact with a feature in my computing life was a first indicator to
195 drop it, but I also asked others and searched the literature for modern
196 usage of the feature. If it appeared to be truly ancient, I dropped it.
197 The reason for dropping is always part of the commit message in the
198 version control system. Thus, it is easy for others to check their
199 view on the topic with mine and possibly to argue for reinclusion.
201 .U2 "MMDF maildrop support
202 .P
203 I dropped all support for the MMDF maildrop format. This type of format
204 is conceptually similar to the mbox format, but uses four bytes with
205 value 1 (\fL^A^A^A^A\fP) as message delimiter,
206 instead of the string ``\fLFrom\ \fP''.
207 Due to the similarity and mbox being the de-facto standard maildrop
208 format on Unix, but also due to the larger influence of Sendmail than MMDF,
209 the MMDF maildrop format had vanished.
210 .P
211 The simplifications within the code were only moderate. Switches could
212 be removed from tools like
213 .L packf ,
214 which generate packed mailboxes. Only one packed mailbox format remained:
215 mbox.
216 The most important changes affect the equally named mail parsing routine in
217 .L sbr/m_getfld.c .
218 The direct MMDF code has been removed, but as now only one packed mailbox
219 format is left, code structure simplifications are likely possible.
220 The reason why they are still outstanding is the heavily optimized code
221 of
222 .Fu m_getfld() .
223 Changes beyond a small local scope \(en
224 which restructuring inherently is \(en carry a high risk of damaging
225 the intricate workings of the optimized code. This problem is known
226 to the developers of nmh, too. They also avoid touching this minefield
227 if possible.
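.P
The difference between the two delimiters can be illustrated with a
small sketch that counts the messages in a packed mailbox.
The function names are hypothetical and the code is not taken from mmh.
The sketch assumes that an MMDF maildrop encloses each message in a
pair of delimiter lines, one before and one after the message, and that
lines starting with ``\fLFrom\ \fP'' inside mbox message bodies are quoted.

```shell
# Count messages in an mbox file: each message starts with a "From " line.
count_mbox() {
    grep -c '^From ' "$1" || true
}

# Count messages in an MMDF file: each message is enclosed by two
# lines consisting of four bytes with value 1 (^A^A^A^A).
count_mmdf() {
    delim=$(printf '\001\001\001\001')
    lines=$(grep -c "^${delim}\$" "$1" || true)
    echo $((lines / 2))
}
```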
229 .U2 "UUCP Bang Paths
230 .P
231 More questionable than the former topic is the removal of support for the
232 UUCP bang path address style. However, the user may translate the bang
233 paths on retrieval to Internet addresses and the other way on posting
234 messages. The former can be done by an MDA like procmail; the latter
235 by a sendmail wrapper. This would ensure that any address handling would
236 work as expected. However, it might just work well without any
237 such modifications, as mmh does not touch addresses much, in general.
238 But I can't guarantee this, as I have never used an environment with bang paths.
239 Also, the behavior might break at any point in further development.
241 .U2 "Hardcopy terminal support
242 .P
243 More of a funny anecdote is that a check for printing to a
244 hardcopy terminal remained until Spring 2012, when I finally removed it.
245 I surely would be very happy to see such a terminal in action, maybe
246 actually being able to work on it, but I fear my chances are nil.
247 .P
248 The check only prevented a pager from being placed between the outputting
249 program (\c
250 .Pn mhl )
251 and the terminal. The same could have been achieved statically with
252 the
253 .Sw \-nomoreproc
254 switch on the command line, too.
256 .U2 "Removed support for header fields
257 .P
258 The `Encrypted' header had been introduced by RFC\^822, but already
259 marked legacy in RFC\^2822. It was superseded by FIXME.
260 Mmh no longer supports this header.
261 .P
262 Native support for `Face' headers
263 has been removed, as well.
264 The feature is similar to the `X-Face' header in its intent,
265 but takes a different approach to store the image.
266 Instead of encoding the image data directly into the header,
267 the header contains the hostname and UDP port where the image
268 data could be retrieved.
269 Neither `X-Face' nor the here described `Face' system
270 \**
271 .FS
272 There is also a newer but different system, invented in 2005,
273 using `Face' headers.
274 It is the successor of `X-Face' providing colored PNG images.
275 .FE
276 became widely used.
277 It's still possible to use a Face system,
278 although mmh does not provide support for any of the different systems
279 anymore. It's fairly easy to write a small shell script to
280 extract the embedded or fetch the external Face data and display the image.
281 One's own Face headers can be added to the draft template files.
282 .P
283 `Content-MD5' headers were introduced by RFC\^1864. They provide only
284 detection of data corruption during the transfer. By no means can
285 they ensure verbatim end-to-end delivery of the contents. This is clearly
286 stated in the RFC. The proper approach to provide verifiability
287 of content in an end-to-end relationship is the use of digital cryptography
288 (RFCs FIXME). On the other hand, transfer protocols should ensure the
289 integrity of the transmission. In combination, these two approaches
290 make the `Content-MD5' header field useless. In consequence, I removed
291 the support for it. By this removal, MD5 computation is not needed
292 anywhere in mmh. Hence, over 500 lines of code were removed by this one
293 change. Even if the `Content-MD5' header field is useful sometimes,
294 I value its usefulness less than the improvement in maintainability caused
295 by the removal.
297 .U2 "Prompter's Control Keys
298 .P
299 The program
300 .Pn prompter
301 queries the user to fill in a message form. When used by
302 .Pn comp
303 as:
304 .DS
305 comp \-editor prompter
306 .DE
307 the resulting behavior is similar to
308 .Pn mailx .
309 Apparently,
310 .Pn prompter
311 hadn't been touched lately. Otherwise it's hardly explainable why it
312 still offered the switches
313 .Sn \-erase \fUchr\fP
314 and
315 .Sn \-kill \fUchr\fP
316 to name the characters for command line editing.
317 The times when this had been necessary are long gone.
318 Today these things work out-of-the-box, and if not, are configured
319 with the standard tool
320 .Pn stty .
322 .U2 "Vfork and Retry Loops
323 .P
324 MH creates many processes, which is a consequence of the tool chest approach.
325 In earlier times
326 .Fu fork()
327 had been an expensive system call, as the process's whole image needed
328 to be duplicated. One common case is replacing the image with
329 .Fu exec()
330 right after having forked the child process.
331 To speed up this case, the
332 .Fu vfork()
333 system call was invented at Berkeley. It completely omits copying the
334 image. If the image gets replaced right afterwards then unnecessary
335 work is omitted. On old systems this results in large speedups.
336 MH uses
337 .Fu vfork()
338 whenever possible.
339 .P
340 Memory management units that support copy-on-write semantics make
341 .Fu fork()
342 almost as fast as
343 .Fu vfork()
344 in the cases when they can be exchanged.
345 With
346 .Fu vfork()
347 being more error-prone and hardly faster, it's preferable to simply
348 use
349 .Fu fork()
350 instead.
351 .P
352 Related to the costs of
353 .Fu fork()
354 is the probability of its success.
355 Today on modern systems, the system call will succeed almost always.
356 In the Eighties on heavily loaded systems, as they were common at
357 universities, this had been different. Thus, many of the
358 .Fu fork()
359 calls were wrapped into loops to retry to fork several times in
360 short intervals, in case of previous failure.
361 In mmh, the program aborts at once if the fork fails.
362 The user can then reexecute the command. This is expected to be a
363 very rare case on modern systems, especially personal ones, which are
364 common today.
367 .H2 "Attachments
368 .P
369 MIME
372 .H2 "Digital Cryptography
373 .P
374 Signing and encryption.
377 .H2 "Good Defaults
378 .P
379 foo
384 .H1 "Code style
385 .P
386 foo
389 .H2 "Standard Code
390 .P
391 POSIX
394 .H2 "Separation
396 .U2 "MH Directory Split
397 .P
398 In MH and nmh, a personal setup consisted of two parts:
399 the MH profile, named
400 .Fn \&.mh_profile
401 and located directly in the user's home directory,
402 and the MH directory, where all the user's mail messages and personal
403 forms, scan formats, and other configuration files are stored. The location
404 of this directory could be user-chosen. The default was to name it
405 .Fn Mail
406 and have it directly in the home directory.
407 .P
408 I've never liked the data storage and the configuration to be intermixed.
409 They are different kinds of data. One part is the messages,
410 which are the data to operate on. The other part is the personal
411 configuration files, which are able to change the behavior of the operations.
412 The actual operations are defined in the profile, however.
413 .P
414 When storing data, one should try to group data by its type.
415 There's sense in the Unix file system hierarchy, where configuration
416 files are stored separately (\c
417 .Fn /etc )
418 from the programs (\c
419 .Fn /bin
420 and
421 .Fn /usr/bin )
422 and from their sources (\c
423 .Fn /usr/src ).
424 Such separation eases the backup management, for instance.
425 .P
426 In mmh, I've reorganized the file locations.
427 There are still two places:
428 There's the mail storage directory, which, like in MH, contains all the
429 messages, but, unlike in MH, nothing else.
430 Its location still is user-chosen, with the default name
431 .Fn Mail ,
432 in the user's home directory. This is much like the situation in nmh.
433 The configuration files, however, are grouped together in the new directory
434 .Fn \&.mmh
435 in the user's home directory.
436 The user's profile now is a file, named
437 .Fn profile ,
438 in this mmh directory.
439 Consistently, the context file and all the personal forms, scan formats,
440 and the like, are also there.
441 .P
442 The naming changed with the relocation.
443 The directory where everything, except the profile, had been stored (\c
444 .Fn $HOME/Mail ),
445 used to be called \fIMH directory\fP. Now, this directory is called the
446 user's \fImail storage\fP. The name \fImmh directory\fP is now given to
447 the new directory
448 (\c
449 .Fn $HOME/.mmh ),
450 containing all the personal configuration files.
451 .P
452 The separation of the files by type of content is logical and convenient.
453 There are no functional differences as any possible setup known to me
454 can be implemented with both approaches, although likely a bit easier
455 with the new approach. The main goal of the change had been to provide
456 sensible storage locations for any type of personal mmh file.
457 .P
458 To have multiple MH setups, a user can set the
459 environment variable
460 .Ev MH
461 to point to a different profile file.
462 The MH directory (mail storage plus personal configuration files) is
463 defined by the
464 .Pe Path
465 profile entry.
466 The context file could be defined by the
467 .Pe context
468 profile entry or by the
469 .Ev MHCONTEXT
470 environment variable.
471 The latter is useful to have a distinct context (e.g. current folders)
472 in each terminal window, for instance.
473 In mmh, there are three environment variables now.
474 .Ev MMH
475 may be used to change the location of the mmh directory.
476 .Ev MMHP
477 and
478 .Ev MMHC
479 change the profile and context files, respectively.
480 Besides providing a more consistent feel (which simply is the result
481 of being designed anew), the set of personal configuration files can
482 be chosen independently from the profile (including mail storage location)
483 and context, now. Whether or not this is relevant in practice, it
484 is de facto an improvement. However, the main achievement is the
485 split between mail storage and personal configuration files.
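.P
The lookup order of the three mmh variables can be sketched as follows.
This is not mmh's actual code. The function name is hypothetical,
the file name
.Fn context
for the context file is an assumption, and the sketch ignores how
relative path names would be resolved.

```shell
# Print the mmh directory, the profile and the context file, one per line.
# MMH, MMHP and MMHC override the defaults; the default name "context"
# is an assumption of this sketch.
mmh_paths() {
    mmhdir=${MMH:-$HOME/.mmh}
    profile=${MMHP:-$mmhdir/profile}
    context=${MMHC:-$mmhdir/context}
    printf '%s\n' "$mmhdir" "$profile" "$context"
}
```

Setting only
.Ev MMHC
in one terminal window, for instance, would give that window its own
context while the profile and the mmh directory stay shared.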
488 .H2 "Modularization
489 .P
490 whatnowproc
491 .P
492 The \fIMH library\fP
493 .Fn libmh.a
494 collects a bunch of standard functions that many of the MH tools need,
495 like reading the profile or context files.
496 This doesn't hurt the separation.
499 .H2 "Style
500 .P
501 Code layout, goto, ...
506 .H1 "Concept Exploitation/Homogeneity
509 .H2 "Draft Folder
510 .P
511 Historically, MH provided exactly one draft message, named
512 .Fn draft
513 and
514 being located in the MH directory. When starting to compose another message
515 before the former one was sent, the user was asked whether to use,
516 refile or replace the old draft. Working on multiple drafts at the same time
517 was impossible. One could only work on them in alternation by refiling the
518 previous one to some directory and fetching some other one for reediting.
519 This manual draft management needed to be done each time the user wanted
520 to switch between editing one draft to editing another.
521 .P
522 To allow true parallel editing of drafts, in a straightforward way, the
523 draft folder facility exists. It had been introduced already in July 1984
524 by Marshall T. Rose. The facility was deactivated by default.
525 Even in nmh, the draft folder facility remained deactivated by default.
526 At least, Richard Coleman added the man page
527 .Mp mh-draft(5)
528 to document
529 the feature well.
530 .P
531 The only advantage of not using the draft folder facility is the static
532 name of the draft file. This could be an issue for MH frontends like mh-e.
533 But as they likely want to support working on multiple drafts in parallel,
534 the issue concerns only compatibility. The aim of nmh to stay compatible
535 prevented the default activation of the draft folder facility.
536 .P
537 On the other hand, a draft folder is a much more natural concept than
538 a draft message. MH's mail storage consists of folders and messages,
539 the messages named with ascending numbers. A draft message breaks with this
540 concept by introducing a message in a file named
541 .Fn draft .
542 This draft
543 message is special. It cannot simply be listed with the available tools,
544 but instead requires special switches. That is, corner cases were
545 introduced. A draft folder, in contrast, does not introduce such
546 corner cases. The available tools can operate on the messages within that
547 folder like on any message within any mail folder. The only difference
548 is the fact that the default folder for
549 .Pn send
550 is the draft folder,
551 instead of the current folder, like for all other tools.
552 .P
553 The trivial part of the change was activating the draft folder facility
554 by default and setting a default name for this folder. Obviously, I chose
555 the name
556 .Fn +drafts .
557 This made the
558 .Sw \-draftfolder
559 and
560 .Sw \-draftmessage
561 switches useless, and I could remove them.
562 The more difficult part, but also the one that showed the real improvement,
563 was updating the tools to the new concept.
564 .Sw \-draft
565 switches could
566 be dropped, as operating on a draft message became indistinguishable from
567 operating on any other message for the tools.
568 .Pn comp
569 still has its
570 .Sw \-use
571 switch for switching between its two modes: (1) Compose a new
572 draft, possibly by taking some existing message as a form. (2) Modify
573 an existing draft. In either case, the behavior of
574 .Pn comp
575 is deterministic. There is no more need to query the user. I consider this
576 a major improvement. By making
577 .Pn send
578 simply operate on the current
579 message in the draft folder by default, with message and folder both
580 overridable by specifying them on the command line, it is now possible
581 to send a draft anywhere within the storage by simply specifying its folder
582 and name.
583 .P
584 All these changes converted special cases to regular cases, thus
585 simplifying the tools and increasing the flexibility.
588 .H2 "Trash Folder
589 .P
590 Similar to the situation for drafts is the situation for removed messages.
591 Historically, a message was deleted by renaming: a specific
592 \fIbackup prefix\fP, often comma (\c
593 .Fn , )
594 or hash (\c
595 .Fn # ),
596 was prepended to the file name. Thus, MH wouldn't recognize the file
597 as a message anymore, as only files whose name consists of digits only
598 are treated as messages. The removed messages remained as files in the
599 same directory and needed some maintenance job to truly delete them after
600 some grace time. Usually, by running a command similar to
601 .DS
602 find /home/user/Mail \-ctime +7 \-name ',*' | xargs rm
603 .DE
604 in a cron job. Within the grace time interval
605 the original message could be restored by stripping
606 the backup prefix from the file name. If, however, the last message of
607 a folder is removed \(en say message
608 .Fn 6
609 becomes file
610 .Fn ,6
611 \(en and a new message enters the same folder, thus the same
612 number being assigned again \(en in our case
613 .Fn 6
614 \(en, if that one
615 is removed too, then the backup of the former message gets overwritten.
616 Thus, the ability to restore removed messages does not only depend on
617 the ``sweeping cron job'' but also on the removing of further messages.
618 This is undesirable, because the real mechanism is hidden from the user
619 and the consequences of further removals are not always obvious.
620 Furthermore, the backup files are scattered within the whole mail
621 storage, instead of being collected at one place.
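.P
The overwriting can be reproduced with plain Unix tools.
The following sketch imitates removal by backup prefix with a
hypothetical helper function; it is not the code MH used, and it works
in a scratch directory instead of a real mail folder.

```shell
# Imitate removal by backup prefix: rename message N to ,N
# (hypothetical helper; comma serves as the backup prefix).
prefix_rmm() {
    mv -f "$1/$2" "$1/,$2"
}

# Show in a scratch folder how an earlier backup gets lost.
demo_backup_loss() {
    folder=$(mktemp -d) || return 1
    echo "first message" >"$folder/6"
    prefix_rmm "$folder" 6               # 6 becomes ,6
    echo "second message" >"$folder/6"   # the number 6 is reused
    prefix_rmm "$folder" 6               # replaces the earlier ,6
    cat "$folder/,6"                     # only one backup is left
    rm -rf "$folder"
}
```

Calling the demo function prints the text of the second message;
the backup of the first one is gone without any warning.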
622 .P
623 To improve the situation, the profile entry
624 .Pe rmmproc
625 (previously named
626 .Pe Delete-Prog )
627 was introduced, very early.
628 It could be set to any command, which would take care of the mail removal
629 instead of taking the default action, described above.
630 Refiling the to-be-removed files to some garbage folder was a common
631 example. Nmh's man page
632 .Mp rmm(1)
633 proposes
634 .Cl "refile +d
635 to move messages to the garbage folder and
636 .Cl "rm `mhpath +d all`
637 to empty the garbage folder.
638 Managing the message removal this way is a sane approach. It keeps
639 the removed messages in one place, makes it easy to remove the backup
640 files, and, most important, enables the user to use the tools of MH
641 itself to operate on the removed messages. One can
642 .Pn scan
643 them,
644 .Pn show
645 them, and restore them with
646 .Pn refile .
647 There's no more
648 need to use
649 .Pn mhpath
650 to switch over from MH tools to Unix tools \(en MH can do it all itself.
651 .P
652 This approach matches perfectly with the concepts of MH, thus making
653 it powerful. Hence, I made it the default. And even more, I also
654 removed the old backup prefix approach, as it is clearly less powerful.
655 Keeping unused alternatives in the code is a bad choice as they likely
656 gather bugs, by not being constantly tested. Also, the increased code
657 size and more conditions increase the maintenance costs. By strictly
658 converting to the trash folder approach, I simplified the code base.
659 .Pn rmm
660 calls
661 .Pn refile
662 internally to move the to-be-removed
663 message to the trash folder (\c
664 .Fn +trash
665 by default). Messages
666 there can be operated on like on any other message in the storage.
667 To sweep it clean, one can use
668 .Cl "rmm \-unlink +trash a" ,
669 where the
670 .Sw \-unlink
671 switch causes the files to be truly unlinked instead
672 of moved to the trash folder.
675 .H2 "Path Notations
676 .P
677 foo
680 .H2 "MIME Integration
681 .P
682 user-visible access to whole messages and MIME parts are inherently
683 different
686 .H2 "Of One Cast
687 .P