view ch03.roff @ 22:99409e4546d2

Wrote about the removal of tools.
author markus schnalke <meillo@marmaro.de>
date Mon, 07 May 2012 17:27:57 +0200
1 .H0 "Work Report
2 .P
3 foo
4 .P
5 bar
7 .H1 "Removal of Code Relics
8 .P
9 The code base of mmh originates from the late 70s, was extensively
10 worked on in the mid 80s, and was partly reorganized and extended
11 in the 90s. Relics of all those times had accumulated in the code base.
12 My goal was to remove these ancient code parts. One part of the task was
13 converting obsolete code constructs to standard constructs, the other part
14 was dropping obsolete functions.
15 .P
16 As I'm not even thirty years old and have no more than seven years of
17 Unix experience, I needed to learn about the history in retrospect.
18 Older people have likely used those ancient constructs themselves,
19 have suffered from their incompatibilities, and have longed for
20 standardization. Unfortunately, I have only read that others had done so.
21 This put me in a much more difficult position when working on the old
22 code. I needed to research what others would have known by heart from
23 experience. All my programming experience comes from a time after ANSI C
24 and after POSIX. Although I knew about the times before, I took the
25 current state implicitly for granted most of the time.
26 .P
27 Being aware of
28 these facts, I preferred to let people with more historic experience solve the
29 task of converting the ancient code constructs to standardized ones.
30 Luckily, Lyndon Nerenberg focused on this task at the nmh project.
31 He converted large parts of the code to POSIX constructs, removing
32 the conditional compilation for now-standardized features.
33 I'm thankful for this task being solved. I only pulled the changes into
34 mmh.
35 .P
36 The other task \(en dropping ancient functionality to remove old code \(en
37 I did myself, though. My position to strip mmh to the bare minimum of
38 frequently used features is much more revolutionary than the nmh community
39 likes. Without the need to justify my decisions, I was able to quickly
40 remove functionality I considered ancient.
41 The need to discuss my decisions with
42 peers likely would have slowed this process down. Of course, I researched
43 whether a particular feature really should be dropped. Never having had any
44 contact with a feature in my computing life was a first indicator to
45 drop it, but I also asked others and searched the literature for modern
46 usage of the feature. If it appeared to be truly ancient, I dropped it.
47 The reason for dropping is always part of the commit message in the
48 version control system. Thus, it is easy for others to compare their
49 view on the topic with mine and possibly to argue for reinclusion.
51 .U2 "MMDF maildrop support
52 .P
53 I dropped all support for the MMDF maildrop format. This type of format
54 is conceptually similar to the mbox format, but uses four bytes with
55 value 1 (\fL^A^A^A^A\fP) as message delimiter,
56 instead of the string ``\fLFrom\ \fP''.
57 Due to the similarity and mbox being the de-facto standard maildrop
58 format on Unix, but also due to the larger influence of Sendmail than MMDF,
59 the MMDF maildrop format has vanished.
60 .P
61 The simplifications within the code were only moderate. Switches could
62 be removed from tools like
63 .L packf ,
64 which generate packed mailboxes. Only one packed mailbox format remained:
65 mbox.
66 The most important changes affect the identically named mail parsing routine in
67 .L sbr/m_getfld.c .
68 The direct MMDF code has been removed, but as now only one packed mailbox
69 format is left, further simplifications of the code structure are likely possible.
70 They are still outstanding because of the heavily optimized code
71 of
72 .Fu m_getfld() .
73 Changes beyond a small local scope \(en
74 which restructuring inherently is \(en carry a high risk of damaging
75 the intricate workings of the optimized code. This problem is known
76 to the developers of nmh, too. They also avoid touching this minefield
77 if possible.
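.P
To illustrate the delimiter described above, here is a minimal sketch
(not code from mmh) of a shell function that splits an MMDF maildrop
into numbered message files. It assumes that every message is enclosed
by an opening and a closing delimiter line of exactly four bytes with
value 1. The name \fLmmdf_split\fP is made up for this example.

```shell
# Sketch only: split an MMDF maildrop into files 1, 2, 3, ... in a directory.
# Assumption: every message is enclosed by two delimiter lines of four ^A bytes.
mmdf_split() {
    maildrop=$1
    outdir=$2
    mkdir -p "$outdir" || return 1
    awk -v dir="$outdir" '
        BEGIN { delim = "\001\001\001\001" }
        $0 == delim {
            if (inmsg) { close(out); inmsg = 0 }      # closing delimiter
            else { n++; out = dir "/" n; inmsg = 1 }  # opening delimiter
            next
        }
        inmsg { print > out }                         # line of the current message
    ' "$maildrop"
}
```

Called as \fLmmdf_split maildrop outdir\fP, it leaves one file per
message in \fLoutdir\fP, named like MH messages.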
79 .U2 "UUCP Bang Paths
80 .P
81 More questionable than the former topic is the removal of support for the
82 UUCP bang path address style. However, the user may translate the bang
83 paths on retrieval to Internet addresses and the other way on posting
84 messages. The former can be done by an MDA like procmail; the latter
85 by a sendmail wrapper. This would ensure that any address handling would
86 work as expected. However, it might just work well without any
87 such modifications, as mmh does not touch addresses much, in general.
88 But I can't guarantee it, as I have never used an environment with bang paths.
89 Also, the behavior might break at any point in further development.
91 .U2 "Hardcopy terminal support
92 .P
93 More of a funny anecdote is the survival of a check for printing to a
94 hardcopy terminal until spring 2012, when I finally removed it.
95 I surely would be very happy to see such a terminal in action, maybe
96 even to work on one, but I fear my chances are nil.
97 .P
98 The check only prevented a pager from being placed between the outputting
99 program (\c
100 .Pn mhl )
101 and the terminal. The same could have been achieved statically with
102 the
103 .Sw \-nomoreproc
104 switch on the command line, too.
106 .U2 "Removed support for header fields
107 .P
108 The `Encrypted' header had been introduced by RFC\^822, but was already
109 marked as legacy in RFC\^2822. It was superseded by FIXME.
110 Mmh no longer supports this header.
111 .P
112 Native support for `Face' headers
113 has been removed, as well.
114 The feature is similar to the `X-Face' header in its intent,
115 but takes a different approach to store the image.
116 Instead of encoding the image data directly into the header,
117 the header contains the hostname and UDP port where the image
118 data could be retrieved.
119 Neither `X-Face' nor the `Face' system described here
120 \**
121 .FS
122 There is also a newer but different system, invented in 2005,
123 using `Face' headers.
124 It is the successor of `X-Face' providing colored PNG images.
125 .FE
126 became widely used.
127 It's still possible to use a Face system,
128 although mmh does not provide support for any of the different systems
129 anymore. It's fairly easy to write a small shell script to
130 extract the embedded or fetch the external Face data and display the image.
131 One's own Face headers can be added to the draft template files.
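.P
As a rough sketch of the extraction step only, the following shell
function prints the data of the first `X-Face' header of a message file,
joining folded continuation lines. The function name \fLxface_of\fP is
made up, and it is an assumption that all whitespace within the header
value is mere folding and may be dropped. Decoding and displaying the
image is left to external tools.

```shell
# Sketch only: print the X-Face data of a message, unfolded.
# Assumption: whitespace inside the header value is only folding.
xface_of() {
    awk '
        /^$/ { exit }                             # blank line ends the header
        /^[ \t]/ { if (hit) val = val $0; next }  # folded continuation line
        { hit = 0 }                               # any other header field starts
        !found && tolower($0) ~ /^x-face:/ {
            hit = 1; found = 1
            val = substr($0, 8)                   # text after "X-Face:"
        }
        END { if (found) { gsub(/[ \t]/, "", val); print val } }
    ' "$1"
}
```

The output could then be piped into whatever decoder and image viewer
are installed locally.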
132 .P
133 `Content-MD5' headers were introduced by RFC\^1864. They only allow
134 detecting data corruption during the transfer. By no means can
135 they ensure verbatim end-to-end delivery of the contents. This is clearly
136 stated in the RFC. The proper approach to provide verifiability
137 of content in an end-to-end relationship is the use of digital cryptography
138 (RFCs FIXME). On the other hand, transfer protocols should ensure the
139 integrity of the transmission. In combination, these two approaches
140 make the `Content-MD5' header field useless. In consequence, I removed
141 the support for it. With this removal, MD5 computation is no longer needed
142 anywhere in mmh. Hence, over 500 lines of code were removed by this one
143 change. Even if the `Content-MD5' header field is useful sometimes,
144 I value its usefulness less than the improvement in maintainability gained
145 by the removal.
147 .U2 "Prompter's Control Keys
148 .P
149 The program
150 .Pn prompter
151 queries the user to fill in a message form. When used by
152 .Pn comp
153 as:
154 .DS
155 comp \-editor prompter
156 .DE
157 the resulting behavior is similar to
158 .Pn mailx .
159 Apparently,
160 .Pn prompter
161 hadn't been touched lately. Otherwise it's hard to explain why it
162 still offered the switches
163 .Sn \-erase \fUchr\fP
164 and
165 .Sn \-kill \fUchr\fP
166 to name the characters for command line editing.
167 The times when this was necessary are long gone.
168 Today these things work out-of-the-box, and if not, are configured
169 with the standard tool
170 .Pn stty .
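.P
For illustration, a minimal sketch of setting the erase and kill
characters once per terminal with
.Pn stty ,
instead of passing them to each program. The function name
\fLset_edit_keys\fP and the chosen characters are only examples.

```shell
# Sketch only: set the erase and kill characters for the current terminal.
# Does nothing if standard input is not a terminal.
set_edit_keys() {
    [ -t 0 ] || return 0
    stty erase "$1" kill "$2"
}

set_edit_keys '^H' '^U'    # backspace erases a character, Ctrl-U kills the line
```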
172 .U2 "Vfork and Retry Loops
173 .P
174 MH creates many processes, which is a consequence of the toolchest approach.
175 In earlier times
176 .Fu fork()
177 was an expensive system call, as the process's whole image needed
178 to be duplicated. One common case is replacing the image with
179 .Fu exec()
180 right after having forked the child process.
181 To speed up this case, the
182 .Fu vfork()
183 system call was invented at Berkeley. It completely omits copying the
184 image. If the image gets replaced right afterwards, then unnecessary
185 work is omitted. On old systems this results in large speedups.
186 MH uses
187 .Fu vfork()
188 whenever possible.
189 .P
190 Memory management units that support copy-on-write semantics make
191 .Fu fork()
192 almost as fast as
193 .Fu vfork()
194 in the cases when they can be exchanged.
195 With
196 .Fu vfork()
197 being more error-prone and hardly faster, it's preferable to simply
198 use
199 .Fu fork()
200 instead.
201 .P
202 Related to the costs of
203 .Fu fork()
204 is the probability of its success.
205 Today on modern systems, the system call will succeed almost always.
206 In the 80s on heavily loaded systems, as they were common at
207 universities, this was different. Thus, many of the
208 .Fu fork()
209 calls were wrapped into loops to retry to fork several times in
210 short intervals, in case of previous failure.
211 In mmh, the program aborts at once if the fork fails.
212 The user can reexecute the command then. This is expected to be a
213 very rare case on modern systems, especially personal ones, which are
214 common today.
217 .H1 "Removal of Tools
218 .P
219 MH had been considered an all-in-one system for mail handling.
220 The community around nmh has a similar understanding.
221 In fundamental contrast, I believe that mmh should be a MUA but
222 nothing more. I believe that all-in-one mail systems are not the way
223 to go. There are excellent specialized MTAs, like Postfix;
224 there are specialized MDAs, like Procmail; there are specialized
225 MRAs, like Fetchmail. I believe it's best to use them instead of
226 providing the same function ourselves. Doing something well requires
227 focusing on this particular aspect or a small set of aspects. The more
228 it is possible to focus, the better the result in this particular
229 area will be. The limiting resource in Free Software community development
230 usually is manpower. If the scarce development power is further divided
231 among multiple development areas, it will hardly be possible to
232 compete with the specialists in the various fields. This is
233 aggravated by the small community \(en developers and users \(en
234 that MH-based mail systems have. In consequence, I believe that the
235 available resources should be concentrated at the point where MH is
236 most unique. This is clearly the MUA part.
237 .P
238 Several of nmh's tools were removed from mmh because they didn't
239 match the main focus of adding to the MUA's task.
240 .P
241 .Pn conflict
242 was removed because it is a mail system maintenance tool.
243 Besides, it also checks the
244 .Fn /etc/passwd
245 and
246 .Fn /etc/group
247 files.
248 The tool might be useful, but it should not be shipped with mmh.
249 .P
250 .Pn rcvtty
251 was removed because its use case of writing to the user's terminal
252 on receipt of mail is hardly wanted today. If users like to be
253 informed of new mail, then using the shell's
254 .Ev MAILPATH
255 variable or different (graphical) notifications are likely more
256 appealing. Writing directly to other terminals is hardly ever wanted
257 today. If one wants to have it this way nonetheless, the standard tool
258 .Pn write
259 can be used in a way similar to:
260 .DS
261 scan -file - | write `id -un`
262 .DE
263 .P
264 When the new attachment system was introduced,
265 .Pn viamail
266 was removed because then
267 .Pn forw
268 could cover the task itself.
269 The wrapper program
270 .Pn sendfiles
271 was rewritten as a shell script to use
272 .Pn forw .
273 .P
274 .Pn msgchk
275 was removed as it became hardly useful when POP support was removed.
276 It is questionable if
277 .Pn msgchk
278 provides more information than:
279 .DS
280 ls -l /var/mail/meillo
281 .DE
282 It does distinguish between old and new mail, but that's not very
283 useful and can be found out with
284 .Pn stat (1)
285 too. A very small shell script could take care of the form of output.
286 As mmh's inc only incorporates mail from the user's local maildrop
287 and thus no long data transfers are involved,
288 there's no need to check for new mail before incorporating it.
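.P
Such a script might look like the following sketch. It only reports
whether the maildrop is non-empty and does not distinguish old from new
mail. The function name \fLmailcheck\fP, the wording of the output, and
the default maildrop path below
.Fn /var/mail
are assumptions for this example.

```shell
# Sketch only: report whether a maildrop holds mail.
# Assumption: the maildrop is /var/mail/<login name> unless a path is given.
mailcheck() {
    drop=${1:-/var/mail/${LOGNAME:-$(id -un)}}
    if [ -s "$drop" ]; then
        echo "You have mail waiting in $drop"
    else
        echo "No mail in $drop"
    fi
}
```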
289 .P
290 .Pn msh
291 was removed because the tool was in conflict with the original
292 philosophy of MH. It provided an interactive shell to access the
293 features of MH. One major feature of MH is being a toolchest.
294 .Pn msh
295 wasn't just another shell, tailored to the needs of mail
296 handling, but one large program with the MH tools built in.
297 Its main use was for accessing Bulletin Boards, which have ceased to
298 be popular. Removing
299 .Pn msh ,
300 together with the truly obsolete programs
301 .Pn vmh
302 and
303 .Pn wmh ,
304 saved more than 7\|000 lines of C code \(en a major achievement.
307 .H1 "Draft and Trash Folders
308 .U2 "Draft Folder
309 .P
310 Historically, MH provided exactly one draft message, named
311 .Fn draft
312 and
313 being located in the MH directory. When starting to compose another message
314 before the former one was sent, the user was asked whether to use,
315 refile or replace the old draft. Working on multiple drafts at the same time
316 was impossible. One could only work on them in alternation by refiling the
317 previous one to some directory and fetching some other one for reediting.
318 This manual draft management needed to be done each time the user wanted
319 to switch from editing one draft to editing another.
320 .P
321 To allow true parallel editing of drafts, in a straightforward way, the
322 draft folder facility exists. It was introduced as early as July 1984
323 by Marshall T. Rose. The facility was deactivated by default.
324 Even in nmh, the draft folder facility remained deactivated by default.
325 At least, Richard Coleman added the man page
326 .Mp mh-draft(5)
327 to document
328 the feature well.
329 .P
330 The only advantage of not using the draft folder facility is the static
331 name of the draft file. This could be an issue for MH frontends like mh-e.
332 But as they likely want to provide working on multiple drafts in parallel,
333 the issue only concerns compatibility. The aim of nmh to stay compatible
334 prevented the default activation of the draft folder facility.
335 .P
336 On the other hand, a draft folder is a much more natural concept than
337 a draft message. MH's mail storage consists of folders and messages,
338 the messages named with ascending numbers. A draft message breaks with this
339 concept by introducing a message in a file named
340 .Fn draft .
341 This draft
342 message is special. It cannot be simply listed with the available tools,
343 but instead requires special switches. That is, corner cases were
344 introduced. A draft folder, in contrast, does not introduce such
345 corner cases. The available tools can operate on the messages within that
346 folder like on any messages within any mail folders. The only difference
347 is the fact that the default folder for
348 .Pn send
349 is the draft folder,
350 instead of the current folder, like for all other tools.
351 .P
352 The trivial part of the change was activating the draft folder facility
353 by default and setting a default name for this folder. Obviously, I chose
354 the name
355 .Fn +drafts .
356 This made the
357 .Sw \-draftfolder
358 and
359 .Sw \-draftmessage
360 switches useless, and I could remove them.
361 The more difficult part, but also the one that showed the real improvement,
362 was updating the tools to the new concept.
363 .Sw \-draft
364 switches could
365 be dropped, as operating on a draft message became indistinguishable from
366 operating on any other message for the tools.
367 .Pn comp
368 still has its
369 .Sw \-use
370 switch for switching between its two modes: (1) Compose a new
371 draft, possibly by taking some existing message as a form. (2) Modify
372 an existing draft. In either case, the behavior of
372 .Pn comp
373 is deterministic. There is no more need to query the user. I consider this
375 a major improvement. By making
376 .Pn send
377 simply operate on the current
378 message in the draft folder by default, with message and folder both
379 overridable by specifying them on the command line, it is now possible
380 to send a draft anywhere within the storage by simply specifying its folder
381 and name.
382 .P
383 All these changes converted special cases to regular cases, thus
384 simplifying the tools and increasing the flexibility.
386 .U2 "Trash Folder
387 .P
388 Similar to the situation for drafts is the situation for removed messages.
389 Historically, a message was deleted by renaming. A specific
390 \fIbackup prefix\fP, often comma (\c
391 .Fn , )
392 or hash (\c
393 .Fn # ),
394 was prepended to the file name. Thus, MH wouldn't recognize the file
395 as a message anymore, as only files whose name consists of digits only
396 are treated as messages. The removed messages remained as files in the
397 same directory and needed some maintenance job to truly delete them after
398 some grace time. Usually, this was done by running a command similar to
399 .DS
400 find /home/user/Mail \-ctime +7 \-name ',*' | xargs rm
401 .DE
402 in a cron job. Within the grace time interval
403 the original message could be restored by stripping
404 the backup prefix from the file name. If, however, the last message of
405 a folder is removed \(en say message
406 .Fn 6
407 becomes file
408 .Fn ,6
409 \(en and a new message enters the same folder, thus being given the same
410 number again \(en in our case
411 .Fn 6
412 \(en, and if that one
413 is removed too, then the backup of the former message gets overwritten.
414 Thus, the ability to restore removed messages does not only depend on
415 the ``sweeping cron job'' but also on the removal of further messages.
416 This is undesirable, because the real mechanism is hidden from the user
417 and the consequences of further removals are not always obvious.
418 Furthermore, the backup files are scattered within the whole mail
419 storage, instead of being collected in one place.
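.P
The overwrite problem can be reproduced with plain files. The following
sketch only emulates the renaming; \fLold_rmm\fP is a made-up name and
not the historic implementation.

```shell
# Sketch only: emulate removal by backup prefix on plain files.
old_rmm() {
    folder=$1
    msg=$2
    mv -f "$folder/$msg" "$folder/,$msg"
}

folder=$(mktemp -d)
echo "first message"  > "$folder/6"
old_rmm "$folder" 6                   # 6 becomes ,6
echo "second message" > "$folder/6"   # a new message reuses the number 6
old_rmm "$folder" 6                   # ,6 gets overwritten
cat "$folder/,6"                      # prints "second message"
```

The first message is lost at this point, although no sweeping job has
run yet.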
420 .P
421 To improve the situation, the profile entry
422 .Pe rmmproc
423 (previously named
424 .Pe Delete-Prog )
425 was introduced, very early.
426 It could be set to any command, which would take care of the mail removal
427 instead of taking the default action, described above.
428 Refiling the to-be-removed files to some wastebin folder was a common
429 example. Nmh's man page
430 .Mp rmm(1)
431 proposes
432 .Cl "refile +d
433 to move messages to the wastebin and
434 .Cl "rm `mhpath +d all`
435 to empty the wastebin.
436 Managing the message removal this way is a sane approach. It keeps
437 the removed messages in one place, makes it easy to remove the backup
438 files, and, most important, enables the user to use the tools of MH
439 itself to operate on the removed messages. One can
440 .Pn scan
441 them,
442 .Pn show
443 them, and restore them with
444 .Pn refile .
445 There's no more
446 need to use
447 .Pn mhpath
448 to switch over from MH tools to Unix tools \(en MH can do it all itself.
449 .P
450 This approach matches perfectly with the concepts of MH, thus making
451 it powerful. Hence, I made it the default. And even more, I also
452 removed the old backup prefix approach, as it is clearly less powerful.
453 Keeping unused alternatives in the code is a bad choice as they likely
454 gather bugs, by not being constantly tested. Also, the increased code
455 size and more conditions increase the maintenance costs. By strictly
456 converting to the trash folder approach, I simplified the code base.
457 .Pn rmm
458 calls
459 .Pn refile
460 internally to move the to-be-removed
461 message to the trash folder (\c
462 .Fn +trash
463 by default). Messages
464 there can be operated on like on any other message in the storage.
465 To sweep it clean, one can use
466 .Cl "rmm \-unlink +trash a" ,
467 where the
468 .Sw \-unlink
469 switch causes the files to be truly unlinked instead
470 of moved to the trash folder.
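.P
Why the trash folder avoids overwriting earlier removals can be shown
with a plain-file emulation. This is only a sketch: the function names
are made up, and it assumes that a refiled message receives the highest
message number in the target folder plus one.

```shell
# Sketch only: emulate "remove by refiling to a trash folder" on plain files.
# Assumption: a refiled message gets the highest number in the target plus one.
next_msgnum() {
    last=$(ls "$1" | grep -E '^[0-9]+$' | sort -n | tail -n 1)
    echo $(( ${last:-0} + 1 ))
}

trash_rmm() {
    folder=$1
    msg=$2
    trash=$3
    mkdir -p "$trash" || return 1
    mv "$folder/$msg" "$trash/$(next_msgnum "$trash")"
}
```

Removing two successive messages that both carried the number 6 leaves
them side by side in the trash directory as two separately numbered
files.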
473 .H1 "MH Directory Split
474 .P
475 In MH and nmh, a personal setup had consisted of two parts:
476 The MH profile, named
477 .Fn \&.mh_profile
478 and being located directly in the user's home directory.
479 And the MH directory, where all his mail messages and also his personal
480 forms, scan formats, and other configuration files are stored. The location
481 of this directory could be user-chosen. The default was to name it
482 .Fn Mail
483 and have it directly in the home directory.
484 .P
485 I've never liked the data storage and the configuration to be intermixed.
486 They are different kinds of data. One part is the messages,
487 which are the data to operate on. The other part is the personal
488 configuration files, which are able to change the behavior of the operations.
489 The actual operations are defined in the profile, however.
490 .P
491 When storing data, one should try to group data by its type.
492 There's sense in the Unix file system hierarchy, where configuration
493 files are stored separately (\c
494 .Fn /etc )
495 from the programs (\c
496 .Fn /bin
497 and
498 .Fn /usr/bin )
499 and from their sources (\c
500 .Fn /usr/src ).
501 Such separation eases the backup management, for instance.
502 .P
503 In mmh, I've reorganized the file locations.
504 Still there are two places:
505 There's the mail storage directory, which, like in MH, contains all the
506 messages, but, unlike in MH, nothing else.
507 Its location still is user-chosen, with the default name
508 .Fn Mail ,
509 in the user's home directory. This is very similar to the case in nmh.
510 The configuration files, however, are grouped together in the new directory
511 .Fn \&.mmh
512 in the user's home directory.
513 The user's profile now is a file, named
514 .Fn profile ,
515 in this mmh directory.
516 Consistently, the context file and all the personal forms, scan formats,
517 and the like, are also there.
518 .P
519 The naming changed with the relocation.
520 The directory where everything, except the profile, had been stored (\c
521 .Fn $HOME/Mail ),
522 used to be called \fIMH directory\fP. Now, this directory is called the
523 user's \fImail storage\fP. The name \fImmh directory\fP is now given to
524 the new directory
525 (\c
526 .Fn $HOME/.mmh ),
527 containing all the personal configuration files.
528 .P
529 The separation of the files by type of content is logical and convenient.
530 There are no functional differences as any possible setup known to me
531 can be implemented with both approaches, although likely a bit easier
532 with the new approach. The main goal of the change had been to provide
533 sensible storage locations for any type of personal mmh file.
534 .P
535 In order for one user to have multiple MH setups, he can use the
536 environment variable
537 .Ev MH
538 to point to a different profile file.
539 The MH directory (mail storage plus personal configuration files) is
540 defined by the
541 .Pe Path
542 profile entry.
543 The context file could be defined by the
544 .Pe context
545 profile entry or by the
546 .Ev MHCONTEXT
547 environment variable.
548 The latter is useful to have a distinct context (e.g. current folders)
549 in each terminal window, for instance.
550 In mmh, there are three environment variables now.
551 .Ev MMH
552 may be used to change the location of the mmh directory.
553 .Ev MMHP
554 and
555 .Ev MMHC
556 change the profile and context files, respectively.
557 Besides providing a more consistent feel (which simply is the result
558 of being designed anew), the set of personal configuration files can
559 be chosen independently from the profile (including mail storage location)
560 and context, now. Whether or not this is relevant for practical use, it
561 is an improvement nonetheless. However, the main achievement is the
562 split between mail storage and personal configuration files.
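.P
How the three variables relate to each other can be sketched in shell.
This is not mmh's actual lookup code. The defaults used here,
.Fn $HOME/.mmh
with the files
.Fn profile
and
.Fn context
inside it, are taken from the description above or assumed, and a
.Pe context
entry in the profile is ignored.

```shell
# Sketch only: resolve the three variables to file locations.
# Assumptions: defaults are $HOME/.mmh, and "profile" and "context" inside it.
mmh_dir()     { echo "${MMH:-$HOME/.mmh}"; }
mmh_profile() { echo "${MMHP:-$(mmh_dir)/profile}"; }
mmh_context() { echo "${MMHC:-$(mmh_dir)/context}"; }

# Example: give this shell its own context file.
MMHC=$HOME/.mmh/context.$$
export MMHC
```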
565 .H1 "Path Notations
566 .P
567 foo
569 .H1 "Attachments
570 .P
571 foo
573 .H1 "mhshow to show Transition
574 .P
575 Since the very beginning, already in the first concept paper,
576 .Pn show
577 had been MH's mail display program.
578 .Pn show
579 found out which pathnames the relevant messages had and invoked
580 .Pn mhl
581 then to let it render the content.
582 With the advent of MIME, this approach wasn't sufficient anymore.
583 MIME messages can consist of multiple parts, some of which aren't
584 directly displayable, and text content can be encoded in
585 foreign charsets.
586 .Pn show 's
587 simple approach and
588 .Pn mhl 's
589 limited display facilities couldn't cope with the task any longer.
590 Instead of extending these tools, new ones were written from scratch
591 and then added to the MH toolchest. Doing so is encouraged by the
592 toolchest approach. The new tools could be added without interfering
593 with the existing ones. This is great. It allowed MH to be the
594 first MUA to implement MIME.
595 .P
596 The new MIME features were added in form of the single program
597 .Pn mhn .
598 The command
599 .DS
600 mhn \-show 42
601 .DE
602 would show the MIME message numbered 42.
603 With the 1.0 release of nmh in February 1999, Richard Coleman finished
604 the split of
605 .Pn mhn
606 into a set of specialized programs, which together covered the
607 aspects of MIME. One of these resulting tools was
608 .Pn mhshow .
611 .H1 "Blind Carbon Copies
612 .P
613 foo
615 .H1 "Good Defaults
616 .P
617 foo
619 .H1 "Modularization
620 .P
621 foo
623 .H1 "Code style
624 .P
625 foo