view ch03.roff @ 21:bb8a8be49024

Wrote about Face support and vfork().
author markus schnalke <meillo@marmaro.de>
date Mon, 07 May 2012 15:50:09 +0200
line source
1 .H0 "Work Report
2 .P
3 foo
4 .P
5 bar
7 .H1 "Removal of Code Relics
8 .P
9 The code base of mmh originates from the late 70s, was extensively
10 worked on in the mid 80s, and was partly reorganized and extended
11 in the 90s. Relics of all those times had gathered in the code base.
12 My goal was to remove any ancient code parts. One part of the task was
13 converting obsolete code constructs to standard constructs, the other part
14 was dropping obsolete functionality.
15 .P
16 As I'm not even thirty years old and have no more than seven years of
17 Unix experience, I needed to learn about the history in retrospect.
18 Older people have likely used those ancient constructs themselves,
19 have suffered from their incompatibilities, and have longed for
20 standardization. Unfortunately, I have only read that others had done so.
21 This put me in a much more difficult position when working on the old
22 code. I needed to research what others would have known by heart from
23 experience. All my programming experience comes from a time after ANSI C
24 and after POSIX. Although I knew about the times before, I took the
25 current state implicitly for granted most of the time.
26 .P
27 Being aware of
28 these facts, I preferred to let people with more historic experience solve the
29 task of converting the ancient code constructs to standardized ones.
30 Luckily, Lyndon Nerenberg focused on this task at the nmh project.
31 He converted large parts of the code to POSIX constructs, removing
32 the conditional compilation for now standardized features.
33 I'm thankful for this task being solved. I only pulled the changes into
34 mmh.
35 .P
36 The other task \(en dropping ancient functionality to remove old code \(en
37 I did myself, though. My position to strip mmh to the bare minimum of
38 frequently used features is much more revolutionary than the nmh community
39 likes. Without the need to justify my decisions, I was able to quickly
40 remove functionality I considered ancient.
41 The need to discuss my decisions with
42 peers likely would have slowed this process down. Of course, I researched
43 whether a particular feature really should be dropped. Never having had any
44 contact with a feature in my computing life was a first indicator to
45 drop it, but I also asked others and searched the literature for modern
46 usage of the feature. If it appeared to be truly ancient, I dropped it.
47 The reason for dropping is always part of the commit message in the
48 version control system. Thus, it is easy for others to check their
49 view on the topic against mine and possibly to argue for reinclusion.
51 .U2 "MMDF maildrop support
52 .P
53 I dropped all support for the MMDF maildrop format. This type of format
54 is conceptually similar to the mbox format, but uses four bytes with
55 value 1 (\fL^A^A^A^A\fP) as message delimiter,
56 instead of the string ``\fLFrom\ \fP''.
57 Due to the similarity and mbox being the de-facto standard maildrop
58 format on Unix, but also due to the larger influence of Sendmail than MMDF,
59 the MMDF maildrop format has vanished.
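.P
The difference between the two delimiters can be sketched in C as follows.
This is only an illustration under simplifying assumptions: the function
names are hypothetical and not taken from
.L sbr/m_getfld.c ,
and each function looks at one line that has already been read.

```c
#include <string.h>

/* mbox: a message starts at a line beginning with "From " (with the space). */
int is_mbox_delim(const char *line)
{
	return strncmp(line, "From ", 5) == 0;
}

/* MMDF: the delimiter consists of four bytes with value 1 (^A^A^A^A). */
int is_mmdf_delim(const char *line)
{
	return strncmp(line, "\001\001\001\001", 4) == 0;
}
```

Note that a header line such as ``\fLFrom:\ \fP'' does not match the mbox
delimiter, because the fifth character is a colon instead of a space.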
60 .P
61 The simplifications within the code were only moderate. Switches could
62 be removed from tools like
63 .L packf ,
64 which generate packed mailboxes. Only one packed mailbox format remained:
65 mbox.
66 The most important changes affect the equally named mail parsing routine in
67 .L sbr/m_getfld.c .
68 The direct MMDF code has been removed, but as now only one packed mailbox
69 format is left, code structure simplifications are likely possible.
70 The reason why they are still outstanding is the heavily optimized code
71 of
72 .Fu m_getfld() .
73 Changes beyond a small local scope \(en
74 which restructuring at its core is \(en carry a high risk of damaging
75 the intricate workings of the optimized code. This problem is known
76 to the developers of nmh, too. They also avoid touching this minefield
77 if possible.
79 .U2 "UUCP Bang Paths
80 .P
81 More questionable than the former topic is the removal of support for the
82 UUCP bang path address style. However, the user may translate the bang
83 paths on retrieval to Internet addresses and the other way on posting
84 messages. The former can be done by an MDA like procmail; the latter
85 by a sendmail wrapper. This would ensure that any address handling would
86 work as expected. However, it might just work well without any
87 such modifications, as mmh does not touch addresses much, in general.
88 But I can't guarantee this, as I have never used an environment with bang paths.
89 Also, the behavior might break at any point in further development.
91 .U2 "Hardcopy terminal support
92 .P
93 More of a funny anecdote is that a check for printing to a
94 hardcopy terminal survived until Spring 2012, when I finally removed it.
95 I surely would be very happy to see such a terminal in action, maybe
96 actually being able to work on it, but I fear my chances are nil.
97 .P
98 The check only prevented a pager from being placed between the outputting
99 program (\c
100 .Pn mhl )
101 and the terminal. This could have been ensured with
102 the
103 .Sw \-nomoreproc
104 switch at the command line statically, too.
106 .U2 "Removed support for header fields
107 .P
108 The `Encrypted' header had been introduced by RFC\^822, but was already
109 marked legacy in RFC\^2822. It was superseded by FIXME.
110 Mmh no longer supports this header.
111 .P
112 Native support for `Face' headers
113 has been removed, as well.
114 The feature is similar to the `X-Face' header in its intent,
115 but takes a different approach to store the image.
116 Instead of encoding the image data directly into the header,
117 the header contains the hostname and UDP port where the image
118 data can be retrieved.
119 Neither `X-Face' nor the `Face' system described here
120 \**
121 .FS
122 There is also a newer but different system, invented in 2005,
123 using `Face' headers.
124 It is the successor of `X-Face' providing colored PNG images.
125 .FE
126 became widely used.
127 It's still possible to use a Face system,
128 although mmh does not provide support for any of the different systems
129 anymore. It's fairly easy to write a small shell script to
130 extract the embedded or fetch the external Face data and display the image.
131 One's own Face headers can be added to the draft template files.
132 .P
133 `Content-MD5' headers were introduced by RFC\^1864. They only allow
134 detecting data corruption during the transfer. By no means can
135 they ensure verbatim end-to-end delivery of the contents. This is clearly
136 stated in the RFC. The proper approach to provide verifiability
137 of content in an end-to-end relationship is the use of digital cryptography
138 (RFCs FIXME). On the other hand, transfer protocols should ensure the
139 integrity of the transmission. In combination, these two approaches
140 make the `Content-MD5' header field useless. In consequence, I removed
141 the support for it. With this removal, MD5 computation is no longer needed
142 anywhere in mmh. Hence, over 500 lines of code were removed by this one
143 change. Even if the `Content-MD5' header field is useful sometimes,
144 I value its usefulness less than the improvement in maintainability caused
145 by the removal.
147 .U2 "Prompter's Control Keys
148 .P
149 The program
150 .Pn prompter
151 queries the user to fill in a message form. When used by
152 .Pn comp
153 as:
154 .DS
155 comp \-editor prompter
156 .DE
157 the resulting behavior is similar to
158 .Pn mailx .
159 Apparently,
160 .Pn prompter
161 hadn't been touched lately. Otherwise it's hardly explainable why it
162 still offered the switches
163 .Sn \-erase \fUchr\fP
164 and
165 .Sn \-kill \fUchr\fP
166 to name the characters for command line editing.
167 The times when this had been necessary are long gone.
168 Today these things work out-of-the-box, and if not, are configured
169 with the standard tool
170 .Pn stty .
172 .U2 "Vfork and Retry Loops
173 .P
174 MH creates many processes, which is a consequence of the toolchest approach.
175 In earlier times
176 .Fu fork()
177 was an expensive system call, as the process's whole image needed
178 to be duplicated. One common case is replacing the image with
179 .Fu exec()
180 right after having forked the child process.
181 To speed up this case, the
182 .Fu vfork()
183 system call was invented at Berkeley. It completely omits copying the
184 image. If the image gets replaced right afterwards, then unnecessary
185 work is omitted. On old systems this results in large speedups.
186 MH uses
187 .Fu vfork()
188 whenever possible.
189 .P
190 Memory management units that support copy-on-write semantics make
191 .Fu fork()
192 almost as fast as
193 .Fu vfork()
194 in the cases when they can be exchanged.
195 With
196 .Fu vfork()
197 being more error-prone and hardly faster, it's preferable to simply
198 use
199 .Fu fork()
200 instead.
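.P
The common case described above, replacing the image right after forking,
can be sketched as follows. This is a minimal illustration and not code
from mmh; the function name
.Fu run_program()
is hypothetical, and the exit code 127 for a failed
.Fu exec()
merely follows shell convention.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run a program in a child process and return its exit status,
 * or -1 if the child could not be created or did not exit normally.
 */
int run_program(char *const argv[])
{
	int status;
	pid_t pid = fork();

	if (pid == -1) {
		return -1;  /* fork failed; no child was created */
	}
	if (pid == 0) {
		/* Child: replace the freshly forked image right away. */
		execvp(argv[0], argv);
		_exit(127);  /* only reached if exec failed */
	}
	/* Parent: wait for the child to terminate. */
	if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status)) {
		return -1;
	}
	return WEXITSTATUS(status);
}
```

With copy-on-write, the child never touches most of the parent's memory
before the
.Fu exec()
call, so hardly any pages actually get copied.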
201 .P
202 Related to the costs of
203 .Fu fork()
204 is the probability of its success.
205 Today on modern systems, the system call will succeed almost always.
206 In the 80s on heavily loaded systems, as they were common at
207 universities, this was different. Thus, many of the
208 .Fu fork()
209 calls were wrapped into loops to retry to fork several times in
210 short intervals, in case of previous failure.
211 In mmh, the program aborts at once if the fork fails.
212 The user can reexecute the command then. This is expected to be a
213 very rare case on modern systems, especially personal ones, which are
214 common today.
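.P
A retry loop of the historic kind might have looked roughly like the
following sketch. It is not copied from the MH sources; the function name
and both parameters are assumptions made for illustration.

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * Historic style: try to fork up to `tries' times, sleeping
 * `delay' seconds between failed attempts. Returns like fork().
 */
pid_t fork_with_retry(int tries, unsigned int delay)
{
	pid_t pid;

	while ((pid = fork()) == -1 && --tries > 0) {
		sleep(delay);
	}
	return pid;
}
```

The simplified approach corresponds to a single
.Fu fork()
call whose failure is reported as an error right away, without any loop.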
217 .H1 "Draft and Trash Folders
218 .U2 "Draft Folder
219 .P
220 Historically, MH provided exactly one draft message, named
221 .Fn draft
222 and
223 being located in the MH directory. When starting to compose another message
224 before the former one was sent, the user was asked whether to use,
225 refile or replace the old draft. Working on multiple drafts at the same time
226 was impossible. One could only work on them in alternation by refiling the
227 previous one to some directory and fetching some other one for reediting.
228 This manual draft management needed to be done each time the user wanted
229 to switch from editing one draft to editing another.
230 .P
231 To allow true parallel editing of drafts in a straightforward way, the
232 draft folder facility exists. It was introduced as early as July 1984
233 by Marshall T. Rose. The facility was deactivated by default.
234 Even in nmh, the draft folder facility remained deactivated by default.
235 At least, Richard Coleman added the man page
236 .Mp mh-draft(5)
237 to document
238 the feature well.
239 .P
240 The only advantage of not using the draft folder facility is the static
241 name of the draft file. This could be an issue for MH frontends like mh-e.
242 But as they likely want to support working on multiple drafts in parallel,
243 the issue concerns only compatibility. The aim of nmh to stay compatible
244 prevented the default activation of the draft folder facility.
245 .P
246 On the other hand, a draft folder is a much more natural concept than
247 a draft message. MH's mail storage consists of folders and messages,
248 the messages named with ascending numbers. A draft message breaks with this
249 concept by introducing a message in a file named
250 .Fn draft .
251 This draft
252 message is special. It cannot be simply listed with the available tools,
253 but instead requires special switches. That is, corner cases were
254 introduced. A draft folder, in contrast, does not introduce such
255 corner cases. The available tools can operate on the messages within that
256 folder as on any messages within any mail folder. The only difference
257 is the fact that the default folder for
258 .Pn send
259 is the draft folder,
260 instead of the current folder, like for all other tools.
261 .P
262 The trivial part of the change was activating the draft folder facility
263 by default and setting a default name for this folder. Obviously, I chose
264 the name
265 .Fn +drafts .
266 This made the
267 .Sw \-draftfolder
268 and
269 .Sw \-draftmessage
270 switches useless, and I could remove them.
271 The more difficult part, but also the one that showed the real improvement,
272 was updating the tools to the new concept.
273 .Sw \-draft
274 switches could
275 be dropped, as operating on a draft message became indistinguishable from
276 operating on any other message for the tools.
277 .Pn comp
278 still has its
279 .Sw \-use
280 switch for switching between its two modes: (1) Compose a new
281 draft, possibly by taking some existing message as a form. (2) Modify
282 an existing draft. In either case, the behavior of
283 .Pn comp
284 is deterministic. There is no more need to query the user. I consider this
285 a major improvement. By making
286 .Pn send
287 simply operate on the current
288 message in the draft folder by default, with message and folder both
289 overridable by specifying them on the command line, it is now possible
290 to send a draft anywhere within the storage by simply specifying its folder
291 and name.
292 .P
293 All these changes converted special cases to regular cases, thus
294 simplifying the tools and increasing the flexibility.
296 .U2 "Trash Folder
297 .P
298 Similar to the situation for drafts is the situation for removed messages.
299 Historically, a message was deleted by renaming. A specific
300 \fIbackup prefix\fP, often comma (\c
301 .Fn , )
302 or hash (\c
303 .Fn # ),
304 was prepended to the file name. Thus, MH wouldn't recognize the file
305 as a message anymore, as only files whose names consist of digits only
306 are treated as messages. The removed messages remained as files in the
307 same directory and needed some maintenance job to truly delete them after
308 some grace time. Usually, this was done by running a command similar to
309 .DS
310 find /home/user/Mail \-ctime +7 \-name ',*' | xargs rm
311 .DE
312 in a cron job. Within the grace time interval
313 the original message could be restored by stripping
314 the backup prefix from the file name. If, however, the last message of
315 a folder has been removed \(en say message
316 .Fn 6
317 becomes file
318 .Fn ,6
319 \(en and a new message enters the same folder, thus the same
320 number being assigned again \(en in our case
321 .Fn 6
322 \(en, and if that one
323 is removed too, then the backup of the former message gets overwritten.
324 Thus, the ability to restore removed messages does not only depend on
325 the ``sweeping cron job'' but also on the removing of further messages.
326 This is undesirable, because the real mechanism is hidden from the user
327 and the consequences of further removals are not always obvious.
328 Furthermore, the backup files are scattered within the whole mail
329 storage, instead of being collected in one place.
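.P
The rule that only all-digit file names denote messages, which is what
made the backup prefix work, can be sketched like this. The function name
is hypothetical and the code is not taken from MH; rejecting the empty
name is an assumption of this sketch.

```c
#include <ctype.h>

/* Return 1 if the file name consists of digits only, 0 otherwise. */
int is_message_name(const char *name)
{
	const char *cp;

	if (*name == '\0') {
		return 0;  /* assumption: an empty name is no message */
	}
	for (cp = name; *cp != '\0'; cp++) {
		if (!isdigit((unsigned char)*cp)) {
			return 0;
		}
	}
	return 1;
}
```

A file named
.Fn 6
passes this check, whereas
.Fn ,6
does not and is therefore invisible to the MH tools.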
330 .P
331 To improve the situation, the profile entry
332 .Pe rmmproc
333 (previously named
334 .Pe Delete-Prog )
335 was introduced, very early.
336 It could be set to any command, which would care for the mail removal
337 instead of taking the default action, described above.
338 Refiling the to-be-removed files to some wastebin folder was a common
339 example. Nmh's man page
340 .Mp rmm(1)
341 proposes
342 .Cl "refile +d
343 to move messages to the wastebin and
344 .Cl "rm `mhpath +d all`
345 to empty the wastebin.
346 Managing the message removal this way is a sane approach. It keeps
347 the removed messages in one place, makes it easy to remove the backup
348 files, and, most important, enables the user to use the tools of MH
349 itself to operate on the removed messages. One can
350 .Pn scan
351 them,
352 .Pn show
353 them, and restore them with
354 .Pn refile .
355 There's no more
356 need to use
357 .Pn mhpath
358 to switch over from MH tools to Unix tools \(en MH can do it all itself.
359 .P
360 This approach matches perfectly with the concepts of MH, thus making
361 it powerful. Hence, I made it the default. And even more, I also
362 removed the old backup prefix approach, as it is clearly less powerful.
363 Keeping unused alternatives in the code is a bad choice as they likely
364 gather bugs, by not being constantly tested. Also, the increased code
365 size and more conditions increase the maintenance costs. By strictly
366 converting to the trash folder approach, I simplified the code base.
367 .Pn rmm
368 calls
369 .Pn refile
370 internally to move the to-be-removed
371 message to the trash folder (\c
372 .Fn +trash
373 by default). Messages
374 there can be operated on like on any other message in the storage.
375 To sweep it clean, one can use
376 .Cl "rmm \-unlink +trash a" ,
377 where the
378 .Sw \-unlink
379 switch causes the files to be truly unlinked instead
380 of moved to the trash folder.
383 .H1 "MH Directory Split
384 .P
385 In MH and nmh, a personal setup consisted of two parts:
386 The MH profile, named
387 .Fn \&.mh_profile
388 and being located directly in the user's home directory.
389 And the MH directory, where all his mail messages and also his personal
390 forms, scan formats, and other configuration files are stored. The location
391 of this directory could be user-chosen. The default was to name it
392 .Fn Mail
393 and have it directly in the home directory.
394 .P
395 I've never liked the data storage and the configuration to be intermixed.
396 They are different kinds of data. One part is the messages,
397 which are the data to operate on. The other part is the personal
398 configuration files, which are able to change the behavior of the operations.
399 The actual operations are defined in the profile, however.
400 .P
401 When storing data, one should try to group data by its type.
402 There's sense in the Unix file system hierarchy, where configuration
403 files are stored (\c
404 .Fn /etc )
405 separately from the programs (\c
406 .Fn /bin
407 and
408 .Fn /usr/bin )
409 and from their sources (\c
410 .Fn /usr/src ).
411 Such separation eases the backup management, for instance.
412 .P
413 In mmh, I've reorganized the file locations.
414 Still there are two places:
415 There's the mail storage directory, which, like in MH, contains all the
416 messages, but, unlike in MH, nothing else.
417 Its location still is user-chosen, with the default name
418 .Fn Mail ,
419 in the user's home directory. This is much like the case in nmh.
420 The configuration files, however, are grouped together in the new directory
421 .Fn \&.mmh
422 in the user's home directory.
423 The user's profile now is a file, named
424 .Fn profile ,
425 in this mmh directory.
426 Consistently, the context file and all the personal forms, scan formats,
427 and the like, are also there.
428 .P
429 The naming changed with the relocation.
430 The directory where everything, except the profile, had been stored (\c
431 .Fn $HOME/Mail ),
432 used to be called \fIMH directory\fP. Now, this directory is called the
433 user's \fImail storage\fP. The name \fImmh directory\fP is now given to
434 the new directory
435 (\c
436 .Fn $HOME/.mmh ),
437 containing all the personal configuration files.
438 .P
439 The separation of the files by type of content is logical and convenient.
440 There are no functional differences as any possible setup known to me
441 can be implemented with both approaches, although likely a bit easier
442 with the new approach. The main goal of the change had been to provide
443 sensible storage locations for any type of personal mmh file.
444 .P
445 In order for one user to have multiple MH setups, he can use the
446 environment variable
447 .Ev MH
448 to point to a different profile file.
449 The MH directory (mail storage plus personal configuration files) is
450 defined by the
451 .Pe Path
452 profile entry.
453 The context file could be defined by the
454 .Pe context
455 profile entry or by the
456 .Ev MHCONTEXT
457 environment variable.
458 The latter is useful to have a distinct context (e.g. current folders)
459 in each terminal window, for instance.
460 In mmh, there are three environment variables now.
461 .Ev MMH
462 may be used to change the location of the mmh directory.
463 .Ev MMHP
464 and
465 .Ev MMHC
466 change the profile and context files, respectively.
467 Besides providing a more consistent feel (which simply is the result
468 of being designed anew), the set of personal configuration files can
469 now be chosen independently from the profile (including mail storage location)
470 and context. Whether or not this is relevant for practical use, it
471 is de-facto an improvement. However, the main achievement is the
472 split between mail storage and personal configuration files.
475 .H1 "Path Notations
476 .P
477 foo
479 .H1 "Attachments
480 .P
481 foo
483 .H1 "mhshow to show Transition
484 .P
485 Since the very beginning, already in the first concept paper,
486 .Pn show
487 had been MH's mail display program.
488 .Pn show
489 found out which pathnames the relevant messages had and then invoked
490 .Pn mhl
491 to let it render the content.
492 With the advent of MIME, this approach wasn't sufficient anymore.
493 MIME messages can consist of multiple parts, some of which aren't
494 directly displayable, and text content can be encoded in
495 foreign charsets.
496 .Pn show 's
497 simple approach and
498 .Pn mhl 's
499 limited display facilities couldn't cope with the task any longer.
500 Instead of extending these tools, new ones were written from scratch
501 and then added to the MH toolchest. Doing so is encouraged by the
502 toolchest approach. The new tools could be added without interfering
503 with the existing ones. This is great. It allowed MH to be the
504 first MUA to implement MIME.
505 .P
506 The new MIME features were added in form of the single program
507 .Pn mhn .
508 The command
509 .DS
510 mhn \-show 42
511 .DE
512 would show the MIME message numbered 42.
513 With the 1.0 release of nmh in February 1999, Richard Coleman finished
514 the split of
515 .Pn mhn
516 into a set of specialized programs, which together covered the
517 aspects of MIME. One of these resulting tools was
518 .Pn mhshow .
521 .H1 "Blind Carbon Copies
522 .P
523 foo
525 .H1 "Good Defaults
526 .P
527 foo
529 .H1 "Modularization
530 .P
531 foo
533 .H1 "Code style
534 .P
535 foo