view unix-phil.ms @ 30:ec17b3a969c7

various minor rework in ch04
author meillo@marmaro.de
date Wed, 24 Mar 2010 22:07:02 +0100
1 .\".if n .pl 1000i
2 .de XX
3 .pl 1v
4 ..
5 .em XX
6 .\".nr PI 0
7 .\".if t .nr PD .5v
8 .\".if n .nr PD 1v
9 .nr lu 0
10 .de CW
11 .nr PQ \\n(.f
12 .if t .ft CW
13 .ie ^\\$1^^ .if n .ul 999
14 .el .if n .ul 1
15 .if t .if !^\\$1^^ \&\\$1\f\\n(PQ\\$2
16 .if n .if \\n(.$=1 \&\\$1
17 .if n .if \\n(.$>1 \&\\$1\c
18 .if n .if \\n(.$>1 \&\\$2
19 ..
20 .ds [. \ [
21 .ds .] ]
22 .\"----------------------------------------
23 .TL
24 Why the Unix Philosophy still matters
25 .AU
26 markus schnalke <meillo@marmaro.de>
27 .AB
28 .ti \n(.iu
29 This paper discusses the importance of the Unix Philosophy in software design.
30 Today, few software designers are aware of these concepts,
31 and thus most modern software is limited and does not make use of software leverage.
32 Knowing and following the tenets of the Unix Philosophy makes software more valuable.
33 .AE
35 .\".if t .2C
37 .FS
38 .ps -1
39 This paper was prepared for the seminar ``Software Analysis'' at the University of Ulm.
40 Mentor was professor Schweiggert. 2010-02-05
41 .br
42 You may get this document from my website
43 .CW \s-1http://marmaro.de/docs
44 .FE
46 .NH 1
47 Introduction
48 .LP
49 Building software is a process that leads from an idea of the software's purpose
50 to its release.
51 No matter \fIhow\fP the process is run, two things are common:
52 the initial idea and the release.
53 The process in between can be of any shape.
54 The maintenance work after the release is ignored for the moment.
55 .PP
56 The process of building splits mainly into two parts:
57 the planning of what and how to build, and implementing the plan by writing code.
58 This paper focuses on the planning part \(en the designing of the software.
59 .PP
60 Software design is the plan of what the internals and externals of the software should look like,
61 based on the requirements.
62 This paper discusses the recommendations of the Unix Philosophy about software design.
63 .PP
64 The ideas discussed here can be applied in any development process.
65 The Unix Philosophy does recommend what the software development process should look like,
66 but this is not the concern here.
67 Similarly, the question of how to write the code is out of scope.
68 .PP
69 The name ``Unix Philosophy'' has already been mentioned several times, but it has not been explained yet.
70 The Unix Philosophy is the essence of how the Unix operating system and its toolchest were designed.
71 It is not a fixed set of rules, but what people see as common to typical Unix software.
72 Several people have stated their view on the Unix Philosophy.
73 Best known are:
74 .IP \(bu
75 Doug McIlroy's summary: ``Write programs that do one thing and do it well.''
76 .[
77 %A M. D. McIlroy
78 %A E. N. Pinson
79 %A B. A. Tague
80 %T UNIX Time-Sharing System: Foreword
81 %J The Bell System Technical Journal
82 %D 1978
83 %V 57
84 %N 6
85 %P 1902
86 .]
87 .IP \(bu
88 Mike Gancarz' book ``The UNIX Philosophy''.
89 .[
90 %A Mike Gancarz
91 %T The UNIX Philosophy
92 %D 1995
93 %I Digital Press
94 .]
95 .IP \(bu
96 Eric S. Raymond's book ``The Art of UNIX Programming''.
97 .[
98 %A Eric S. Raymond
99 %T The Art of UNIX Programming
100 %D 2003
101 %I Addison-Wesley
102 %O .CW \s-1http://www.faqs.org/docs/artu/
103 .]
104 .LP
105 These different views on the Unix Philosophy have much in common.
106 In particular, the main concepts are similar in all of them.
107 But there are also points on which they differ.
108 This only underlines what the Unix Philosophy is:
109 A retrospective view on the main concepts of Unix software;
110 especially those that were successful and unique to Unix.
111 .\" really?
112 .PP
113 Before we look at concrete concepts,
114 we discuss why software design is important
115 and what problems bad design introduces.
118 .NH 1
119 Importance of software design in general
120 .LP
121 Why should we design software at all?
122 It is common knowledge that even a bad plan is better than no plan.
123 Ignoring software design is programming without a plan.
124 This will almost certainly lead to horrible results.
125 .PP
126 The design of a piece of software is its internal and external shape.
127 The design talked about here has nothing to do with visual appearance.
128 If we see a program as a car, then its color does not matter.
129 Its design would be the car's size, its shape, the number and position of doors,
130 the ratio of passenger and cargo transport, and so forth.
131 .PP
132 A software's design is about quality properties.
133 Each of the cars may be able to drive from A to B,
134 but it depends on its properties whether it is a good car for passenger transport or not.
135 It also depends on its properties whether it is a good choice for a rough mountain area.
136 .PP
137 Requirements for software are twofold: functional and non-functional.
138 Functional requirements are easier to define and to verify.
139 They are directly the software's functions.
140 Functional requirements are the reason why software gets written.
141 Someone has a problem and needs a tool to solve it.
142 Being able to solve the problem is the main functional requirement.
143 It is the driving force behind all programming effort.
144 .PP
145 On the other hand, there are also non-functional requirements.
146 They are also called \fIquality\fP requirements.
147 The quality of a software is about properties that are not directly related to
148 the software's basic functions.
149 Quality aspects are about the properties that are overlooked at first sight.
150 .PP
151 Quality matters little when the software is initially built,
152 but it matters in the use and maintenance of the software.
153 A short-sighted person might see software development mainly as building something up.
154 Reality shows that building the software the first time is only a small part
155 of the overall work.
156 Bug fixing, extending, rebuilding of parts \(en in short: maintenance work \(en
157 soon takes over the major part of the time spent on a piece of software.
158 Not to forget the usage of the software.
159 These processes are highly influenced by the software's quality.
160 Thus, quality should never be neglected.
161 The problem is that you hardly ``stumble over'' bad quality during the first build,
162 but this is the time when you should care about good quality most.
163 .PP
164 Software design is not about the basic function of a software;
165 this requirement will get satisfied anyway, as it is the main driving force behind the development.
166 Software design is about quality aspects of the software.
167 Good design will lead to good quality, bad design to bad quality.
168 The primary functions of the software will be affected modestly by bad quality,
169 but good quality can provide a lot of additional gain from the software,
170 even at places where one never expected it.
171 .PP
172 The ISO/IEC 9126-1 standard, part 1,
173 .[
174 %I International Organization for Standardization
175 %T ISO Standard 9126: Software Engineering \(en Product Quality, part 1
176 %C Geneva
177 %D 2001
178 .]
179 defines the quality model as consisting of:
180 .IP \(bu
181 .I Functionality
182 (suitability, accuracy, inter\%operability, security)
183 .IP \(bu
184 .I Reliability
185 (maturity, fault tolerance, recoverability)
186 .IP \(bu
187 .I Usability
188 (understandability, learnability, operability, attractiveness)
189 .IP \(bu
190 .I Efficiency
191 (time behavior, resource utilization)
192 .IP \(bu
193 .I Maintainability
194 (analyzability, changeability, stability, testability)
195 .IP \(bu
196 .I Portability
197 (adaptability, installability, co-existence, replaceability)
198 .LP
199 These goals are parts of a software's design.
200 Good design can give these properties to a piece of software;
201 badly designed software will lack them.
202 .PP
203 One further goal of software design is consistency.
204 Consistency eases understanding, working on, and using things.
205 Consistent internals and consistent interfaces to the outside can be provided by good design.
206 .PP
207 We should design software because good design avoids many problems during a software's lifetime.
208 And we should design software because good design can offer much gain
209 that may be unrelated to the software's main intent.
210 Indeed, we should put much effort into good design to make the software more valuable.
211 The Unix Philosophy shows how to design software well.
212 It offers guidelines to achieve good quality and high gain for the effort spent.
215 .NH 1
216 The Unix Philosophy
217 .LP
218 The origins of the Unix Philosophy have already been introduced.
219 This chapter explains the philosophy, following Gancarz,
220 and shows concrete examples of its application.
222 .NH 2
223 Pipes
224 .LP
225 The following examples demonstrate what applied Unix Philosophy feels like.
226 Knowledge of using the Unix shell is assumed.
227 .PP
228 Counting the number of files in the current directory:
229 .DS I 2n
230 .CW
231 .ps -1
232 ls | wc -l
233 .DE
234 The
235 .CW ls
236 command lists all files in the current directory, one per line,
237 and
238 .CW "wc -l
239 counts the number of lines.
240 .PP
241 Counting the number of files that do not contain ``foo'' in their name:
242 .DS I 2n
243 .CW
244 .ps -1
245 ls | grep -v foo | wc -l
246 .DE
247 Here, the list of files is filtered by
248 .CW grep
249 to remove all that contain ``foo''.
250 The rest is the same as in the previous example.
251 .PP
252 Finding the five largest entries in the current directory:
253 .DS I 2n
254 .CW
255 .ps -1
256 du -s * | sort -nr | sed 5q
257 .DE
258 .CW "du -s *
259 returns the recursively summed sizes of all files
260 \(en no matter if they are regular files or directories.
261 .CW "sort -nr
262 sorts the list numerically in reverse order.
263 Finally,
264 .CW "sed 5q
265 quits after it has printed the fifth line.
266 .PP
267 The presented command lines are examples of what Unix people would use
268 to get the desired output.
269 There are also other ways to get the same output.
270 It is the user's decision which way to go.
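.PP
As a small illustration of such alternatives, the following sketch compares
two ways of cutting a sorted list after a fixed number of lines.
The function sample is hypothetical; it only stands in for the sizes
that a command like du would print, and three lines are kept instead of five
to keep the input short.

```shell
#!/bin/sh
# sample: hypothetical stand-in for a column of sizes (not a real tool).
sample() {
    echo 3; echo 10; echo 7; echo 1; echo 9; echo 8
}

# Two ways to keep only the three largest values.
with_sed=$(sample | sort -nr | sed 3q)
with_head=$(sample | sort -nr | head -n 3)

# Both pipelines are expected to yield the same lines.
[ "$with_sed" = "$with_head" ] && echo "same output"
```

Which of the two one types is a matter of habit; the pipeline around it stays the same.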
271 .PP
272 The examples show that many tasks on a Unix system
273 are accomplished by combining several small programs.
274 The connection between the single programs is denoted by the pipe operator `|'.
275 .PP
276 Pipes, and their extensive and easy use, are one of the great
277 achievements of the Unix system.
278 Pipes between programs were possible in earlier operating systems,
279 but they were never such a central part of the concept.
280 When, in the early seventies, Doug McIlroy introduced pipes for the
281 Unix system,
282 ``it was this concept and notation for linking several programs together
283 that transformed Unix from a basic file-sharing system to an entirely new way of computing.''
284 .[
285 %T Unix: An Oral History
286 %O .CW \s-1http://www.princeton.edu/~hos/frs122/unixhist/finalhis.htm
287 .]
288 .PP
289 Being able to specify pipelines in an easy way is,
290 however, not enough by itself.
291 It is only one half.
292 The other is the design of the programs that are used in the pipeline.
293 They have to provide interfaces that allow them to be used in such a way.
295 .NH 2
296 Interface design
297 .LP
298 Unix is, first of all, simple \(en everything is a file.
299 Files are sequences of bytes, without any special structure.
300 Programs should be filters, which read a stream of bytes from ``standard input'' (stdin)
301 and write a stream of bytes to ``standard output'' (stdout).
302 .PP
303 If the files \fIare\fP sequences of bytes,
304 and the programs \fIare\fP filters on byte streams,
305 then there is exactly one standardized data interface.
306 Thus it is possible to combine them in any desired way.
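.PP
A minimal sketch of what such a filter can look like, written as a shell function.
The name upcase is an assumption made up for this illustration, not a standard tool.

```shell
#!/bin/sh
# upcase: a hypothetical filter (not a standard tool).
# It reads bytes from stdin and writes the result to stdout, nothing else.
upcase() {
    tr '[:lower:]' '[:upper:]'
}

# Because every stage speaks the same interface, the stages combine freely.
result=$({ echo foo; echo bar; echo foo; } | sort | uniq | upcase)
echo "$result"
```

The function neither knows nor cares whether its input comes from a file, a terminal, or another program.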
307 .PP
308 Even a handful of small programs will yield a large set of combinations,
309 and thus a large set of different functions.
310 This is leverage!
311 If the programs are orthogonal to each other \(en the best case \(en
312 then the set of different functions is greatest.
313 .PP
314 Programs might also have a separate control interface,
315 besides their data interface.
316 The control interface is often called ``user interface'',
317 because it is usually designed to be used by humans.
318 The Unix Philosophy discourages assuming that the user is human.
319 Interactive use of software is slow use of software,
320 because the program waits for user input most of the time.
321 Interactive software requires the user to be in front of the computer
322 all the time.
323 Interactive software occupies the user's attention while it is running.
324 .PP
325 Now we come back to the idea of using several small programs, combined,
326 to have a more specific function.
327 If these individual tools were all interactive,
328 how would the user control them?
329 It is not only a problem to control several programs at once if they run at the same time,
330 it is also very inefficient to have to control each of the single programs
331 that are intended to work as one large program.
332 Hence, the Unix Philosophy discourages programs from demanding interactive use.
333 The behavior of programs should be defined at invocation.
334 This is done by specifying arguments (``command line switches'') to the program call.
335 Gancarz discusses this topic as ``avoid captive user interfaces''.
336 .[
337 %A Mike Gancarz
338 %T The UNIX Philosophy
339 %I Digital Press
340 %D 1995
341 %P 88 ff.
342 .]
343 .PP
344 Non-interactive use is, during development, also an advantage for testing.
345 Testing interactive programs is much more complicated
346 than testing non-interactive programs.
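.PP
To illustrate, a test for a non-interactive filter can be as small as the following sketch.
The function count_lines is a hypothetical program under test, assumed here only
for the sake of the example.

```shell
#!/bin/sh
# count_lines: hypothetical program under test; a thin wrapper around wc -l.
# tr removes the padding that some wc implementations add.
count_lines() {
    wc -l | tr -d ' '
}

# A test is fixed input in, then a comparison against the expected output.
expected=3
actual=$({ echo a; echo b; echo c; } | count_lines)
if [ "$actual" = "$expected" ]; then
    echo "ok"
else
    echo "FAIL: expected $expected, got $actual"
fi
```

No keystrokes have to be simulated; the whole test can itself run unattended.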
348 .NH 2
349 The toolchest approach
350 .LP
351 A toolchest is a set of tools.
352 Instead of having one big tool for all tasks, one has many small tools,
353 each for one task.
354 Difficult tasks are solved by combining several of the small, simple tools.
355 .PP
356 The Unix toolchest \fIis\fP a set of small, (mostly) non-interactive programs
357 that are filters on byte streams.
358 They are, to a large extent, unrelated in their function.
359 Hence, the Unix toolchest provides a large set of functions
360 that can be accessed by combining the programs in the desired way.
361 .PP
362 There are also advantages for developing small toolchest programs.
363 It is easier and less error-prone to write small programs.
364 It is also easier and less error-prone to write a large set of small programs,
365 than to write one large program with all the functionality included.
366 If the small programs are combinable, then they offer an even larger set
367 of functions than the single large program.
368 Hence, one gets two advantages out of writing small, combinable programs.
369 .PP
370 There are two drawbacks of the toolchest approach.
371 First, one simple, standardized, unidirectional interface has to be sufficient.
372 If one feels the need for more ``logic'' than a stream of bytes,
373 then a different approach might be needed.
374 But it is also possible that one simply cannot imagine a design where
375 a stream of bytes is sufficient.
376 By becoming more familiar with the ``Unix style of thinking'',
377 developers will more often and more easily find simple designs where
378 a stream of bytes is a sufficient interface.
379 .PP
380 The second drawback of a toolchest affects the users.
381 A toolchest is often more difficult to use for novices.
382 It is necessary to become familiar with each of the tools,
383 to be able to use the right one in a given situation.
384 Additionally, one needs to combine the tools in a sensible way on one's own.
385 This is like a sharp knife \(en it is a powerful tool in the hands of a master,
386 but of little value in the hands of an unskilled person.
387 .PP
388 However, learning a single, small tool of the toolchest is easier than
389 learning a complex tool.
390 The user will have a basic understanding of a yet unknown tool,
391 if the several tools of the toolchest have a common style.
392 He will be able to transfer knowledge about one tool to another.
393 .PP
394 Moreover, the second drawback can be removed easily by adding wrappers
395 around the single tools.
396 Novice users do not need to learn several tools if a professional wraps
397 the single commands into a more high-level script.
398 Note that the wrapper script still calls the small tools;
399 the wrapper script is just a skin around them.
400 No complexity is added this way,
401 but new programs can be created out of existing ones with very low effort.
402 .PP
403 A wrapper script for finding the five largest entries in the current directory
404 could look like this:
405 .DS I 2n
406 .CW
407 .ps -1
408 #!/bin/sh
409 du -s * | sort -nr | sed 5q
410 .DE
411 The script itself is just a text file that calls the command line
412 a professional user would type in directly.
413 Making the number of entries it prints configurable
414 is easy:
415 .DS I 2n
416 .CW
417 .ps -1
418 #!/bin/sh
419 num=5
420 [ $# -eq 1 ] && num="$1"
421 du -s * | sort -nr | sed "${num}q"
422 .DE
423 This script acts like the one before, when called without an argument.
424 But one can also specify a numerical argument to define the number of lines to print.
426 .NH 2
427 A powerful shell
428 .LP
429 It has already been said that the Unix shell makes it possible to
430 combine small programs into large ones easily.
431 A powerful shell is a great feature in other ways, too.
432 .PP
433 For instance, it includes a scripting language.
434 The control statements are built into the shell.
435 The functions, however, are the normal programs that everyone can use on the system.
436 Thus, the programs are known, so learning to program in the shell is easy.
437 Using normal programs as functions in the shell programming language
438 is only possible because they are small and combinable tools in a toolchest style.
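.PP
The following sketch illustrates this division of labor.
The file names and the search word are assumptions chosen for the example,
and mktemp, although widely available, is not part of every Unix system.

```shell
#!/bin/sh
# Scratch directory so the sketch does not touch real files.
# (mktemp -d is widely available but not part of POSIX.)
dir=$(mktemp -d)
{ echo foo; echo bar; } > "$dir/a.txt"
echo baz > "$dir/b.txt"

# "for" and "if" are built into the shell; grep is an ordinary program
# whose exit status the shell uses as the condition.
matches=0
for f in "$dir"/*.txt; do
    if grep -q foo "$f"; then
        matches=$((matches + 1))
    fi
done
echo "$matches"
rm -r "$dir"
```

The loop and the condition belong to the shell, while the actual work is done by a program
that is just as usable from the command line.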
439 .PP
440 The Unix shell encourages writing small scripts out of other programs,
441 because it is so easy to do.
442 This is a great step towards automation.
443 It is wonderful if the effort to automate a task equals the effort
444 it takes to do it the second time by hand.
445 If it is so, then the user will be happy to automate everything he does more than once.
446 .PP
447 Small programs that do one job well, standardized interfaces between them,
448 a mechanism to combine parts into larger parts, and an easy way to automate tasks:
449 together, these will inevitably produce software leverage.
450 Getting multiple times the benefit of an investment is a great offer.
451 .PP
452 The shell also encourages rapid prototyping.
453 Many well known programs started as quickly hacked shell scripts,
454 and turned into ``real'' programs, written in C, later.
455 Building a prototype first is a way to avoid the biggest problems
456 in application development.
457 Fred Brooks writes in ``No Silver Bullet'':
458 .[
459 %A Frederick P. Brooks, Jr.
460 %T No Silver Bullet: Essence and Accidents of Software Engineering
461 %B Information Processing 1986, the Proceedings of the IFIP Tenth World Computing Conference
462 %E H.-J. Kugler
463 %D 1986
464 %P 1069\(en1076
465 %I Elsevier Science B.V.
466 %C Amsterdam, The Netherlands
467 .]
468 .QP
469 The hardest single part of building a software system is deciding precisely what to build.
470 No other part of the conceptual work is so difficult as establishing the detailed
471 technical requirements, [...].
472 No other part of the work so cripples the resulting system if done wrong.
473 No other part is more difficult to rectify later.
474 .PP
475 Writing a prototype is a great method to become familiar with the requirements
476 and to actually run into real problems.
477 Today, prototyping is often seen as a first step in building software.
478 This is, of course, good.
479 However, the Unix Philosophy has an \fIadditional\fP perspective on prototyping:
480 After having built the prototype, one might notice that the prototype is already
481 \fIgood enough\fP.
482 Hence, a reimplementation in a more sophisticated programming language might not be needed
483 for the moment.
484 Maybe later, it might be necessary to rewrite the software, but not now.
485 .PP
486 By delaying further work, one keeps the flexibility to react easily to
487 changing requirements.
488 Software parts that are not written will not miss the requirements.
490 .NH 2
491 Worse is better
492 .LP
493 The Unix Philosophy aims for the 80% solution;
494 others call it the ``Worse is better'' approach.
495 .PP
496 First, practical experience shows that it is almost never possible to define the
497 requirements completely and correctly the first time.
498 Hence one should not try to; it will fail anyway.
499 Second, practical experience shows that requirements change over time.
500 Hence it is best to delay requirement-based design decisions as long as possible.
501 Also, the software should be small and flexible as long as possible
502 to react to changing requirements.
503 Shell scripts, for example, are more easily adjusted than C programs.
504 Third, practical experience shows that maintenance is hard work.
505 Hence, one should keep the amount of software as small as possible;
506 it should just fulfill the \fIcurrent\fP requirements.
507 Software parts that will be written later do not need maintenance now.
508 .PP
509 Starting with a prototype in a scripting language has several advantages:
510 .IP \(bu
511 As the initial effort is low, one will likely start right away.
512 .IP \(bu
513 As working parts are available soon, the real requirements can get identified soon.
514 .IP \(bu
515 When software is usable, it gets used, and thus tested.
516 Hence problems will be found at early stages of the development.
517 .IP \(bu
518 The prototype might be enough for the moment,
519 thus further work on the software can be delayed to a time
520 when one knows better about the requirements and problems,
521 than now.
522 .IP \(bu
523 Implementing now only the parts that are actually needed now
524 requires less maintenance work.
525 .IP \(bu
526 If the global situation changes so that the software is not needed anymore,
527 then less effort was spent on the project than would have been
528 if a different approach had been used.
530 .NH 2
531 Upgrowth and survival of software
532 .LP
533 So far, we have talked about \fIwriting\fP or \fIbuilding\fP software.
534 Although these are just verbs, they do imply a specific view on the work process
535 they describe.
536 The better verb, however, is to \fIgrow\fP.
537 .PP
538 Creating software in the sense of the Unix Philosophy is an incremental process.
539 It starts with a first prototype, which evolves as requirements change.
540 A quickly hacked shell script might become a large, sophisticated,
541 compiled program this way.
542 Its lifetime begins with the initial prototype and ends when the software is not used anymore.
543 While alive, it will be extended, rearranged, and rebuilt (from scratch).
544 Growing software matches the view that ``software is never finished. It is only released.''
545 .[
546 %O FIXME
547 %A Mike Gancarz
548 %T The UNIX Philosophy
549 %P 26
550 .]
551 .PP
552 Software can be seen as being controlled by evolutionary processes.
553 Successful software is software that is used by many for a long time.
554 This implies that the software is needed, useful, and better than alternatives.
555 Darwin talks about: ``The survival of the fittest.''
556 .[
557 %O FIXME
558 %A Charles Darwin
559 .]
560 Transferred to software: The most successful software is the fittest;
561 it is the one that survives.
562 (This may be at the level of one creature, or at the level of one species.)
563 The fitness of software is affected mainly by four properties:
564 portability of code, portability of data, range of usability, and reusability of parts.
565 .\" .IP \(bu
566 .\" portability of code
567 .\" .IP \(bu
568 .\" portability of data
569 .\" .IP \(bu
570 .\" range of usability
571 .\" .IP \(bu
572 .\" reuseability of parts
573 .PP
574 (1)
575 .I "Portability of code
576 means using high-level programming languages,
577 sticking to the standard,
578 and avoiding optimizations that introduce dependencies on specific hardware.
579 Hardware has a much shorter lifetime than software.
580 By chaining software to specific hardware,
581 the software's lifetime gets shortened to that of this hardware.
582 In contrast, software should be easy to port \(en
583 adaptation is the key to success.
584 .\" cf. practice of prog: ch08
585 .PP
586 (2)
587 .I "Portability of data
588 is best achieved by avoiding binary representations
589 to store data, because binary representations differ from machine to machine.
590 Textual representation is favored.
591 Historically, ASCII was the charset of choice.
592 In the future, UTF-8 might be the better choice, however.
593 What matters is that it is a plain text representation in a
594 very common charset encoding.
595 Apart from being able to transfer data between machines,
596 readable data has the great advantage that humans are able
597 to directly edit it with text editors and other tools from the Unix toolchest.
598 .\" gancarz tenet 5
599 .PP
600 (3)
601 A large
602 .I "range of usability
603 ensures good adaptation, and thus good survival.
604 It is a special distinction if software comes to be used in fields
605 that the original authors never imagined.
606 Software that solves problems in a general way will likely be used
607 for all kinds of similar problems.
608 Being too specific limits the range of uses.
609 Requirements change through time, thus use cases change or even vanish.
610 A good example of this is Allman's sendmail.
611 Allman identifies flexibility as one major reason for sendmail's success:
612 .[
613 %O FIXME
614 %A Allman
615 %T sendmail
616 .]
617 .QP
618 Second, I limited myself to the routing function [...].
619 This was a departure from the dominant thought of the time, [...].
620 .QP
621 Third, the sendmail configuration file was flexible enough to adapt
622 to a rapidly changing world [...].
623 .LP
624 Successful software adapts itself to the changing world.
625 .PP
626 (4)
627 .I "Reuse of parts
628 is even one step further.
629 A software may completely lose its field of action,
630 but parts of which the software is built may be general and independent enough
631 to survive this death.
632 If software is built by combining small independent programs,
633 then there are parts readily available for reuse.
634 Who cares if the large program is a failure,
635 but parts of it become successful instead?
637 .NH 2
638 Summary
639 .LP
640 This chapter explained the central ideas of the Unix Philosophy.
641 For each of the ideas, the advantages it introduces were shown.
642 The Unix Philosophy consists of guidelines that help to write valuable software.
643 From the view point of a software developer or software designer,
644 the Unix Philosophy provides answers to many software design problems.
645 .PP
646 The various ideas of the Unix Philosophy are closely interwoven
647 and can hardly be applied independently.
648 However, probably the most important messages are:
649 .I "``Do one thing well!''" ,
650 .I "``Keep it simple!''" ,
651 and
652 .I "``Use software leverage!''
656 .NH 1
657 Case study: \s-1MH\s0
658 .LP
659 The previous chapter introduced and explained the Unix Philosophy
660 from a general point of view.
661 The driving force was the guidelines; references to
662 existing software were given only sparsely.
663 In this and the next chapter, concrete software will be
664 the driving force in the discussion.
665 .PP
666 This first case study is about the mail user agents (\s-1MUA\s0)
667 \s-1MH\s0 (``mail handler'') and its descendant \fInmh\fP
668 (``new mail handler'').
669 \s-1MUA\s0s provide functions to read, compose, and organize mail,
670 but (ideally) not to transfer.
671 In this document, the name \s-1MH\s0 will be used for both of them.
672 A distinction will only be made if differences between
673 them are described.
676 .NH 2
677 Historical background
678 .LP
679 Electronic mail was available in Unix very early.
680 The first \s-1MUA\s0 on Unix was \f(CWmail\fP,
681 which was already present in the First Edition.
682 .[
683 %A Peter H. Salus
684 %T A Quarter Century of UNIX
685 %D 1994
686 %I Addison-Wesley
687 %P 41 f.
688 .]
689 It was a small program that either printed the user's mailbox file
690 or appended text to someone else's mailbox file,
691 depending on the command line arguments.
692 .[
693 %O http://cm.bell-labs.com/cm/cs/who/dmr/pdfs/man12.pdf
694 .]
695 It was a program that did one job well.
696 This job was emailing, which was very simple then.
697 .PP
698 Later, emailing became more powerful, and thus more complex.
699 The simple \f(CWmail\fP, which knew nothing of subjects,
700 independent handling of single messages,
701 and long-time storage of them, was not powerful enough anymore.
702 At Berkeley, Kurt Shoens wrote \fIMail\fP (with capital `M')
703 in 1978 to provide additional functions for emailing.
704 Mail was still one program, but now it was large and did
705 several jobs.
706 Its user interface is modeled after the one of \fIed\fP.
707 It is designed for humans, but is still scriptable.
708 \fImailx\fP is the adaptation of Berkeley Mail into System V.
709 .[
710 %A Gunnar Ritter
711 %O http://heirloom.sourceforge.net/mailx_history.html
712 .]
713 Elm, pine, mutt, and a whole bunch of graphical \s-1MUA\s0s
714 followed Mail's direction.
715 They are large, monolithic programs which include all emailing functions.
716 .PP
717 A different way was taken by the people of \s-1RAND\s0 Corporation.
718 In the beginning, they also had used a monolithic mail system,
719 called \s-1MS\s0 (for ``mail system'').
720 But in 1977, Stockton Gaines and Norman Shapiro
721 came up with a proposal of a new email system concept \(en
722 one that honors the Unix Philosophy.
723 The concept was implemented by Bruce Borden in 1978 and 1979.
724 This was the birth of \s-1MH\s0 \(en the ``mail handler''.
725 .PP
726 Since then, \s-1RAND\s0, the University of California at Irvine and
727 at Berkeley, and several others have contributed to the software.
728 However, its core concepts remained the same.
729 In the late 90s, when development of \s-1MH\s0 slowed down,
730 Richard Coleman started with \fInmh\fP, the new mail handler.
731 His goal was to improve \s-1MH\s0, especially with regard to
732 the requirements of modern emailing.
733 Today, nmh is developed by various people on the Internet.
734 .[
735 %T RAND and the Information Evolution: A History in Essays and Vignettes
736 %A Willis H. Ware
737 %D 2008
738 %I The RAND Corporation
739 %P 128\(en137
740 %O .CW \s-1http://www.rand.org/pubs/corporate_pubs/CP537/
741 .]
742 .[
743 %T MH & xmh: Email for Users & Programmers
744 %A Jerry Peek
745 %D 1991, 1992, 1995
746 %I O'Reilly & Associates, Inc.
747 %P Appendix B
748 %O Also available online: \f(CW\s-2http://rand-mh.sourceforge.net/book/\fP
749 .]
751 .NH 2
752 Contrasts to monolithic mail systems
753 .LP
754 All \s-1MUA\s0s are monolithic, except \s-1MH\s0.
755 Although there might actually exist further, little-known,
756 toolchest \s-1MUA\s0s, this statement reflects the situation pretty well.
757 .PP
758 Monolithic \s-1MUA\s0s gather all their functions in one program.
759 In contrast, \s-1MH\s0 is a toolchest of many small tools \(en one for each job.
760 The following is a list of important programs of \s-1MH\s0's toolchest
761 and their function.
762 It gives a feeling of what the toolchest looks like.
763 .IP \(bu
764 .CW inc :
765 incorporate new mail (this is how mail enters the system)
766 .IP \(bu
767 .CW scan :
768 list messages in folder
769 .IP \(bu
770 .CW show :
771 show message
772 .IP \(bu
773 .CW next\fR/\fPprev :
774 show next/previous message
775 .IP \(bu
776 .CW folder :
777 change current folder
778 .IP \(bu
779 .CW refile :
780 refile message into folder
781 .IP \(bu
782 .CW rmm :
783 remove message
784 .IP \(bu
785 .CW comp :
786 compose a new message
787 .IP \(bu
788 .CW repl :
789 reply to a message
790 .IP \(bu
791 .CW forw :
792 forward a message
793 .IP \(bu
794 .CW send :
795 send a prepared message (this is how mail leaves the system)
796 .LP
797 \s-1MH\s0 has no special user interface like monolithic \s-1MUA\s0s have.
798 The user does not leave the shell to run \s-1MH\s0,
799 but he uses the various \s-1MH\s0 programs within the shell.
800 Using a monolithic program with a captive user interface
801 means ``entering'' the program, using it, and ``exiting'' the program.
802 Using toolchests like \s-1MH\s0 means running programs,
803 alone or in combination with others, even from other toolchests,
804 without leaving the shell.
806 .NH 2
807 Data storage
808 .LP
809 \s-1MH\s0's mail storage is (little more than) a directory tree
810 where mail folders are directories and mail messages are text files.
811 Working with \s-1MH\s0's toolchest is much like working
812 with Unix' toolchest:
813 \f(CWscan\fP is like \f(CWls\fP,
814 \f(CWshow\fP is like \f(CWcat\fP,
815 \f(CWfolder\fP is like \f(CWcd\fP/\f(CWpwd\fP,
816 \f(CWrefile\fP is like \f(CWmv\fP,
817 and \f(CWrmm\fP is like \f(CWrm\fP.
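.PP
Because folders are plain directories and messages are plain text files,
the standard Unix tools work on the mail storage as well.
The following sketch builds a mock folder by hand and inspects it with
\f(CWls\fP, \f(CWgrep\fP, and \f(CWcat\fP.
The path, the file names, and the message contents are made up for
illustration; a real mail storage is created and maintained by \s-1MH\s0 itself.

```shell
#!/bin/sh
# Mock MH-style folder: a directory holding numbered text files.
# Hypothetical layout for illustration only; not created by MH.
maildir=$(mktemp -d)
trap 'rm -rf "$maildir"' EXIT
mkdir -p "$maildir/inbox"

printf 'From: alice@example.org\nSubject: lunch\n\nNoon?\n' > "$maildir/inbox/1"
printf 'From: bob@example.org\nSubject: unix\n\nPipes.\n' > "$maildir/inbox/2"

# Roughly what scan does: list the messages in the folder.
listing=$(ls "$maildir/inbox" | sort -n)
echo "$listing"

# Select messages by header with grep; no mail-specific tool is needed.
match=$(cd "$maildir/inbox" && grep -l '^Subject: unix' *)
echo "$match"

# Roughly what show does: print one message.
cat "$maildir/inbox/$match"
```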
818 .PP
819 The context of tools in Unix is mainly the current working directory,
820 the user identification, and the environment variables.
821 \s-1MH\s0 extends this context by two more items:
822 .IP \(bu
823 The current mail folder, which is similar to the current working directory.
824 For mail folders, \f(CWfolder\fP provides the corresponding functionality
825 of \f(CWpwd\fP and \f(CWcd\fP for directories.
826 .IP \(bu
827 The current message, relative to the current mail folder,
828 which enables commands like \f(CWnext\fP and \f(CWprev\fP.
829 .LP
830 In contrast to Unix' context, which is chained to the shell session,
831 \s-1MH\s0's context is meant to be chained to a mail account.
832 But actually, the current message is a property of the mail folder,
833 which appears to be a legacy.
834 This will cause problems when multiple users work
835 in one mail folder simultaneously.
836 .PP
837 .I "Data storage.
838 How \s-1MH\s0 stores data was already mentioned.
839 Mail folders are directories (which contain a file
840 \&\f(CW.mh_sequences\fP) under the user's \s-1MH\s0 directory
841 (usually \f(CW$HOME/Mail\fP).
842 Mail messages are text files located in mail folders.
843 The files contain the messages as they were received.
844 The messages are numbered in ascending order in each folder.
845 This mailbox format is called ``\s-1MH\s0'' after the \s-1MUA\s0.
846 Alternatives are \fImbox\fP and \fImaildir\fP.
847 In the mbox format all messages are stored within one file.
848 This was a good solution in the early days, when messages
849 were only a few lines of text and were deleted soon.
850 Today, when single messages often include several megabytes
851 of attachments, it is a bad solution.
852 Another disadvantage of the mbox format is that it is
853 more difficult to write tools that work on mail messages,
854 because it is always necessary to first find and extract
855 the relevant message in the mbox file.
856 With the \s-1MH\s0 mailbox format,
857 each message is a self-contained item, by definition.
858 Also, the problem of concurrent access to one mailbox is
859 reduced to the problem of concurrent access to one message.
860 However, the issue of the shared parts of the context,
861 as mentioned above, remains.
862 Maildir is generally similar to \s-1MH\s0's format,
863 but modified towards guaranteed reliability.
864 This involves some complexity, unfortunately.
867 .NH 2
868 Discussion of the design
869 .LP
870 The following paragraphs discuss \s-1MH\s0 in regard to the tenets
871 of the Unix Philosophy which Gancarz identified.
873 .PP
874 .I "``Small is beautiful''
875 and
876 .I "``do one thing well''
877 are two design goals that are directly visible in \s-1MH\s0.
878 Gancarz actually presents \s-1MH\s0 as an example under the headline
879 ``Making UNIX Do One Thing Well'':
880 .QP
881 [\s-1MH\s0] consists of a series of programs which
882 when combined give the user an enormous ability
883 to manipulate electronic mail messages.
884 A complex application, it shows that not only is it
885 possible to build large applications from smaller
886 components, but also that such designs are actually preferable.
887 .[
888 %A Mike Gancarz
889 %T unix-phil
890 %P 125
891 .]
892 .LP
893 The various small programs of \s-1MH\s0 were relatively easy
894 to write, because each of them is small, limited to one function,
895 and has clear boundaries.
896 For the same reasons, they are also easy to maintain.
897 Furthermore, the system can easily be extended.
898 One only needs to put a new program into the toolchest.
899 This was done, for instance, when \s-1MIME\s0 support was added
900 (e.g. \f(CWmhbuild\fP).
901 Also, different programs can exist to do basically the same job
902 in different ways (e.g. in nmh: \f(CWshow\fP and \f(CWmhshow\fP).
903 If someone needs a mail system with some additional
904 functions that are not available anywhere yet, the best choice is a
905 toolchest system like \s-1MH\s0, to which the
906 functionality can be added with little work.
908 .PP
909 .I "Store data in flat text files.
910 FIXME
912 .PP
913 .I "``Avoid captive user interfaces.''
914 \s-1MH\s0 is perfectly suited for non-interactive use.
915 It offers all functions directly and without captive user interfaces.
916 If, nonetheless, users want a graphical user interface,
917 they can have it with \fIxmh\fP or \fIexmh\fP, too.
918 These are graphical frontends for the \s-1MH\s0 toolchest.
919 This means, all email-related work is still done by \s-1MH\s0 tools,
920 but the frontend issues the appropriate calls when the user
921 clicks on buttons.
922 Providing easy-to-use user interfaces in the form of frontends is a good
923 approach, because it does not limit the power of the backend itself.
924 In any case, a frontend can only make a subset of the
925 backend's power and flexibility available to the user.
926 But if it is a separate program,
927 then the missing parts can still be accessed at the backend directly.
928 If it is integrated, then this will hardly be possible.
929 Furthermore, it is possible to have different frontends to the same
930 backend.
932 .PP
933 .I "``Choose portability over efficiency''
934 and
935 .I "``use shell scripts to increase leverage and portability'' .
936 These two tenets are indirectly, but nicely, demonstrated by
937 Bolsky and Korn in their book about the Korn Shell.
938 .[
939 %T The KornShell: command and programming language
940 %A Morris I. Bolsky
941 %A David G. Korn
942 %I Prentice Hall
943 %D 1989
944 %P 254\(en290
945 %O \s-1ISBN\s0: 0-13-516972-0
946 .]
947 They demonstrated, in chapter 18 of the book, a basic implementation
948 of a subset of \s-1MH\s0 in ksh scripts.
949 Of course, this was just a demonstration, but a brilliant one.
950 It shows how quickly one can implement such a prototype with shell scripts,
951 and how readable they are.
952 The implementation in the scripting language may not be very fast,
953 but it can still be fast enough, and this is all that matters.
954 By having the code in an interpreted language, like the shell,
955 portability becomes a minor issue, if we assume the interpreter
956 to be widespread.
957 This demonstration also shows how easy it is to create the individual programs
958 of a toolchest.
959 There are eight tools (two of them have multiple names) and 16 functions
960 with supporting code.
961 Each tool comprises between 12 and 38 lines of ksh,
962 in total about 200 lines.
963 The functions comprise between 3 and 78 lines of ksh,
964 in total about 450 lines.
965 Such small software is easy to write, easy to understand,
966 and thus easy to maintain.
967 A toolchest makes it possible to write only some of the parts
968 and still get a working result.
969 Expanding the toolchest without global changes will likely be
970 possible, too.
972 .PP
973 .I "``Use software leverage to your advantage''
974 and the lesser tenet
975 .I "``allow the user to tailor the environment''
976 are ideally followed in the design of \s-1MH\s0.
977 Tailoring the environment is heavily encouraged by the ability to
978 directly define default options to programs.
979 It is even possible to define different default options
980 depending on the name under which the program was called.
981 Software leverage is heavily encouraged by how easy it is to
982 create shell scripts that run a specific command line,
983 built of several \s-1MH\s0 programs.
984 Few software packages encourage users as strongly as \s-1MH\s0 does
985 to tailor their environment and to leverage the use of the software.
986 To give just one example:
987 One might prefer a different listing format for the \f(CWscan\fP
988 program.
989 It is possible to take one of the distributed format files
990 or to write one yourself.
991 To use the format as default for \f(CWscan\fP, a single line,
992 reading
993 .DS
994 .CW
995 scan: -form FORMATFILE
996 .DE
997 must be added to \f(CW.mh_profile\fP.
998 If one wants this different format as an additional command,
999 instead of changing the default, one needs to create a link to
1000 \f(CWscan\fP, for instance named \f(CWscan2\fP.
1001 The line in \f(CW.mh_profile\fP would then start with \f(CWscan2\fP,
1002 as the option should only be in effect when scan is called as
1003 \f(CWscan2\fP.
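.PP
The idea behind name-dependent defaults can be sketched in a few lines of shell.
This is not \s-1MH\s0's actual implementation.
The script \f(CWtool\fP, the profile file, and the format file names are
hypothetical and only mimic the lookup of default options by the name
under which a program was called:

```shell
#!/bin/sh
# Sketch of name-dependent defaults (hypothetical; not MH's real code).
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT

# A mock profile: one line of default options per program name.
cat > "$dir/profile" <<'EOF'
tool: -form default.format
tool2: -form compact.format
EOF

# A mock program that looks up its defaults by the name it was called under.
cat > "$dir/tool" <<'EOF'
#!/bin/sh
name=$(basename "$0")
profile=$(dirname "$0")/profile
# Print the options that follow "name: " in the profile.
sed -n "s/^$name: //p" "$profile"
EOF
chmod +x "$dir/tool"

# The second command is just a link to the first one.
ln -s tool "$dir/tool2"

out1=$("$dir/tool")
out2=$("$dir/tool2")
echo "tool:  $out1"
echo "tool2: $out2"
```
.LP
Both names run the same code; only the profile line that matches the
invoked name decides which options are in effect.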
1005 .PP
1006 .I "``Make every program a filter''
1007 is hard to find in \s-1MH\s0.
1008 The reason is that most of \s-1MH\s0's tools provide
1009 basic file system operations for the mailboxes.
1010 For the same reason,
1011 \f(CWls\fP, \f(CWcp\fP, \f(CWmv\fP, and \f(CWrm\fP
1012 are not filters either.
1013 However, they build a basis on which filters can operate.
1014 \s-1MH\s0 does not provide many filters itself, but it is a basis
1015 on which filters can be written.
1016 An example would be a mail message text highlighter,
1017 that is, a program that makes use of a color terminal to display
1018 header lines, quotations, and signatures in distinct colors.
1019 The author's version of this program, for instance,
1020 is a 25-line awk script.
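.PP
A highlighter of this kind could look like the following sketch.
It is not the author's script.
The classification rules (header lines up to the first empty line,
quotations starting with a greater-than sign, the signature after a line
consisting of two dashes and a space) and the \s-1ANSI\s0 colors are
assumptions chosen for illustration:

```shell
#!/bin/sh
# Sketch of a mail highlighter written as a filter.
# The rules and colors are assumptions, not the author's script.
highlight() {
    awk '
    BEGIN { inheader = 1; insig = 0 }
    inheader && /^$/   { inheader = 0; print; next }  # empty line ends the header
    !inheader && /^-- $/ { insig = 1 }                # signature separator
    inheader { printf "\033[32m%s\033[0m\n", $0; next }  # header: green
    insig    { printf "\033[34m%s\033[0m\n", $0; next }  # signature: blue
    /^>/     { printf "\033[33m%s\033[0m\n", $0; next }  # quotation: yellow
    { print }                                         # plain body text
    '
}

# A made-up sample message; printf keeps the trailing space of "-- ".
sample=$(printf 'From: alice@example.org\nSubject: lunch\n\n> Noon?\nFine with me.\n-- \nAlice')

colored=$(printf '%s\n' "$sample" | highlight)
printf '%s\n' "$colored"
```
.LP
Since it reads standard input and writes standard output,
such a filter can be fed with any message file directly.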
1022 .PP
1023 .I "``Build a prototype as soon as possible''
1024 was again well followed by \s-1MH\s0.
1025 This tenet, of course, focuses on early development, which was
1026 a long time ago for \s-1MH\s0.
1027 But without following this guideline at the very beginning,
1028 Bruce Borden might not have convinced the management of \s-1RAND\s0
1029 to ever create \s-1MH\s0.
1030 In Bruce's own words:
1031 .QP
1032 [...] but they [Stockton Gaines and Norm Shapiro] were not able
1033 to convince anyone that such a system would be fast enough to be usable.
1034 I proposed a very short project to prove the basic concepts,
1035 and my management agreed.
1036 Looking back, I realize that I had been very lucky with my first design.
1037 Without nearly enough design work,
1038 I built a working environment and some header files
1039 with key structures and wrote the first few \s-1MH\s0 commands:
1040 inc, show/next/prev, and comp.
1041 [...]
1042 With these three, I was able to convince people that the structure was viable.
1043 This took about three weeks.
1044 .[
1045 %O FIXME
1046 .]
1048 .NH 2
1049 Problems
1050 .LP
1051 \s-1MH\s0 is certainly not without problems.
1052 There are two main problems: one is technical, the other is about human behavior.
1053 .PP
1054 \s-1MH\s0 is old, and email today is very different from email at the time
1055 when \s-1MH\s0 was designed.
1056 \s-1MH\s0 adapted to the changes pretty well, but it is limited,
1057 for example in development resources.
1058 \s-1MIME\s0 support and support for different character encodings
1059 are available, but only at a moderate level.
1060 More active developers could quickly improve this.
1061 It is also limited by design, which is the larger problem.
1062 \s-1IMAP\s0, for example, conflicts with \s-1MH\s0's design to a large extent.
1063 These design conflicts are not easily solvable.
1064 Possibly, they require a redesign.
1065 Maybe \s-1IMAP\s0 is too different from the classic mail model which \s-1MH\s0 covers,
1066 so that \s-1MH\s0 may never work well with \s-1IMAP\s0.
1067 .PP
1068 The other kind of problem is human habits.
1069 When in this world almost all \s-1MUA\s0s are monolithic,
1070 it is very difficult to convince people to use a toolbox-style \s-1MUA\s0
1071 like \s-1MH\s0.
1072 The habits are so strong that even people who understand the concept
1073 and advantages of \s-1MH\s0 do not like to switch,
1074 simply because \s-1MH\s0 is different.
1075 Unfortunately, the frontends to \s-1MH\s0, which could provide a familiar look and feel,
1076 are quite outdated and thus not very appealing compared to the modern interfaces
1077 which monolithic \s-1MUA\s0s offer.
1079 .NH 2
1080 Summary \s-1MH\s0
1081 .LP
1082 flexibility, no redundancy, use the shell
1086 .NH 1
1087 Case study: uzbl
1089 .NH 2
1090 History
1091 .LP
1092 uzbl is young
1094 .NH 2
1095 Contrasts to similar sw
1096 .LP
1097 like with nmh
1098 .LP
1099 addons, plugins, modules
1101 .NH 2
1102 Gains of the design
1103 .LP
1105 .NH 2
1106 Problems
1107 .LP
1108 broken web
1112 .NH 1
1113 Final thoughts
1115 .NH 2
1116 Quick summary
1117 .LP
1118 good design
1119 .LP
1120 unix phil
1121 .LP
1122 case studies
1124 .NH 2
1125 Why people should choose
1126 .LP
1127 Make the right choice!
1129 .nr PI .5i
1130 .rm ]<
1131 .de ]<
1132 .LP
1133 .de FP
1134 .IP \\\\$1.
1135 \\..
1136 .rm FS FE
1137 ..
1138 .SH
1139 References
1140 .[
1141 $LIST$
1142 .]
1143 .wh -1p