docs/unix-phil

view unix-phil.ms @ 34:0b2cf026d93d

rework of MH's data storage section
author meillo@marmaro.de
date Thu, 25 Mar 2010 22:18:55 +0100
1 .\".if n .pl 1000i
2 .\".nr PS 11
3 .\".nr VS 13
4 .de XX
5 .pl 1v
6 ..
7 .em XX
8 .\".nr PI 0
9 .\".if t .nr PD .5v
10 .\".if n .nr PD 1v
11 .nr lu 0
12 .de CW
13 .nr PQ \\n(.f
14 .if t .ft CW
15 .ie ^\\$1^^ .if n .ul 999
16 .el .if n .ul 1
17 .if t .if !^\\$1^^ \&\\$1\f\\n(PQ\\$2
18 .if n .if \\n(.$=1 \&\\$1
19 .if n .if \\n(.$>1 \&\\$1\c
20 .if n .if \\n(.$>1 \&\\$2
21 ..
22 .ds [. \ [
23 .ds .] ]
24 .\"----------------------------------------
25 .TL
26 Why the Unix Philosophy still matters
27 .AU
28 markus schnalke <meillo@marmaro.de>
29 .AB
30 .ti \n(.iu
31 This paper discusses the importance of the Unix Philosophy in software design.
32 Today, few software designers are aware of these concepts,
33 and thus most modern software is limited and does not make use of software leverage.
34 Knowing and following the tenets of the Unix Philosophy makes software more valuable.
35 .AE
37 .\".if t .2C
39 .FS
40 .ps -1
41 This paper was prepared for the seminar ``Software Analysis'' at University Ulm.
42 Mentor was professor Schweiggert. 2010-02-05
43 .br
44 You may get this document from my website
45 .CW \s-1http://marmaro.de/docs
46 .FE
48 .NH 1
49 Introduction
50 .LP
51 Building software is a process that leads from an idea of the software's purpose
52 to its release.
53 No matter \fIhow\fP the process is run, two things are common:
54 the initial idea and the release.
55 The process in between can be of any shape.
56 The maintenance work after the release is ignored for the moment.
57 .PP
58 The process of building splits mainly into two parts:
59 the planning of what and how to build, and implementing the plan by writing code.
60 This paper focuses on the planning part \(en the designing of the software.
61 .PP
62 Software design is the plan of how the internals and externals of the software should look,
63 based on the requirements.
64 This paper discusses the recommendations of the Unix Philosophy about software design.
65 .PP
66 The ideas discussed here can be applied within any development process.
67 The Unix Philosophy does make recommendations on how the software development process should look,
68 but this is not of concern here.
69 Similarly, the question of how to write the code is out of scope.
70 .PP
71 The name ``Unix Philosophy'' has already been mentioned several times, but it has not been explained yet.
72 The Unix Philosophy is the essence of how the Unix operating system and its toolchest were designed.
73 It is not a fixed set of rules, but what people see as common to typical Unix software.
74 Several people have stated their view on the Unix Philosophy.
75 Best known are:
76 .IP \(bu
77 Doug McIlroy's summary: ``Write programs that do one thing and do it well.''
78 .[
79 %A M. D. McIlroy
80 %A E. N. Pinson
81 %A B. A. Tague
82 %T UNIX Time-Sharing System: Foreword
83 %J The Bell System Technical Journal
84 %D 1978
85 %V 57
86 %N 6
87 %P 1902
88 .]
89 .IP \(bu
90 Mike Gancarz' book ``The UNIX Philosophy''.
91 .[
92 %A Mike Gancarz
93 %T The UNIX Philosophy
94 %D 1995
95 %I Digital Press
96 .]
97 .IP \(bu
98 Eric S. Raymond's book ``The Art of UNIX Programming''.
99 .[
100 %A Eric S. Raymond
101 %T The Art of UNIX Programming
102 %D 2003
103 %I Addison-Wesley
104 %O .CW \s-1http://www.faqs.org/docs/artu/
105 .]
106 .LP
107 These different views on the Unix Philosophy have much in common.
108 In particular, the main concepts are similar in all of them.
109 But there are also points on which they differ.
110 This only underlines what the Unix Philosophy is:
111 A retrospective view on the main concepts of Unix software;
112 especially those that were successful and unique to Unix.
113 .\" really?
114 .PP
115 Before we look at concrete concepts,
116 we discuss why software design is important
117 and what problems bad design introduces.
120 .NH 1
121 Importance of software design in general
122 .LP
123 Why should we design software at all?
124 It is general knowledge that even a bad plan is better than no plan.
125 Ignoring software design is programming without a plan.
126 This will almost certainly lead to horrible results.
127 .PP
128 The design of software is its internal and external shape.
129 The design talked about here has nothing to do with visual appearance.
130 If we see a program as a car, then its color does not matter.
131 Its design would be the car's size, its shape, the number and position of doors,
132 the ratio of passenger and cargo transport, and so forth.
133 .PP
134 A software's design is about quality properties.
135 Each of the cars may be able to drive from A to B,
136 but it depends on its properties whether it is a good car for passenger transport or not.
137 It also depends on its properties if it is a good choice for a rough mountain area.
138 .PP
139 Requirements for software are twofold: functional and non-functional.
140 Functional requirements are easier to define and to verify.
141 They directly describe the software's functions.
142 Functional requirements are the reason why software gets written.
143 Someone has a problem and needs a tool to solve it.
144 Being able to solve the problem is the main functional requirement.
145 It is the driving force behind all programming effort.
146 .PP
147 On the other hand, there are also non-functional requirements.
148 They are called \fIquality\fP requirements, too.
149 The quality of a software is about properties that are not directly related to
150 the software's basic functions.
151 Quality aspects are about the properties that are overlooked at first sight.
152 .PP
153 Quality matters little when the software is initially built,
154 but it matters a lot in the usage and maintenance of the software.
155 A short-sighted person might see developing software mainly as building something up.
156 Reality shows that building the software the first time is only a small part
157 of the overall work.
158 Bug fixing, extending, rebuilding of parts \(en in short: maintenance work \(en
159 soon takes over the major part of the time spent on the software.
160 The usage of the software must not be forgotten either.
161 These processes are highly influenced by the software's quality.
162 Thus, quality should never be neglected.
163 The problem is that you hardly ``stumble over'' bad quality during the first build,
164 but this is the time when you should care about good quality most.
165 .PP
166 Software design is not about the basic function of a software;
167 this requirement will get satisfied anyway, as it is the main driving force behind the development.
168 Software design is about quality aspects of the software.
169 Good design will lead to good quality, bad design to bad quality.
170 The primary functions of the software will be affected modestly by bad quality,
171 but good quality can provide a lot of additional gain from the software,
172 even at places where one never expected it.
173 .PP
174 The ISO/IEC 9126-1 standard, part 1,
175 .[
176 %I International Organization for Standardization
177 %T ISO Standard 9126: Software Engineering \(en Product Quality, part 1
178 %C Geneva
179 %D 2001
180 .]
181 defines the quality model as consisting of:
182 .IP \(bu
183 .I Functionality
184 (suitability, accuracy, inter\%operability, security)
185 .IP \(bu
186 .I Reliability
187 (maturity, fault tolerance, recoverability)
188 .IP \(bu
189 .I Usability
190 (understandability, learnability, operability, attractiveness)
191 .IP \(bu
192 .I Efficiency
193 (time behavior, resource utilization)
194 .IP \(bu
195 .I Maintainability
196 (analyzability, changeability, stability, testability)
197 .IP \(bu
198 .I Portability
199 (adaptability, installability, co-existence, replaceability)
200 .LP
201 These goals are parts of a software's design.
202 Good design can give these properties to software;
203 badly designed software will lack them.
204 .PP
205 One further goal of software design is consistency.
206 Consistency eases understanding, working on, and using things.
207 Consistent internals and consistent interfaces to the outside can be provided by good design.
208 .PP
209 We should design software because good design avoids many problems during a software's lifetime.
210 And we should design software because good design can offer much gain
211 that can be unrelated to the software's main intent.
212 Indeed, we should put much effort into good design to make the software more valuable.
213 The Unix Philosophy shows how to design software well.
214 It offers guidelines to achieve good quality and high gain for the effort spent.
217 .NH 1
218 The Unix Philosophy
219 .LP
220 The origins of the Unix Philosophy have already been introduced.
221 This chapter explains the philosophy, following Gancarz,
222 and shows concrete examples of its application.
224 .NH 2
225 Pipes
226 .LP
227 The following examples demonstrate what applied Unix Philosophy feels like.
228 Knowledge of using the Unix shell is assumed.
229 .PP
230 Counting the number of files in the current directory:
231 .DS I 2n
232 .CW
233 .ps -1
234 ls | wc -l
235 .DE
236 The
237 .CW ls
238 command lists all files in the current directory, one per line,
239 and
240 .CW "wc -l
241 counts the number of lines.
242 .PP
243 Counting the number of files that do not contain ``foo'' in their name:
244 .DS I 2n
245 .CW
246 .ps -1
247 ls | grep -v foo | wc -l
248 .DE
249 Here, the list of files is filtered by
250 .CW grep
251 to remove all that contain ``foo''.
252 The rest is the same as in the previous example.
253 .PP
254 Finding the five largest entries in the current directory:
255 .DS I 2n
256 .CW
257 .ps -1
258 du -s * | sort -nr | sed 5q
259 .DE
260 .CW "du -s *
261 returns the recursively summed sizes of all files
262 \(en no matter if they are regular files or directories.
263 .CW "sort -nr
264 sorts the list numerically in reverse order.
265 Finally,
266 .CW "sed 5q
267 quits after it has printed the fifth line.
268 .PP
269 The presented command lines are examples of what Unix people would use
270 to get the desired output.
271 There are also other ways to get the same output.
272 It's a user's decision which way to go.
273 .PP
274 The examples show that many tasks on a Unix system
275 are accomplished by combining several small programs.
276 The connection between the single programs is denoted by the pipe operator `|'.
277 .PP
278 Pipes, and their extensive and easy use, are one of the great
279 achievements of the Unix system.
280 Pipes between programs were possible in earlier operating systems,
281 but they had never been such a central part of the concept.
282 When, in the early seventies, Doug McIlroy introduced pipes for the
283 Unix system,
284 ``it was this concept and notation for linking several programs together
285 that transformed Unix from a basic file-sharing system to an entirely new way of computing.''
286 .[
287 %T Unix: An Oral History
288 %O .CW \s-1http://www.princeton.edu/~hos/frs122/unixhist/finalhis.htm
289 .]
290 .PP
291 Being able to specify pipelines in an easy way is,
292 however, not enough by itself.
293 It is only one half.
294 The other is the design of the programs that are used in the pipeline.
295 They need to have interfaces that allow them to be used in such a way.
297 .NH 2
298 Interface design
299 .LP
300 Unix is, first of all, simple \(en Everything is a file.
301 Files are sequences of bytes, without any special structure.
302 Programs should be filters, which read a stream of bytes from ``standard input'' (stdin)
303 and write a stream of bytes to ``standard output'' (stdout).
304 .PP
305 If the files \fIare\fP sequences of bytes,
306 and the programs \fIare\fP filters on byte streams,
307 then there is exactly one standardized data interface.
308 Thus it is possible to combine them in any desired way.
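.PP
As a minimal sketch of this idea, the following fragment defines a filter as a shell function and places it in the middle of a pipeline.
The name shout and the sample input are made up for illustration; any program that reads stdin and writes stdout would fit in the same place.

```shell
#!/bin/sh
# A hypothetical filter: it reads a byte stream on stdin and writes
# a byte stream to stdout. It takes no file names and asks no questions.
shout() {
    tr '[:lower:]' '[:upper:]'
}

# Because both neighbours speak plain byte streams as well, the filter
# can sit anywhere in a pipeline: here between a producer (printf)
# and another filter (grep).
printf 'one\ntwo\nthree\n' | shout | grep -c 'T'
```

The filter knows nothing about where its input comes from or where its output goes; that is what makes it combinable.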
309 .PP
310 Even a handful of small programs will yield a large set of combinations,
311 and thus a large set of different functions.
312 This is leverage!
313 If the programs are orthogonal to each other \(en the best case \(en
314 then the set of different functions is greatest.
315 .PP
316 Programs might also have a separate control interface,
317 besides their data interface.
318 The control interface is often called ``user interface'',
319 because it is usually designed to be used by humans.
320 The Unix Philosophy discourages assuming that the user is human.
321 Interactive use of software is slow use of software,
322 because the program waits for user input most of the time.
323 Interactive software requires the user to be in front of the computer
324 all the time.
325 Interactive software occupies the user's attention while it is running.
326 .PP
327 Now we come back to the idea of using several small programs, combined,
328 to have a more specific function.
329 If these single tools were all interactive,
330 how would the user control them?
331 It is not only a problem to control several programs at once if they run at the same time;
332 it is also very inefficient to have to control each of the single programs
333 that are intended to work as one large program.
334 Hence, the Unix Philosophy discourages programs from demanding interactive use.
335 The behavior of programs should be defined at invocation.
336 This is done by specifying arguments (``command line switches'') to the program call.
337 Gancarz discusses this topic as ``avoid captive user interfaces''.
338 .[
339 %A Mike Gancarz
340 %T The UNIX Philosophy
341 %I Digital Press
342 %D 1995
343 %P 88 ff.
344 .]
345 .PP
346 Non-interactive use is also an advantage for testing during development.
347 Testing interactive programs is much more complicated
348 than testing non-interactive programs.
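.PP
The following sketch illustrates why such testing is simple: a test consists of feeding known input and comparing the output.
The function count_nonempty is a hypothetical stand-in for the program under test; it is not taken from any real toolchest.

```shell
#!/bin/sh
# Hypothetical program under test: counts the non-empty lines on stdin.
# grep exits non-zero when nothing matches, so "|| true" keeps the
# exit status of the function clean for the empty-input case.
count_nonempty() {
    grep -c . || true
}

# The whole test: known input in, compare output with the expectation.
expected=2
actual=$(printf 'a\n\nb\n' | count_nonempty)
if [ "$actual" -eq "$expected" ]; then
    echo "ok"
else
    echo "FAIL: expected $expected, got $actual"
fi
```

No terminal, no keystrokes, and no human are involved, so such a check can itself be run from a script.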
350 .NH 2
351 The toolchest approach
352 .LP
353 A toolchest is a set of tools.
354 Instead of having one big tool for all tasks, one has many small tools,
355 each for one task.
356 Difficult tasks are solved by combining several of the small, simple tools.
357 .PP
358 The Unix toolchest \fIis\fP a set of small, (mostly) non-interactive programs
359 that are filters on byte streams.
360 They are, to a large extent, unrelated in their function.
361 Hence, the Unix toolchest provides a large set of functions
362 that can be accessed by combining the programs in the desired way.
363 .PP
364 There are also advantages for developing small toolchest programs.
365 It is easier and less error-prone to write small programs.
366 It is also easier and less error-prone to write a large set of small programs
367 than to write one large program with all the functionality included.
368 If the small programs are combinable, then they offer an even larger set
369 of functions than the single large program.
370 Hence, one gets two advantages out of writing small, combinable programs.
371 .PP
372 There are two drawbacks of the toolchest approach.
373 First, one simple, standardized, unidirectional interface has to be sufficient.
374 If one feels the need for more ``logic'' than a stream of bytes,
375 then a different approach might be needed.
376 But it is also possible that one just cannot imagine a design where
377 a stream of bytes is sufficient.
378 By becoming more familiar with the ``Unix style of thinking'',
379 developers will more often and more easily find simple designs where
380 a stream of bytes is a sufficient interface.
381 .PP
382 The second drawback of a toolchest affects the users.
383 A toolchest is often more difficult to use for novices.
384 It is necessary to become familiar with each of the tools,
385 to be able to use the right one in a given situation.
386 Additionally, one needs to combine the tools in a sensible way on one's own.
387 This is like a sharp knife \(en it is a powerful tool in the hands of a master,
388 but of little value in the hands of an unskilled person.
389 .PP
390 However, learning a single, small tool of the toolchest is easier than
391 learning a complex tool.
392 The user will have a basic understanding of a yet unknown tool
393 if the tools of the toolchest have a common style.
394 He will be able to transfer knowledge about one tool to another.
395 .PP
396 Moreover, the second drawback can be removed easily by adding wrappers
397 around the single tools.
398 Novice users do not need to learn several tools if a professional wraps
399 the single commands into a more high-level script.
400 Note that the wrapper script still calls the small tools;
401 the wrapper script is just a skin around them.
402 No complexity is added this way,
403 but new programs can be created out of existing ones with very low effort.
404 .PP
405 A wrapper script for finding the five largest entries in the current directory
406 could look like this:
407 .DS I 2n
408 .CW
409 .ps -1
410 #!/bin/sh
411 du -s * | sort -nr | sed 5q
412 .DE
413 The script itself is just a text file that calls the command line
414 a professional user would type in directly.
415 Making the program flexible in the number of entries it prints
416 is easy:
417 .DS I 2n
418 .CW
419 .ps -1
420 #!/bin/sh
421 num=5
422 [ $# -eq 1 ] && num="$1"
423 du -s * | sort -nr | sed "${num}q"
424 .DE
425 This script acts like the one before, when called without an argument.
426 But one can also specify a numerical argument to define the number of lines to print.
428 .NH 2
429 A powerful shell
430 .LP
431 It has already been said that the Unix shell makes it possible to
432 combine small programs into large ones easily.
433 A powerful shell is a great feature in other ways, too.
434 .PP
435 One example is its built-in scripting language.
436 The control statements are built into the shell.
437 The functions, however, are the normal programs that everyone can use on the system.
438 Thus, the programs are known, so learning to program in the shell is easy.
439 Using normal programs as functions in the shell programming language
440 is only possible because they are small and combinable tools in a toolchest style.
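.PP
A small sketch of this division of labor follows.
The file names and contents are made up, and mktemp is assumed to be available (it is common, but not required by POSIX).
The loop and the condition come from the shell; everything else is done by ordinary programs.

```shell
#!/bin/sh
# Set up two throwaway files to work on.
dir=$(mktemp -d)
printf 'a\nb\nc\n' > "$dir/long.txt"
printf 'a\n' > "$dir/short.txt"

# "for" and "if" are shell control statements; wc, tr, and basename
# are normal programs that play the role of library functions.
report=""
for f in "$dir"/*.txt; do
    lines=$(wc -l < "$f" | tr -d ' ')
    if [ "$lines" -gt 1 ]; then
        report="$report$(basename "$f") "
    fi
done
echo "files with more than one line: $report"
rm -r "$dir"
```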
441 .PP
442 The Unix shell encourages writing small scripts out of other programs,
443 because it is so easy to do.
444 This is a great step towards automation.
445 It is wonderful if the effort to automate a task equals the effort
446 it takes to do it the second time by hand.
447 If it is so, then the user will be happy to automate everything he does more than once.
448 .PP
449 Small programs that do one job well, standardized interfaces between them,
450 a mechanism to combine parts into larger parts, and an easy way to automate tasks:
451 together, these will inevitably produce software leverage.
452 Getting multiple times the benefit of an investment is a great offer.
453 .PP
454 The shell also encourages rapid prototyping.
455 Many well known programs started as quickly hacked shell scripts,
456 and turned into ``real'' programs, written in C, later.
457 Building a prototype first is a way to avoid the biggest problems
458 in application development.
459 Fred Brooks writes in ``No Silver Bullet'':
460 .[
461 %A Frederick P. Brooks, Jr.
462 %T No Silver Bullet: Essence and Accidents of Software Engineering
463 %B Information Processing 1986, the Proceedings of the IFIP Tenth World Computing Conference
464 %E H.-J. Kugler
465 %D 1986
466 %P 1069\(en1076
467 %I Elsevier Science B.V.
468 %C Amsterdam, The Netherlands
469 .]
470 .QP
471 The hardest single part of building a software system is deciding precisely what to build.
472 No other part of the conceptual work is so difficult as establishing the detailed
473 technical requirements, [...].
474 No other part of the work so cripples the resulting system if done wrong.
475 No other part is more difficult to rectify later.
476 .PP
477 Writing a prototype is a great method to become familiar with the requirements
478 and to actually run into real problems.
479 Today, prototyping is often seen as a first step in building software.
480 This is, of course, good.
481 However, the Unix Philosophy has an \fIadditional\fP perspective on prototyping:
482 After having built the prototype, one might notice that the prototype is already
483 \fIgood enough\fP.
484 Hence, no reimplementation in a more sophisticated programming language may be needed
485 for the moment.
486 Maybe later it will be necessary to rewrite the software, but not now.
487 .PP
488 By delaying further work, one keeps the flexibility to react easily to
489 changing requirements.
490 Software parts that are not written will not miss the requirements.
492 .NH 2
493 Worse is better
494 .LP
495 The Unix Philosophy aims for the 80% solution;
496 others call it the ``Worse is better'' approach.
497 .PP
498 First, practical experience shows that it is almost never possible to define the
499 requirements completely and correctly the first time.
500 Hence one should not try to; it will fail anyway.
501 Second, practical experience shows that requirements change over time.
502 Hence it is best to delay requirement-based design decisions as long as possible.
503 Also, the software should be small and flexible as long as possible
504 to react to changing requirements.
505 Shell scripts, for example, are more easily adjusted than C programs.
506 Third, practical experience shows that maintenance is hard work.
507 Hence, one should keep the amount of software as small as possible;
508 it should just fulfill the \fIcurrent\fP requirements.
509 Software parts that will be written later do not need maintenance now.
510 .PP
511 Starting with a prototype in a scripting language has several advantages:
512 .IP \(bu
513 As the initial effort is low, one will likely start right away.
514 .IP \(bu
515 As working parts are available soon, the real requirements can get identified soon.
516 .IP \(bu
517 When software is usable, it gets used, and thus tested.
518 Hence problems will be found at early stages of the development.
519 .IP \(bu
520 The prototype might be enough for the moment,
521 thus further work on the software can be delayed to a time
522 when one knows more about the requirements and problems
523 than now.
524 .IP \(bu
525 Implementing now only the parts that are actually needed now
526 requires less maintenance work.
527 .IP \(bu
528 If the global situation changes so that the software is not needed anymore,
529 then less effort was spent on the project than would have been
530 if a different approach had been used.
532 .NH 2
533 Upgrowth and survival of software
534 .LP
535 So far, we have talked about \fIwriting\fP or \fIbuilding\fP software.
536 Although these are just verbs, they do imply a specific view on the work process
537 they describe.
538 The better verb, however, is to \fIgrow\fP.
539 .PP
540 Creating software in the sense of the Unix Philosophy is an incremental process.
541 It starts with a first prototype, which evolves as requirements change.
542 A quickly hacked shell script might become a large, sophisticated,
543 compiled program this way.
544 Its lifetime begins with the initial prototype and ends when the software is not used anymore.
545 While being alive it will get extended, rearranged, rebuilt (from scratch).
546 Growing software matches the view that ``software is never finished. It is only released.''
547 .[
548 %O FIXME
549 %A Mike Gancarz
550 %T The UNIX Philosophy
551 %P 26
552 .]
553 .PP
554 Software can be seen as being controlled by evolutionary processes.
555 Successful software is software that is used by many for a long time.
556 This implies that the software is needed, useful, and better than alternatives.
557 Darwin talks about: ``The survival of the fittest.''
558 .[
559 %O FIXME
560 %A Charles Darwin
561 .]
562 Transferred to software: the most successful software is the fittest;
563 it is the one that survives.
564 (This may be at the level of one creature, or at the level of one species.)
565 The fitness of software is affected mainly by four properties:
566 portability of code, portability of data, range of usability, and reusability of parts.
567 .\" .IP \(bu
568 .\" portability of code
569 .\" .IP \(bu
570 .\" portability of data
571 .\" .IP \(bu
572 .\" range of usability
573 .\" .IP \(bu
574 .\" reuseability of parts
575 .PP
576 (1)
577 .I "Portability of code
578 means using high-level programming languages,
579 sticking to the standard,
580 and avoiding optimizations that introduce dependencies on specific hardware.
581 Hardware has a much shorter lifetime than software.
582 By chaining software to specific hardware,
583 the software's lifetime gets shortened to that of this hardware.
584 In contrast, software should be easy to port \(en
585 adaptation is the key to success.
586 .\" cf. practice of prog: ch08
587 .PP
588 (2)
589 .I "Portability of data
590 is best achieved by avoiding binary representations
591 to store data, because binary representations differ from machine to machine.
592 Textual representation is favored.
593 Historically, ASCII was the charset of choice.
594 In the future, UTF-8 might be the better choice, however.
595 What matters is that it is a plain text representation in a
596 very common charset encoding.
597 Apart from being able to transfer data between machines,
598 readable data has the great advantage that humans are able
599 to directly edit it with text editors and other tools from the Unix toolchest.
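.PP
To illustrate, here is a sketch that queries and edits a small plain text data file with standard tools only.
The file format (one name:quota record per line) and all values are invented for this example; mktemp is assumed to be available.

```shell
#!/bin/sh
# Hypothetical data file: one "name:quota" record per line, plain text.
data=$(mktemp)
printf 'alice:100\nbob:250\ncarol:75\n' > "$data"

# Query with standard tools: names of users whose quota exceeds 90.
big=$(awk -F: '$2 > 90 { print $1 }' "$data" | tr '\n' ' ')
echo "$big"

# Edit with standard tools: set bob's quota to 300.
# No format-specific editor or library is needed.
sed 's/^bob:.*/bob:300/' "$data" > "$data.new" && mv "$data.new" "$data"
bob=$(grep '^bob:' "$data")
echo "$bob"
rm -f "$data"
```

With a binary representation, each of these steps would need a program that understands that particular format.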
600 .\" gancarz tenet 5
601 .PP
602 (3)
603 A large
604 .I "range of usability
605 ensures good adaptation, and thus good survival.
606 It is a special distinction if software comes to be used in fields
607 that the original authors never imagined.
608 Software that solves problems in a general way will likely be used
609 for all kinds of similar problems.
610 Being too specific limits the range of uses.
611 Requirements change through time, thus use cases change or even vanish.
612 A good example of this is Allman's sendmail.
613 Allman identifies flexibility to be one major reason for sendmail's success:
614 .[
615 %O FIXME
616 %A Allman
617 %T sendmail
618 .]
619 .QP
620 Second, I limited myself to the routing function [...].
621 This was a departure from the dominant thought of the time, [...].
622 .QP
623 Third, the sendmail configuration file was flexible enough to adopt
624 to a rapidly changing world [...].
625 .LP
626 Successful software adapts itself to the changing world.
627 .PP
628 (4)
629 .I "Reuse of parts
630 is even one step further.
631 A software may completely lose its field of action,
632 but the parts of which the software is built may be general and independent enough
633 to survive this death.
634 If software is built by combining small independent programs,
635 then there are parts readily available for reuse.
636 Who cares if the large program is a failure,
637 but parts of it become successful instead?
639 .NH 2
640 Summary
641 .LP
642 This chapter explained the central ideas of the Unix Philosophy.
643 For each of the ideas, the advantages it introduces were shown.
644 The Unix Philosophy is a set of guidelines that help to write valuable software.
645 From the viewpoint of a software developer or software designer,
646 the Unix Philosophy provides answers to many software design problems.
647 .PP
648 The various ideas of the Unix Philosophy are closely interwoven
649 and can hardly be applied independently.
650 However, probably the most important messages are:
651 .I "``Do one thing well!''" ,
652 .I "``Keep it simple!''" ,
653 and
654 .I "``Use software leverage!''
658 .NH 1
659 Case study: \s-1MH\s0
660 .LP
661 The previous chapter introduced and explained the Unix Philosophy
662 from a general point of view.
663 The guidelines were the driving force; references to
664 existing software were given only sparsely.
665 In this and the next chapter, concrete software will be
666 the driving force in the discussion.
667 .PP
668 This first case study is about the mail user agent (\s-1MUA\s0)
669 \s-1MH\s0 (``mail handler'') and its descendant \fInmh\fP
670 (``new mail handler'').
671 \s-1MUA\s0s provide functions to read, compose, and organize mail,
672 but (ideally) not to transfer.
673 In this document, the name \s-1MH\s0 will be used for both of them.
674 A distinction will only be made if differences between
675 them are described.
678 .NH 2
679 Historical background
680 .LP
681 Electronic mail was available in Unix very early.
682 The first \s-1MUA\s0 on Unix was \f(CWmail\fP,
683 which was already present in the First Edition.
684 .[
685 %A Peter H. Salus
686 %T A Quarter Century of UNIX
687 %D 1994
688 %I Addison-Wesley
689 %P 41 f.
690 .]
691 It was a small program that either printed the user's mailbox file
692 or appended text to someone else's mailbox file,
693 depending on the command line arguments.
694 .[
695 %O http://cm.bell-labs.com/cm/cs/who/dmr/pdfs/man12.pdf
696 .]
697 It was a program that did one job well.
698 This job was emailing, which was very simple then.
699 .PP
700 Later, emailing became more powerful, and thus more complex.
701 The simple \f(CWmail\fP, which knew nothing of subjects,
702 independent handling of single messages,
703 and long-time storage of them, was not powerful enough anymore.
704 At Berkeley, Kurt Shoens wrote \fIMail\fP (with capital `M')
705 in 1978 to provide additional functions for emailing.
706 Mail was still one program, but now it was large and did
707 several jobs.
708 Its user interface is modeled after the one of \fIed\fP.
709 It is designed for humans, but is still scriptable.
710 \fImailx\fP is the adaptation of Berkeley Mail into System V.
711 .[
712 %A Gunnar Ritter
713 %O http://heirloom.sourceforge.net/mailx_history.html
714 .]
715 Elm, pine, mutt, and a whole bunch of graphical \s-1MUA\s0s
716 followed Mail's direction.
717 They are large, monolithic programs which include all emailing functions.
718 .PP
719 A different way was taken by the people of \s-1RAND\s0 Corporation.
720 In the beginning, they also had used a monolithic mail system,
721 called \s-1MS\s0 (for ``mail system'').
722 But in 1977, Stockton Gaines and Norman Shapiro
723 came up with a proposal of a new email system concept \(en
724 one that honors the Unix Philosophy.
725 The concept was implemented by Bruce Borden in 1978 and 1979.
726 This was the birth of \s-1MH\s0 \(en the ``mail handler''.
727 .PP
728 Since then, \s-1RAND\s0, the University of California at Irvine and
729 at Berkeley, and several others have contributed to the software.
730 However, its core concepts remained the same.
731 In the late 90s, when development of \s-1MH\s0 slowed down,
732 Richard Coleman started \fInmh\fP, the new mail handler.
733 His goal was to improve \s-1MH\s0, especially with regard to
734 the requirements of modern emailing.
735 Today, nmh is developed by various people on the Internet.
736 .[
737 %T RAND and the Information Evolution: A History in Essays and Vignettes
738 %A Willis H. Ware
739 %D 2008
740 %I The RAND Corporation
741 %P 128\(en137
742 %O .CW \s-1http://www.rand.org/pubs/corporate_pubs/CP537/
743 .]
744 .[
745 %T MH & xmh: Email for Users & Programmers
746 %A Jerry Peek
747 %D 1991, 1992, 1995
748 %I O'Reilly & Associates, Inc.
749 %P Appendix B
750 %O Also available online: \f(CW\s-2http://rand-mh.sourceforge.net/book/\fP
751 .]
753 .NH 2
754 Contrasts to monolithic mail systems
755 .LP
756 All \s-1MUA\s0s are monolithic, except \s-1MH\s0.
757 Although there might actually exist further, little-known
758 toolchest \s-1MUA\s0s, this statement reflects the situation pretty well.
759 .PP
760 Monolithic \s-1MUA\s0s gather all their functions in one program.
761 In contrast, \s-1MH\s0 is a toolchest of many small tools \(en one for each job.
762 Following is a list of important programs of \s-1MH\s0's toolchest
763 and their function.
764 It gives a feeling of what the toolchest looks like.
765 .IP \(bu
766 .CW inc :
767 incorporate new mail (this is how mail enters the system)
768 .IP \(bu
769 .CW scan :
770 list messages in folder
771 .IP \(bu
772 .CW show :
773 show message
774 .IP \(bu
775 .CW next\fR/\fPprev :
776 show next/previous message
777 .IP \(bu
778 .CW folder :
779 change current folder
780 .IP \(bu
781 .CW refile :
782 refile message into folder
783 .IP \(bu
784 .CW rmm :
785 remove message
786 .IP \(bu
787 .CW comp :
788 compose a new message
789 .IP \(bu
790 .CW repl :
791 reply to a message
792 .IP \(bu
793 .CW forw :
794 forward a message
795 .IP \(bu
796 .CW send :
797 send a prepared message (this is how mail leaves the system)
798 .LP
799 \s-1MH\s0 has no special user interface like monolithic \s-1MUA\s0s have.
800 The user does not leave the shell to run \s-1MH\s0,
801 but he uses the various \s-1MH\s0 programs within the shell.
802 Using a monolithic program with a captive user interface
803 means ``entering'' the program, using it, and ``exiting'' the program.
804 Using toolchests like \s-1MH\s0 means running programs,
805 alone or in combination with others, even from other toolchests,
806 without leaving the shell.
808 .NH 2
809 Data storage
810 .LP
811 \s-1MH\s0's mail storage is a directory tree under the user's
812 \s-1MH\s0 directory (usually \f(CW$HOME/Mail\fP),
813 where mail folders are directories and mail messages are text files
814 within them.
815 Each mail folder contains a file \f(CW.mh_sequences\fP which lists
816 the public message sequences of that folder, for instance new messages.
818 The message files contain the messages as they were received.
819 They are numbered in ascending order in each folder.
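.PP
As a minimal sketch of this layout, the following shell commands build a
throwaway mailbox by hand in a temporary directory instead of a real
\f(CW$HOME/Mail\fP.
The folder name \f(CWinbox\fP, the message contents, and the sequence name
\f(CWunseen\fP are assumptions made up for illustration;
real mailboxes are created and maintained by the \s-1MH\s0 tools.

```shell
# Build a throwaway MH-style mailbox by hand (illustration only).
mh=$(mktemp -d)                 # stands in for $HOME/Mail
mkdir "$mh/inbox"               # a mail folder is a directory

# Each message is a plain text file named by its number.
printf 'From: alice@example.org\nSubject: hello\n\nFirst message.\n' >"$mh/inbox/1"
printf 'From: bob@example.org\nSubject: lunch\n\nSecond message.\n'  >"$mh/inbox/2"

# The public sequences of the folder are kept in a plain text file, too.
printf 'unseen: 1-2\n' >"$mh/inbox/.mh_sequences"

# Ordinary Unix tools can now inspect the mailbox.
ls "$mh/inbox"                  # lists the message files, not the dot file
cat "$mh/inbox/.mh_sequences"
```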
820 .PP
821 This mailbox format is called ``\s-1MH\s0'' after the \s-1MUA\s0.
822 Alternatives are \fImbox\fP and \fImaildir\fP.
823 In the mbox format all messages are stored within one file.
824 This was a good solution in the early days, when messages
825 were only a few lines of text and were deleted soon.
826 Today, when single messages often include several megabytes
827 of attachments, it is a bad solution.
828 Another disadvantage of the mbox format is that it is
829 more difficult to write tools that work on mail messages,
830 because it is always necessary to first find and extract
831 the relevant message in the mbox file.
832 With the \s-1MH\s0 mailbox format,
833 each message is a self-contained item, by definition.
834 Also, the problem of concurrent access to one mailbox is
835 reduced to the problem of concurrent access to one message.
836 Maildir is generally similar to \s-1MH\s0's format,
837 but modified to guarantee reliability.
838 This involves some complexity, unfortunately.
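.PP
The difference in effort can be sketched with a few lines of shell.
The mbox side uses a deliberately simplified splitter that treats every line
starting with ``From '' as a message separator; this is an assumption
for the sake of brevity, as real mbox files need more careful parsing.
All file names and message contents are made up.

```shell
work=$(mktemp -d)

# mbox: all messages live in one file, separated by "From " lines.
cat >"$work/mbox" <<'EOF'
From alice@example.org Thu Mar 25 10:00:00 2010
Subject: first

Body one.
From bob@example.org Thu Mar 25 11:00:00 2010
Subject: second

Body two.
EOF

# To work on the second message, it must first be located and cut out.
# Simplified: every line starting with "From " begins a new message.
awk -v want=2 '/^From / { n++; next } n == want' "$work/mbox" >"$work/from-mbox"

# MH format: the second message simply is the file named 2.
mkdir "$work/inbox"
printf 'Subject: second\n\nBody two.\n' >"$work/inbox/2"

cat "$work/from-mbox"
cat "$work/inbox/2"
```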
839 .PP
840 Working with \s-1MH\s0's toolchest on mailboxes is much like
841 working with Unix' toolchest on directory trees:
842 \f(CWscan\fP is like \f(CWls\fP,
843 \f(CWshow\fP is like \f(CWcat\fP,
844 \f(CWfolder\fP is like \f(CWcd\fP and \f(CWpwd\fP,
845 \f(CWrefile\fP is like \f(CWmv\fP,
846 and \f(CWrmm\fP is like \f(CWrm\fP.
847 .PP
848 The context of tools in Unix consists mainly of the current working directory,
849 the user identification, and the environment variables.
850 \s-1MH\s0 extends this context by two more items:
851 .IP \(bu
852 The current mail folder, which is similar to the current working directory.
853 For mail folders, \f(CWfolder\fP provides the corresponding functionality
854 of \f(CWcd\fP and \f(CWpwd\fP for directories.
855 .IP \(bu
856 Sequences, which are named sets of messages in a mail folder.
857 The current message, relative to a mail folder, is a special sequence.
858 It enables commands like \f(CWnext\fP and \f(CWprev\fP.
859 .LP
860 In contrast to Unix' context, which is chained to the shell session,
861 \s-1MH\s0's context is independent.
862 Usually there is one context for each user, but a user can have many
863 contexts.
864 Public sequences are an exception, as they belong to the mail folder.
865 .[
866 %O mh-profile(5) and mh-sequence(5)
867 .]
869 .NH 2
870 Discussion of the design
871 .LP
872 The following paragraphs discuss \s-1MH\s0 in regard to the tenets
873 of the Unix Philosophy which Gancarz identified.
875 .PP
876 .B "Small is beautiful
877 and
878 .B "do one thing well
879 are two design goals that are directly visible in \s-1MH\s0.
880 Gancarz actually presents \s-1MH\s0 as an example under the headline
881 ``Making UNIX Do One Thing Well'':
882 .QP
883 [\s-1MH\s0] consists of a series of programs which
884 when combined give the user an enormous ability
885 to manipulate electronic mail messages.
886 A complex application, it shows that not only is it
887 possible to build large applications from smaller
888 components, but also that such designs are actually preferable.
889 .[
890 %A Mike Gancarz
891 %T unix-phil
892 %P 125
893 .]
894 .LP
895 The various small programs of \s-1MH\s0 were relatively easy
896 to write, because each of them is small, limited to one function,
897 and has clear boundaries.
898 For the same reasons, they are also easy to maintain.
899 Furthermore, the system can easily be extended.
900 One only needs to put a new program into the toolchest.
901 This was done, for instance, when \s-1MIME\s0 support was added
902 (e.g. \f(CWmhbuild\fP).
903 Also, different programs can exist to do basically the same job
904 in different ways (e.g. in nmh: \f(CWshow\fP and \f(CWmhshow\fP).
905 If someone needs a mail system with some additional
906 functions that are not available anywhere yet, he is best served by a
907 toolchest system like \s-1MH\s0, where he can add the
908 functionality with little work.
910 .PP
911 .B "Store data in flat text files
912 is followed by \s-1MH\s0.
913 This is not surprising, because email messages are already plain text.
914 \s-1MH\s0 stores the messages as it receives them,
915 thus any other tool that works on RFC 2822 mail messages can operate
916 on the messages in an \s-1MH\s0 mailbox.
917 All other files \s-1MH\s0 uses are plain text too.
918 It is therefore possible and encouraged to use the text processing
919 tools of Unix' toolchest to extend \s-1MH\s0's toolchest.
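.PP
A minimal sketch of this idea, using only standard Unix tools on a
hand-made mailbox in a temporary directory.
The folder name, the addresses, and the message contents are assumptions
for illustration, and the header matching is simplified:
it would also match such lines inside a message body.

```shell
mh=$(mktemp -d)
mkdir "$mh/inbox"
printf 'From: alice@example.org\nSubject: hello\n\nHi.\n'      >"$mh/inbox/1"
printf 'From: bob@example.org\nSubject: lunch\n\nNoon?\n'      >"$mh/inbox/2"
printf 'From: alice@example.org\nSubject: re: lunch\n\nYes.\n' >"$mh/inbox/3"

# Which messages are from alice?  grep -l prints the matching file names,
# which are the message numbers when run inside the folder.
from_alice=$(cd "$mh/inbox" && grep -l '^From: alice@' [0-9]* | sort -n | paste -sd ' ' -)
echo "$from_alice"

# How many messages does each sender have in this folder?
per_sender=$(cat "$mh/inbox"/[0-9]* | grep '^From: ' | sort | uniq -c)
echo "$per_sender"
```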
921 .PP
922 .B "Avoid captive user interfaces" .
923 \s-1MH\s0 is perfectly suited for non-interactive use.
924 It offers all functions directly and without captive user interfaces.
925 If, nonetheless, users want a graphical user interface,
926 they can have it with \fIxmh\fP or \fIexmh\fP, too.
927 These are graphical frontends for the \s-1MH\s0 toolchest.
928 This means that all email-related work is still done by \s-1MH\s0 tools,
929 but the frontend issues the appropriate calls when the user
930 clicks on buttons.
931 Providing easy-to-use user interfaces in the form of frontends is a good
932 approach, because it does not limit the power of the backend itself.
933 In any case, the frontend will only be able to make a subset of the
934 backend's power and flexibility available to the user.
935 But if it is a separate program,
936 then the missing parts can still be accessed at the backend directly.
937 If it is integrated, then this will hardly be possible.
938 Furthermore, it is possible to have different frontends to the same
939 backend.
941 .PP
942 .B "Choose portability over efficiency
943 and
944 .B "use shell scripts to increase leverage and portability" .
945 These two tenets are indirectly, but nicely, demonstrated by
946 Bolsky and Korn in their book about the Korn Shell.
947 .[
948 %T The KornShell: command and programming language
949 %A Morris I. Bolsky
950 %A David G. Korn
951 %I Prentice Hall
952 %D 1989
953 %P 254\(en290
954 %O \s-1ISBN\s0: 0-13-516972-0
955 .]
956 They demonstrated, in chapter 18 of the book, a basic implementation
957 of a subset of \s-1MH\s0 in ksh scripts.
958 Of course, this was just a demonstration, but a brilliant one.
959 It shows how quickly one can implement such a prototype with shell scripts,
960 and how readable they are.
961 The implementation in the scripting language may not be very fast,
962 but it can be fast enough, and this is all that matters.
963 By having the code in an interpreted language, like the shell,
964 portability becomes a minor issue, if we assume the interpreter
965 to be widespread.
966 This demonstration also shows how easy it is to create the individual programs
967 of a toolchest.
968 There are eight tools (two of them have multiple names) and 16 functions
969 with supporting code.
970 Each tool comprises between 12 and 38 lines of ksh,
971 in total about 200 lines.
972 The functions comprise between 3 and 78 lines of ksh,
973 in total about 450 lines.
974 Such small software is easy to write, easy to understand,
975 and thus easy to maintain.
976 A toolchest makes it easier to write only some parts
977 and still obtain a working result.
978 Expanding the toolchest without global changes will likely be
979 possible, too.
981 .PP
982 .B "Use software leverage to your advantage
983 and the lesser tenet
984 .B "allow the user to tailor the environment
985 are ideally followed in the design of \s-1MH\s0.
986 Tailoring the environment is heavily encouraged by the ability to
987 directly define default options to programs.
988 It is even possible to define different default options
989 depending on the name under which the program was called.
990 Software leverage is heavily encouraged by how easy it is to
991 create shell scripts that run a specific command line,
992 built of several \s-1MH\s0 programs.
993 Little software encourages users as strongly as \s-1MH\s0 does to tailor their
994 environment and to leverage the use of the software.
995 To give just one example:
996 One might prefer a different listing format for the \f(CWscan\fP
997 program.
998 It is possible to take one of the distributed format files
999 or to write one yourself.
1000 To use the format as default for \f(CWscan\fP, a single line,
1001 reading
1002 .DS
1003 .CW
1004 scan: -form FORMATFILE
1005 .DE
1006 must be added to \f(CW.mh_profile\fP.
1007 If one wants this different format as an additional command,
1008 instead of changing the default, he needs to create a link to
1009 \f(CWscan\fP, for instance titled \f(CWscan2\fP.
1010 The line in \f(CW.mh_profile\fP would then start with \f(CWscan2\fP,
1011 as the option should only be in effect when scan is called as
1012 \f(CWscan2\fP.
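.PP
The mechanism behind name-dependent defaults can be sketched with a small
stand-in.
The program \f(CWlister\fP, its options, and the profile file below are
hypothetical and only imitate the idea;
this is not how \s-1MH\s0 itself is implemented.

```shell
dir=$(mktemp -d)

# A stand-in profile: one "name: default options" line per program name.
cat >"$dir/profile" <<'EOF'
lister: -short
lister2: -long
EOF

# A stand-in tool that looks up its defaults under the name it was called by.
cat >"$dir/lister" <<'EOF'
#!/bin/sh
name=$(basename "$0")
profile=$(dirname "$0")/profile
defaults=$(sed -n "s/^$name: //p" "$profile")
echo "$name runs with defaults: $defaults"
EOF
chmod +x "$dir/lister"

# A second name for the same program selects different defaults.
ln -s lister "$dir/lister2"

"$dir/lister"
"$dir/lister2"
```

Both names run the same code; only the profile entry that matches the
invocation name differs.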
1014 .PP
1015 .B "Make every program a filter
1016 is hard to find in \s-1MH\s0.
1017 The reason is that most of \s-1MH\s0's tools provide
1018 basic file system operations for the mailboxes.
1019 It is the same reason why
1020 \f(CWls\fP, \f(CWcp\fP, \f(CWmv\fP, and \f(CWrm\fP
1021 are not filters either.
1022 However, they build a basis on which filters can operate.
1023 \s-1MH\s0 does not provide many filters itself, but it is a basis
1024 to write filters for.
1025 An example would be a mail message text highlighter,
1026 that is, a program that makes use of a color terminal to display
1027 header lines, quotations, and signatures in distinct colors.
1028 The author's version of this program, for instance,
1029 is a 25-line awk script.
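.PP
What such a filter might look like is sketched below.
This is not the author's script but a simplified illustration under
stated assumptions: the header ends at the first empty line,
quotations are lines starting with ``>'',
the signature starts at a line consisting of ``-- '',
and the colors are arbitrary \s-1ANSI\s0 codes.

```shell
# highlight: color the parts of a mail message read from stdin.
highlight() {
    awk '
    BEGIN             { in_header = 1; esc = sprintf("%c", 27) }
    in_header && /^$/ { in_header = 0 }      # empty line ends the header
    /^-- $/           { in_sig = 1 }         # "-- " starts the signature
    {
        if (in_header)   color = "34"        # header lines: blue
        else if (in_sig) color = "33"        # signature: yellow
        else if (/^>/)   color = "32"        # quotations: green
        else             color = ""
        if (color != "" && $0 != "")
            printf "%s[%sm%s%s[0m\n", esc, color, $0, esc
        else
            print
    }'
}

printf 'From: alice@example.org\nSubject: hi\n\n> old text\nnew text\n-- \nAlice\n' | highlight
```

Because it reads standard input and writes standard output,
such a filter can sit behind any tool that prints a message.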
1031 .PP
1032 .B "Build a prototype as soon as possible
1033 was again well followed by \s-1MH\s0.
1034 This tenet, of course, focuses on early development, which was
1035 a long time ago for \s-1MH\s0.
1036 But without following this guideline at the very beginning,
1037 Bruce Borden might not have convinced the management of \s-1RAND\s0
1038 to ever create \s-1MH\s0.
1039 In Bruce's own words:
1040 .QP
1041 [...] but they [Stockton Gaines and Norm Shapiro] were not able
1042 to convince anyone that such a system would be fast enough to be usable.
1043 I proposed a very short project to prove the basic concepts,
1044 and my management agreed.
1045 Looking back, I realize that I had been very lucky with my first design.
1046 Without nearly enough design work,
1047 I built a working environment and some header files
1048 with key structures and wrote the first few \s-1MH\s0 commands:
1049 inc, show/next/prev, and comp.
1050 [...]
1051 With these three, I was able to convince people that the structure was viable.
1052 This took about three weeks.
1053 .[
1054 %O FIXME
1055 .]
1057 .NH 2
1058 Problems
1059 .LP
1060 \s-1MH\s0 is certainly not without problems.
1061 There are two main problems: one is technical, the other is about human behavior.
1062 .PP
1063 \s-1MH\s0 is old and email today is very different from email at the time
1064 when \s-1MH\s0 was designed.
1065 \s-1MH\s0 adapted to the changes pretty well, but it is limited,
1066 for example in development resources.
1067 \s-1MIME\s0 support and support for different character encodings
1068 are available, but only on a moderate level.
1069 More active developers could quickly improve this.
1070 It is also limited by design, which is the larger problem.
1071 \s-1IMAP\s0, for example, conflicts with \s-1MH\s0's design to a large extent.
1072 These design conflicts are not easily solvable.
1073 Possibly, they require a redesign.
1074 Maybe \s-1IMAP\s0 is too different from the classic mail model which \s-1MH\s0 covers,
1075 hence \s-1MH\s0 may never work well with \s-1IMAP\s0.
1076 .PP
1077 The other kind of problem is human habits.
1078 When in this world almost all \s-1MUA\s0s are monolithic,
1079 it is very difficult to convince people to use a toolbox style \s-1MUA\s0
1080 like \s-1MH\s0.
1081 The habits are so strong that even people who understand the concept
1082 and advantages of \s-1MH\s0 do not like to switch,
1083 simply because \s-1MH\s0 is different.
1084 Unfortunately, the frontends to \s-1MH\s0, which could provide familiar look'n'feel,
1085 are quite outdated and thus not very appealing compared to the modern interfaces
1086 which monolithic \s-1MUA\s0s offer.
1088 .NH 2
1089 Summary \s-1MH\s0
1090 .LP
1091 \s-1MH\s0 is an \s-1MUA\s0 that follows the Unix Philosophy in its design
1092 and implementation.
1093 It consists of a toolchest of small tools, each of which does one job well.
1094 The tools are orthogonal to each other, to a large extent.
1095 However, for historical reasons, there also exist distinct tools
1096 that cover the same task.
1097 .PP
1098 The toolchest approach offers great flexibility to the user.
1099 He can use the complete power of the Unix shell with \s-1MH\s0.
1100 This makes \s-1MH\s0 a very powerful mail system.
1101 Extending and customizing \s-1MH\s0 is easy and encouraged, too.
1102 .PP
1103 Apart from the user's perspective, \s-1MH\s0 is development-friendly.
1104 Its overall design follows clear rules.
1105 The single tools do only one job, thus they are easy to understand,
1106 easy to write, and easy to maintain.
1107 They are all independent and do not interfere with the others.
1108 Automated testing of their function is a straightforward task.
1109 .PP
1110 It is sad that \s-1MH\s0's differentness is its largest problem,
1111 as its differentness is also its largest advantage.
1112 Unfortunately, for most people their habits are stronger
1113 than the attraction of the clear design and the power \s-1MH\s0 offers.
1117 .NH 1
1118 Case study: uzbl
1119 .LP
1120 The last chapter took a look at the \s-1MUA\s0 \s-1MH\s0;
1121 this chapter is about uzbl, a web browser that adheres to the Unix Philosophy.
1122 ``uzbl'' is the \fIlolcat\fP's word for the English adjective ``usable''.
1123 It is pronounced identically.
1125 .NH 2
1126 Historical background
1127 .LP
1128 Uzbl was started by Dieter Plaetinck in April 2009.
1129 The idea was born in a thread in the Arch Linux forum.
1130 .[
1131 %O http://bbs.archlinux.org/viewtopic.php?id=67463
1132 .]
1133 After some discussion about failures of well known web browsers,
1134 Plaetinck (alias Dieter@be) came up with a very sketchy proposal
1135 of what a better web browser could look like.
1136 When another member asked whether Plaetinck would write that program,
1137 because it sounded fantastic, Plaetinck replied:
1138 ``Maybe, if I find the time ;-)''.
1139 .PP
1140 Fortunately, he found the time.
1141 One day later, the first prototype was out.
1142 One week later, uzbl had its own website.
1143 One month after the first code showed up,
1144 a mailing list was set up to coordinate and discuss further development.
1145 A wiki was set up to store documentation and scripts that showed up on the
1146 mailing list and elsewhere.
1147 .PP
1148 In the one year of uzbl's existence so far, it has been heavily developed in various branches.
1149 Plaetinck's task increasingly became merging the best code from the
1150 different branches into his main branch, and applying patches.
1151 About once a month, Plaetinck released a new version.
1152 In September 2009, he presented several forks of uzbl.
1153 Uzbl actually opened the field for a whole family of web browsers of similar shape.
1154 .PP
1155 In July 2009, \fILinux Weekly News\fP published an interview with Plaetinck about uzbl.
1156 In September 2009, the uzbl web browser was on \fISlashdot\fP.
1158 .NH 2
1159 Contrasts to other web browsers
1160 .LP
1161 Just as most \s-1MUA\s0s are monolithic but \s-1MH\s0 is a toolchest,
1162 most web browsers are monolithic but uzbl is a frontend to a toolchest.
1163 .PP
1164 Today, uzbl is divided into uzbl-core and uzbl-browser.
1165 Uzbl-core is, as its name indicates, the core of uzbl.
1166 It handles commands and events to interface with other programs,
1167 and also displays webpages by using webkit as its rendering engine.
1168 Uzbl-browser combines uzbl-core with a bunch of handler scripts, a status bar,
1169 an event manager, yanking, pasting, page searching, zooming, and more,
1170 to form a ``complete'' web browser.
1171 In the following text, the term ``uzbl'' usually stands for uzbl-browser,
1172 so uzbl-core is included.
1173 .PP
1174 Unlike most other web browsers, uzbl is mainly the mediator between the
1175 various tools that cover single jobs of web browsing.
1176 Uzbl listens for commands on a named pipe (fifo), a Unix socket, and on stdin.
1177 It writes events to a Unix socket and to stdout.
1178 Loading a webpage in a running uzbl instance requires no more than:
1179 .DS
1180 .CW
1181 echo 'uri http://example.org' >/path/to/uzbl-fifo
1182 .DE
1183 The graphical rendering of the webpage is done by webkit,
1184 which is a library that takes care of the whole rendering task.
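.PP
How control through a named pipe works can be sketched without uzbl itself.
In the following, a background shell process stands in for the browser;
the fifo path and the log file are assumptions made up for illustration,
and only the command line \f(CWuri http://example.org\fP is taken from above.

```shell
dir=$(mktemp -d)
fifo="$dir/demo-fifo"           # stands in for /path/to/uzbl-fifo
mkfifo "$fifo"

# Stand-in for the browser: read one command line from the fifo and log it.
( read -r cmd arg <"$fifo"
  echo "received command '$cmd' with argument '$arg'" >"$dir/log" ) &

# Any program that can write a line of text can now control the reader.
echo 'uri http://example.org' >"$fifo"

wait                            # let the background reader finish
cat "$dir/log"
```

The writer needs no library and no knowledge of the reader's internals,
only the path of the fifo and the command syntax.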
1185 .PP
1186 Downloads, browsing history, bookmarks, and the like are not provided
1187 by uzbl-core itself, as they are in other web browsers.
1188 Uzbl-browser only provides so-called handler scripts that wrap
1189 external applications which provide such functions.
1190 For instance, \fIwget\fP is used to download files and uzbl-browser
1191 includes a script that calls wget with appropriate options in
1192 a prepared environment.
1193 .PP
1194 Modern web browsers, instead, are proud to have addons, plugins, and modules.
1195 This is their effort to achieve similar goals.
1196 But instead of using existing, external programs, the functions are
1197 integrated into the web browser, just not compiled into it.
1199 .NH 2
1200 Discussion of the design
1201 .LP
1202 This section discusses uzbl with regard to the Unix Philosophy,
1203 as identified by Gancarz.
1205 .PP
1206 .B "Small is beautiful
1207 and
1208 .B "make each program do one thing well" .
1210 .PP
1211 .B "Build a prototype as soon as possible" .
1213 .PP
1214 .B "Use software leverage to your advantage
1215 and
1216 .B "Use shell scripts to increase leverage and portability" .
1218 .PP
1219 .B "Avoid captive user interfaces" .
1221 .PP
1222 .B "Make every program a filter" .
1225 .NH 2
1226 Problems
1227 .LP
1228 broken web
1231 .NH 2
1232 Summary uzbl
1233 .LP
1237 .NH 1
1238 Final thoughts
1240 .NH 2
1241 Quick summary
1242 .LP
1243 good design
1244 .LP
1245 unix phil
1246 .LP
1247 case studies
1249 .NH 2
1250 Why people should choose
1251 .LP
1252 Make the right choice!
1254 .nr PI .5i
1255 .rm ]<
1256 .de ]<
1257 .LP
1258 .de FP
1259 .IP \\\\$1.
1260 \\..
1261 .rm FS FE
1262 ..
1263 .SH
1264 References
1265 .[
1266 $LIST$
1267 .]
1268 .wh -1p