meillo@2: .\".if n .pl 1000i meillo@0: .de XX meillo@0: .pl 1v meillo@0: .. meillo@0: .em XX meillo@1: .\".nr PI 0 meillo@1: .\".if t .nr PD .5v meillo@1: .\".if n .nr PD 1v meillo@0: .nr lu 0 meillo@0: .de CW meillo@0: .nr PQ \\n(.f meillo@0: .if t .ft CW meillo@17: .ie ^\\$1^^ .if n .ul 999 meillo@0: .el .if n .ul 1 meillo@17: .if t .if !^\\$1^^ \&\\$1\f\\n(PQ\\$2 meillo@0: .if n .if \\n(.$=1 \&\\$1 meillo@0: .if n .if \\n(.$>1 \&\\$1\c meillo@0: .if n .if \\n(.$>1 \&\\$2 meillo@0: .. meillo@0: .ds [. \ [ meillo@0: .ds .] ] meillo@1: .\"---------------------------------------- meillo@0: .TL meillo@6: Why the Unix Philosophy still matters meillo@0: .AU meillo@0: markus schnalke meillo@0: .AB meillo@1: .ti \n(.iu meillo@2: This paper discusses the importance of the Unix Philosophy in software design. meillo@0: Today, few software designers are aware of these concepts, meillo@3: and thus most modern software is limited and does not make use of software leverage. meillo@0: Knowing and following the tenets of the Unix Philosophy makes software more valuable. meillo@0: .AE meillo@0: meillo@10: .\".if t .2C meillo@2: meillo@2: .FS meillo@2: .ps -1 meillo@2: This paper was prepared for the seminar ``Software Analysis'' at University Ulm. meillo@2: Mentor was professor Schweiggert. 2010-02-05 meillo@2: .br meillo@2: You may get this document from my website meillo@2: .CW \s-1http://marmaro.de/docs meillo@2: .FE meillo@2: meillo@0: .NH 1 meillo@0: Introduction meillo@0: .LP meillo@0: Building software is a process that leads from an idea of the software's purpose meillo@3: to its release. meillo@0: No matter \fIhow\fP the process is run, two things are common: meillo@0: the initial idea and the release. meillo@9: The process in between can be of any shape. meillo@9: The maintenance work after the release is ignored for the moment.
meillo@1: .PP meillo@0: The process of building splits mainly into two parts: meillo@0: the planning of what and how to build, and implementing the plan by writing code. meillo@3: This paper focuses on the planning part \(en the design of the software. meillo@3: .PP meillo@3: Software design is the plan of how the internals and externals of the software should look, meillo@3: based on the requirements. meillo@9: This paper discusses the recommendations of the Unix Philosophy about software design. meillo@3: .PP meillo@3: The ideas discussed here can be applied in any development process. meillo@9: The Unix Philosophy does recommend how the software development process should look, meillo@3: but this is not of concern here. meillo@0: Similarly, the question of how to write the code is out of scope. meillo@1: .PP meillo@3: The name ``Unix Philosophy'' has already been mentioned several times, but it has not been explained yet. meillo@1: The Unix Philosophy is the essence of how the Unix operating system and its toolchest were designed. meillo@3: It is not a fixed set of rules, but what people see as common to typical Unix software. meillo@1: Several people have stated their view on the Unix Philosophy. meillo@1: Best known are: meillo@1: .IP \(bu meillo@1: Doug McIlroy's summary: ``Write programs that do one thing and do it well.'' meillo@1: .[ meillo@1: %A M. D. McIlroy meillo@1: %A E. N. Pinson meillo@1: %A B. A. Tague meillo@1: %T UNIX Time-Sharing System: Foreword meillo@1: %J The Bell System Technical Journal meillo@1: %D 1978 meillo@1: %V 57 meillo@1: %N 6 meillo@1: %P 1902 meillo@1: .] meillo@1: .IP \(bu meillo@1: Mike Gancarz' book ``The UNIX Philosophy''. meillo@1: .[ meillo@1: %A Mike Gancarz meillo@1: %T The UNIX Philosophy meillo@1: %D 1995 meillo@1: %I Digital Press meillo@1: .] meillo@1: .IP \(bu meillo@1: Eric S. Raymond's book ``The Art of UNIX Programming''. meillo@1: .[ meillo@1: %A Eric S.
Raymond meillo@1: %T The Art of UNIX Programming meillo@1: %D 2003 meillo@1: %I Addison-Wesley meillo@2: %O .CW \s-1http://www.faqs.org/docs/artu/ meillo@1: .] meillo@0: .LP meillo@1: These different views on the Unix Philosophy have much in common. meillo@3: In particular, the main concepts are similar in all of them. meillo@1: But there are also points on which they differ. meillo@1: This only underlines what the Unix Philosophy is: meillo@1: A retrospective view on the main concepts of Unix software; meillo@9: especially those that were successful and unique to Unix. meillo@6: .\" really? meillo@1: .PP meillo@1: Before we look at concrete concepts, meillo@1: we discuss why software design is important meillo@1: and what problems bad design introduces. meillo@0: meillo@0: meillo@0: .NH 1 meillo@6: Importance of software design in general meillo@0: .LP meillo@2: Why should we design software at all? meillo@6: It is general knowledge that even a bad plan is better than no plan. meillo@6: Ignoring software design is programming without a plan. meillo@6: This will almost certainly lead to horrible results. meillo@2: .PP meillo@6: The design of software is its internal and external shape. meillo@6: The design talked about here has nothing to do with visual appearance. meillo@6: If we see a program as a car, then its color does not matter. meillo@6: Its design would be the car's size, its shape, the number and position of doors, meillo@6: the ratio of passenger and cargo transport, and so forth. meillo@2: .PP meillo@6: A software's design is about quality properties. meillo@6: Each of the cars may be able to drive from A to B, meillo@6: but it depends on its properties whether it is a good car for passenger transport or not. meillo@6: It also depends on its properties whether it is a good choice for a rough mountain area. meillo@2: .PP meillo@6: Requirements for software are twofold: functional and non-functional.
meillo@6: Functional requirements are easier to define and to verify. meillo@6: They directly describe the software's functions. meillo@6: Functional requirements are the reason why software gets written. meillo@6: Someone has a problem and needs a tool to solve it. meillo@6: Being able to solve the problem is the main functional requirement. meillo@6: It is the driving force behind all programming effort. meillo@2: .PP meillo@6: On the other hand, there are also non-functional requirements. meillo@6: They are called \fIquality\fP requirements, too. meillo@6: The quality of software is about properties that are not directly related to meillo@6: the software's basic functions. meillo@6: Quality aspects are about the properties that are overlooked at first sight. meillo@2: .PP meillo@6: Quality matters little when the software is initially built, meillo@9: but it matters in the usage and maintenance of the software. meillo@6: A short-sighted person might see developing software mainly as building something up. meillo@6: Reality shows that building the software the first time is only a small part meillo@6: of the overall work. meillo@9: Bug fixing, extending, rebuilding of parts \(en in short: maintenance work \(en meillo@6: soon takes over the major part of the time spent on software. meillo@6: Not to forget the usage of the software. meillo@6: These processes are highly influenced by the software's quality. meillo@6: Thus, quality should never be neglected. meillo@6: The problem is that you hardly ``stumble over'' bad quality during the first build, meillo@6: but this is the time when you should care about good quality most. meillo@6: .PP meillo@6: Software design is not about the basic function of software; meillo@6: this requirement will get satisfied anyway, as it is the main driving force behind the development. meillo@6: Software design is about quality aspects of the software.
meillo@6: Good design will lead to good quality, bad design to bad quality. meillo@6: The primary functions of the software will be affected modestly by bad quality, meillo@6: but good quality can provide a lot of additional gain from the software, meillo@6: even at places where one never expected it. meillo@6: .PP meillo@6: The ISO/IEC 9126-1 standard, part 1, meillo@6: .[ meillo@9: %I International Organization for Standardization meillo@6: %T ISO Standard 9126: Software Engineering \(en Product Quality, part 1 meillo@6: %C Geneva meillo@6: %D 2001 meillo@6: .] meillo@6: defines the quality model as consisting of: meillo@6: .IP \(bu meillo@6: .I Functionality meillo@6: (suitability, accuracy, inter\%operability, security) meillo@6: .IP \(bu meillo@6: .I Reliability meillo@6: (maturity, fault tolerance, recoverability) meillo@6: .IP \(bu meillo@6: .I Usability meillo@6: (understandability, learnability, operability, attractiveness) meillo@6: .IP \(bu meillo@6: .I Efficiency meillo@9: (time behavior, resource utilization) meillo@6: .IP \(bu meillo@6: .I Maintainability meillo@23: (analyzability, changeability, stability, testability) meillo@6: .IP \(bu meillo@6: .I Portability meillo@6: (adaptability, installability, co-existence, replaceability) meillo@6: .LP meillo@6: These goals are parts of a software's design. meillo@6: Good design can give these properties to software; meillo@6: badly designed software will lack them. meillo@7: .PP meillo@7: One further goal of software design is consistency. meillo@7: Consistency eases understanding, working on, and using things. meillo@7: Consistent internals and consistent interfaces to the outside can be provided by good design. meillo@7: .PP meillo@7: We should design software because good design avoids many problems during a software's lifetime. meillo@7: And we should design software because good design can offer much gain, meillo@7: which can be unrelated to the software's main intent.
meillo@7: Indeed, we should put much effort into good design to make the software more valuable. meillo@7: The Unix Philosophy shows how to design software well. meillo@7: It offers guidelines to achieve good quality and high gain for the effort spent. meillo@0: meillo@0: meillo@0: .NH 1 meillo@0: The Unix Philosophy meillo@4: .LP meillo@4: The origins of the Unix Philosophy were already introduced. meillo@8: This chapter explains the philosophy, oriented on Gancarz, meillo@8: and shows concrete examples of its application. meillo@5: meillo@16: .NH 2 meillo@14: Pipes meillo@4: .LP meillo@4: Following are some examples to demonstrate what applied Unix Philosophy feels like. meillo@4: Knowledge of using the Unix shell is assumed. meillo@4: .PP meillo@4: Counting the number of files in the current directory: meillo@9: .DS I 2n meillo@4: .CW meillo@9: .ps -1 meillo@4: ls | wc -l meillo@4: .DE meillo@4: The meillo@4: .CW ls meillo@4: command lists all files in the current directory, one per line, meillo@4: and meillo@4: .CW "wc -l meillo@8: counts the number of lines. meillo@4: .PP meillo@8: Counting the number of files that do not contain ``foo'' in their name: meillo@9: .DS I 2n meillo@4: .CW meillo@9: .ps -1 meillo@4: ls | grep -v foo | wc -l meillo@4: .DE meillo@4: Here, the list of files is filtered by meillo@4: .CW grep meillo@4: to remove all that contain ``foo''. meillo@4: The rest is the same as in the previous example. meillo@4: .PP meillo@4: Finding the five largest entries in the current directory: meillo@9: .DS I 2n meillo@4: .CW meillo@9: .ps -1 meillo@4: du -s * | sort -nr | sed 5q meillo@4: .DE meillo@4: .CW "du -s * meillo@4: returns the recursively summed sizes of all files meillo@8: \(en no matter if they are regular files or directories. meillo@4: .CW "sort -nr meillo@4: sorts the list numerically in reverse order. meillo@4: Finally, meillo@4: .CW "sed 5q meillo@4: quits after it has printed the fifth line.
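.PP
The counting pipelines above can be tried without touching real data.
The following is only a minimal sketch, assuming a POSIX shell and the common mktemp utility;
the helper name count_without and the file names are hypothetical and serve as illustration only.

```shell
#!/bin/sh
# count_without PATTERN DIR (hypothetical helper, not part of any toolchest):
# count the entries of DIR whose names do not contain PATTERN.
count_without() {
    # ls writes one name per line into a pipe, grep -v drops the matches,
    # wc -l counts what is left, tr strips the padding some wc versions add.
    ls "$2" | grep -v "$1" | wc -l | tr -d ' '
}

# Try it in a scratch directory with made-up file names.
dir=$(mktemp -d) || exit 1
touch "$dir/foo.txt" "$dir/foobar.txt" "$dir/bar.txt" "$dir/baz.txt"
count_without foo "$dir"
rm -r "$dir"
```

Each stage knows nothing about its neighbors; the helper merely gives the pipeline from the text a name.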
meillo@4: .PP meillo@4: The presented command lines are examples of what Unix people would use meillo@4: to get the desired output. meillo@4: There are also other ways to get the same output. meillo@4: It's a user's decision which way to go. meillo@14: .PP meillo@8: The examples show that many tasks on a Unix system meillo@4: are accomplished by combining several small programs. meillo@4: The connection between the single programs is denoted by the pipe operator `|'. meillo@4: .PP meillo@4: Pipes, and their extensive and easy use, are one of the great meillo@4: achievements of the Unix system. meillo@4: Pipes between programs had been possible in earlier operating systems, meillo@4: but they had never been such a central part of the concept. meillo@4: When, in the early seventies, Doug McIlroy introduced pipes for the meillo@4: Unix system, meillo@4: ``it was this concept and notation for linking several programs together meillo@4: that transformed Unix from a basic file-sharing system to an entirely new way of computing.'' meillo@4: .[ meillo@4: %T Unix: An Oral History meillo@5: %O .CW \s-1http://www.princeton.edu/~hos/frs122/unixhist/finalhis.htm meillo@4: .] meillo@4: .PP meillo@4: Being able to specify pipelines in an easy way is, meillo@4: however, not enough by itself. meillo@5: It is only one half. meillo@4: The other is the design of the programs that are used in the pipeline. meillo@8: They have to have interfaces that allow them to be used in such a way. meillo@5: meillo@16: .NH 2 meillo@14: Interface design meillo@5: .LP meillo@11: Unix is, first of all, simple \(en everything is a file. meillo@5: Files are sequences of bytes, without any special structure. meillo@5: Programs should be filters, which read a stream of bytes from ``standard input'' (stdin) meillo@5: and write a stream of bytes to ``standard output'' (stdout).
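.PP
What such a filter looks like can be sketched in a few lines of shell.
This is an illustration only, assuming a POSIX shell;
the function names shout and first are hypothetical.

```shell
#!/bin/sh
# Two hypothetical filters. Each reads bytes from stdin and writes bytes
# to stdout; neither knows where its input comes from or where it goes.
shout() {
    tr '[:lower:]' '[:upper:]'    # upper-case everything passing through
}
first() {
    sed "${1:-1}q"                # pass on the first N lines (default 1)
}

# Because both share the same byte-stream interface, they combine freely.
printf 'one\ntwo\nthree\n' | shout | first 2
```

Neither function takes a file name or asks the user anything; the input and output are simply whatever the surrounding pipeline connects.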
meillo@5: .PP meillo@8: If the files \fIare\fP sequences of bytes, meillo@8: and the programs \fIare\fP filters on byte streams, meillo@11: then there is exactly one standardized data interface. meillo@5: Thus it is possible to combine them in any desired way. meillo@5: .PP meillo@5: Even a handful of small programs will yield a large set of combinations, meillo@5: and thus a large set of different functions. meillo@5: This is leverage! meillo@5: If the programs are orthogonal to each other \(en the best case \(en meillo@5: then the set of different functions is greatest. meillo@5: .PP meillo@11: Programs might also have a separate control interface, meillo@11: besides their data interface. meillo@11: The control interface is often called ``user interface'', meillo@11: because it is usually designed to be used by humans. meillo@11: The Unix Philosophy discourages assuming that the user is human. meillo@11: Interactive use of software is slow use of software, meillo@11: because the program waits for user input most of the time. meillo@11: Interactive software requires the user to be in front of the computer meillo@11: all the time. meillo@11: Interactive software occupies the user's attention while it is running. meillo@11: .PP meillo@11: Now we come back to the idea of using several small programs, combined, meillo@11: to have a more specific function. meillo@11: If these single tools were all interactive, meillo@11: how would the user control them? meillo@11: It is not only a problem to control several programs at once if they run at the same time, meillo@11: it is also very inefficient to have to control each of the single programs meillo@11: that are intended to work as one large program. meillo@11: Hence, the Unix Philosophy discourages programs from demanding interactive use. meillo@11: The behavior of programs should be defined at invocation. meillo@11: This is done by specifying arguments (``command line switches'') to the program call.
meillo@11: Gancarz discusses this topic as ``avoid captive user interfaces''. meillo@11: .[ meillo@11: %A Mike Gancarz meillo@11: %T The UNIX Philosophy meillo@11: %I Digital Press meillo@11: %D 1995 meillo@11: %P 88 ff. meillo@11: .] meillo@11: .PP meillo@11: Non-interactive use is, during development, also an advantage for testing. meillo@11: Testing of interactive programs is much more complicated meillo@11: than testing of non-interactive programs. meillo@5: meillo@16: .NH 2 meillo@8: The toolchest approach meillo@5: .LP meillo@5: A toolchest is a set of tools. meillo@5: Instead of having one big tool for all tasks, one has many small tools, meillo@5: each for one task. meillo@5: Difficult tasks are solved by combining several of the small, simple tools. meillo@5: .PP meillo@11: The Unix toolchest \fIis\fP a set of small, (mostly) non-interactive programs meillo@11: that are filters on byte streams. meillo@11: They are, to a large extent, unrelated in their function. meillo@11: Hence, the Unix toolchest provides a large set of functions meillo@11: that can be accessed by combining the programs in the desired way. meillo@11: .PP meillo@11: There are also advantages for developing small toolchest programs. meillo@5: It is easier and less error-prone to write small programs. meillo@5: It is also easier and less error-prone to write a large set of small programs meillo@5: than to write one large program with all the functionality included. meillo@5: If the small programs are combinable, then they offer an even larger set meillo@5: of functions than the single large program. meillo@5: Hence, one gets two advantages out of writing small, combinable programs. meillo@5: .PP meillo@5: There are two drawbacks of the toolchest approach. meillo@8: First, one simple, standardized, unidirectional interface has to be sufficient. meillo@5: If one feels the need for more ``logic'' than a stream of bytes, meillo@8: then a different approach might be needed.
meillo@13: But it is also possible that one just cannot imagine a design where meillo@8: a stream of bytes is sufficient. meillo@8: By becoming more familiar with the ``Unix style of thinking'', meillo@8: developers will more often and more easily find simple designs where meillo@8: a stream of bytes is a sufficient interface. meillo@8: .PP meillo@8: The second drawback of a toolchest affects the users. meillo@5: A toolchest is often more difficult to use for novices. meillo@9: It is necessary to become familiar with each of the tools, meillo@5: to be able to use the right one in a given situation. meillo@9: Additionally, one needs to combine the tools in a sensible way on one's own. meillo@9: This is like a sharp knife \(en it is a powerful tool in the hand of a master, meillo@5: but of little value in the hand of an unskilled person. meillo@5: .PP meillo@8: However, learning a single, small tool of the toolchest is easier than meillo@8: learning a complex tool. meillo@8: The user will have a basic understanding of a yet unknown tool, meillo@8: if the several tools of the toolchest have a common style. meillo@8: He will be able to transfer knowledge about one tool to another. meillo@8: .PP meillo@8: Moreover, the second drawback can be removed easily by adding wrappers meillo@8: around the single tools. meillo@5: Novice users do not need to learn several tools if a professional wraps meillo@8: the single commands into a more high-level script. meillo@5: Note that the wrapper script still calls the small tools; meillo@5: the wrapper script is just like a skin around them. meillo@8: No complexity is added this way, meillo@8: but new programs can be created out of existing ones with very low effort.
meillo@5: .PP meillo@5: A wrapper script for finding the five largest entries in the current directory meillo@5: could look like this: meillo@9: .DS I 2n meillo@5: .CW meillo@9: .ps -1 meillo@5: #!/bin/sh meillo@5: du -s * | sort -nr | sed 5q meillo@5: .DE meillo@5: The script itself is just a text file that calls the command line meillo@5: a professional user would type in directly. meillo@8: Making the program flexible in the number of entries it prints meillo@8: is easily possible: meillo@9: .DS I 2n meillo@8: .CW meillo@9: .ps -1 meillo@8: #!/bin/sh meillo@8: num=5 meillo@8: [ $# -eq 1 ] && num="$1" meillo@8: du -s * | sort -nr | sed "${num}q" meillo@8: .DE meillo@8: This script acts like the one before, when called without an argument. meillo@8: But one can also specify a numerical argument to define the number of lines to print. meillo@5: meillo@16: .NH 2 meillo@8: A powerful shell meillo@8: .LP meillo@10: It was already said that the Unix shell provides the possibility to meillo@10: combine small programs into large ones easily. meillo@10: A powerful shell is a great feature in other ways, too. meillo@8: .PP meillo@10: For instance, it includes a scripting language. meillo@10: The control statements are built into the shell. meillo@8: The functions, however, are the normal programs everyone can use on the system. meillo@10: Thus, the programs are known, so learning to program in the shell is easy. meillo@8: Using normal programs as functions in the shell programming language meillo@10: is only possible because they are small and combinable tools in a toolchest style. meillo@8: .PP meillo@8: The Unix shell encourages writing small scripts out of other programs, meillo@8: because it is so easy to do. meillo@8: This is a great step towards automation. meillo@8: It is wonderful if the effort to automate a task equals the effort meillo@8: it takes to do it the second time by hand.
meillo@8: If it is so, then the user will be happy to automate everything he does more than once. meillo@8: .PP meillo@8: Small programs that do one job well, standardized interfaces between them, meillo@8: a mechanism to combine parts to larger parts, and an easy way to automate tasks, meillo@8: this will inevitably produce software leverage. meillo@8: Getting multiple times the benefit of an investment is a great offer. meillo@10: .PP meillo@10: The shell also encourages rapid prototyping. meillo@10: Many well known programs started as quickly hacked shell scripts, meillo@10: and turned into ``real'' programs, written in C, later. meillo@10: Building a prototype first is a way to avoid the biggest problems meillo@10: in application development. meillo@10: Fred Brooks writes in ``No Silver Bullet'': meillo@10: .[ meillo@10: %A Frederick P. Brooks, Jr. meillo@10: %T No Silver Bullet: Essence and Accidents of Software Engineering meillo@10: %B Information Processing 1986, the Proceedings of the IFIP Tenth World Computing Conference meillo@10: %E H.-J. Kugler meillo@10: %D 1986 meillo@10: %P 1069\(en1076 meillo@10: %I Elsevier Science B.V. meillo@10: %C Amsterdam, The Netherlands meillo@10: .] meillo@10: .QP meillo@10: The hardest single part of building a software system is deciding precisely what to build. meillo@10: No other part of the conceptual work is so difficult as establishing the detailed meillo@10: technical requirements, [...]. meillo@10: No other part of the work so cripples the resulting system if done wrong. meillo@10: No other part is more difficult to rectify later. meillo@10: .PP meillo@10: Writing a prototype is a great method to become familiar with the requirements meillo@10: and to actually run into real problems. meillo@10: Today, prototyping is often seen as a first step in building a software. meillo@10: This is, of course, good. 
meillo@10: However, the Unix Philosophy has an \fIadditional\fP perspective on prototyping: meillo@10: After having built the prototype, one might notice that the prototype is already meillo@10: \fIgood enough\fP. meillo@10: Hence, a reimplementation in a more sophisticated programming language might not be needed meillo@10: for the moment. meillo@23: Maybe later, it might be necessary to rewrite the software, but not now. meillo@10: .PP meillo@10: By delaying further work, one keeps the flexibility to react easily to meillo@10: changing requirements. meillo@10: Software parts that are not written will not miss the requirements. meillo@10: meillo@16: .NH 2 meillo@10: Worse is better meillo@10: .LP meillo@10: The Unix Philosophy aims for the 80% solution; meillo@10: others call it the ``Worse is better'' approach. meillo@10: .PP meillo@10: First, practical experience shows that it is almost never possible to define the meillo@10: requirements completely and correctly the first time. meillo@10: Hence one should not try to; it will fail anyway. meillo@10: Second, practical experience shows that requirements change over time. meillo@10: Hence it is best to delay requirement-based design decisions as long as possible. meillo@10: Also, the software should be small and flexible as long as possible meillo@10: to react to changing requirements. meillo@10: Shell scripts, for example, are more easily adjusted than C programs. meillo@10: Third, practical experience shows that maintenance is hard work. meillo@10: Hence, one should keep the amount of software as small as possible; meillo@10: it should just fulfill the \fIcurrent\fP requirements. meillo@10: Software parts that will be written later do not need maintenance now. meillo@10: .PP meillo@10: Starting with a prototype in a scripting language has several advantages: meillo@10: .IP \(bu meillo@10: As the initial effort is low, one will likely start right away.
meillo@10: .IP \(bu meillo@10: As working parts are available soon, the real requirements can be identified soon. meillo@10: .IP \(bu meillo@10: When software is usable, it gets used, and thus tested. meillo@10: Hence problems will be found at early stages of the development. meillo@10: .IP \(bu meillo@10: The prototype might be enough for the moment, meillo@10: thus further work on the software can be delayed to a time meillo@10: when one knows more about the requirements and problems meillo@10: than now. meillo@10: .IP \(bu meillo@10: Implementing only the parts that are actually needed now meillo@10: requires less maintenance work. meillo@10: .IP \(bu meillo@10: If the global situation changes so that the software is not needed anymore, meillo@10: then less effort was spent on the project than would have been meillo@10: if a different approach had been used. meillo@10: meillo@16: .NH 2 meillo@11: Upgrowth and survival of software meillo@11: .LP meillo@12: So far, we have talked about \fIwriting\fP or \fIbuilding\fP software. meillo@13: Although these are just verbs, they do imply a specific view on the work process meillo@13: they describe. meillo@12: The better verb, however, is to \fIgrow\fP. meillo@12: .PP meillo@12: Creating software in the sense of the Unix Philosophy is an incremental process. meillo@12: It starts with a first prototype, which evolves as requirements change. meillo@12: A quickly hacked shell script might become a large, sophisticated, meillo@13: compiled program this way. meillo@13: Its lifetime begins with the initial prototype and ends when the software is not used anymore. meillo@13: While being alive it will get extended, rearranged, and rebuilt (from scratch). meillo@12: Growing software matches the view that ``software is never finished. It is only released.'' meillo@12: .[ meillo@13: %O FIXME meillo@13: %A Mike Gancarz meillo@13: %T The UNIX Philosophy meillo@13: %P 26 meillo@12: .]
meillo@12: .PP meillo@13: Software can be seen as being controlled by evolutionary processes. meillo@13: Successful software is software that is used by many for a long time. meillo@12: This implies that the software is needed, useful, and better than alternatives. meillo@12: Darwin talks about: ``The survival of the fittest.'' meillo@12: .[ meillo@13: %O FIXME meillo@13: %A Charles Darwin meillo@12: .] meillo@12: Transferred to software: The most successful software is the fittest, meillo@12: the one that survives. meillo@13: (This may be at the level of one creature, or at the level of one species.) meillo@13: The fitness of software is affected mainly by four properties: meillo@15: portability of code, portability of data, range of usability, and reusability of parts. meillo@15: .\" .IP \(bu meillo@15: .\" portability of code meillo@15: .\" .IP \(bu meillo@15: .\" portability of data meillo@15: .\" .IP \(bu meillo@15: .\" range of usability meillo@15: .\" .IP \(bu meillo@15: .\" reuseability of parts meillo@13: .PP meillo@15: (1) meillo@15: .I "Portability of code meillo@15: means using high-level programming languages, meillo@13: sticking to the standard, meillo@13: and avoiding optimizations that introduce dependencies on specific hardware. meillo@13: Hardware has a much shorter lifetime than software. meillo@13: By chaining software to specific hardware, meillo@13: the software's lifetime gets shortened to that of this hardware. meillo@13: In contrast, software should be easy to port \(en meillo@23: adaptation is the key to success. meillo@13: .\" cf. practice of prog: ch08 meillo@13: .PP meillo@15: (2) meillo@15: .I "Portability of data meillo@15: is best achieved by avoiding binary representations meillo@13: to store data, because binary representations differ from machine to machine. meillo@23: Textual representation is favored. meillo@13: Historically, ASCII was the charset of choice.
meillo@13: In the future, UTF-8 might be the better choice, however. meillo@13: What matters is that it is a plain text representation in a meillo@13: very common charset encoding. meillo@13: Apart from being able to transfer data between machines, meillo@13: readable data has the great advantage that humans are able meillo@13: to directly edit it with text editors and other tools from the Unix toolchest. meillo@13: .\" gancarz tenet 5 meillo@13: .PP meillo@15: (3) meillo@15: A large meillo@15: .I "range of usability meillo@23: ensures good adaptation, and thus good survival. meillo@13: It is a special distinction if software comes to be used in fields of action meillo@13: the original authors never imagined. meillo@13: Software that solves problems in a general way will likely be used meillo@13: for all kinds of similar problems. meillo@13: Being too specific limits the range of uses. meillo@13: Requirements change through time, thus use cases change or even vanish. meillo@13: A good example of this is Allman's sendmail. meillo@13: Allman identifies flexibility to be one major reason for sendmail's success: meillo@13: .[ meillo@13: %O FIXME meillo@13: %A Allman meillo@13: %T sendmail meillo@13: .] meillo@13: .QP meillo@13: Second, I limited myself to the routing function [...]. meillo@13: This was a departure from the dominant thought of the time, [...]. meillo@13: .QP meillo@13: Third, the sendmail configuration file was flexible enough to adopt meillo@13: to a rapidly changing world [...]. meillo@12: .LP meillo@13: Successful software adapts itself to the changing world. meillo@13: .PP meillo@15: (4) meillo@15: .I "Reuse of parts meillo@15: goes even one step further. meillo@13: A software may completely lose its field of action, meillo@13: but parts of which the software is built may be general and independent enough meillo@13: to survive this death.
meillo@13: If software is built by combining small independent programs, meillo@13: then there are parts readily available for reuse. meillo@13: Who cares if the large program is a failure, meillo@13: but parts of it become successful instead? meillo@10: meillo@16: .NH 2 meillo@14: Summary meillo@0: .LP meillo@14: This chapter explained the central ideas of the Unix Philosophy. meillo@14: For each of the ideas, the advantages it introduces were shown. meillo@14: The Unix Philosophy is a set of guidelines that help to write valuable software. meillo@14: From the viewpoint of a software developer or software designer, meillo@14: the Unix Philosophy provides answers to many software design problems. meillo@14: .PP meillo@14: The various ideas of the Unix Philosophy are closely interwoven meillo@14: and can hardly be applied independently. meillo@14: However, probably the most important messages are: meillo@14: .I "``Do one thing well!''" , meillo@14: .I "``Keep it simple!''" , meillo@14: and meillo@14: .I "``Use software leverage!'' meillo@0: meillo@8: meillo@8: meillo@0: .NH 1 meillo@19: Case study: \s-1MH\s0 meillo@18: .LP meillo@18: The last chapter introduced and explained the Unix Philosophy meillo@18: from a general point of view. meillo@23: The driving force was the guidelines; references to meillo@18: existing software were given only sparsely. meillo@18: In this and the next chapter, concrete software will be meillo@18: the driving force in the discussion. meillo@18: .PP meillo@23: This first case study is about the mail user agents (\s-1MUA\s0) meillo@23: \s-1MH\s0 (``mail handler'') and its descendant \fInmh\fP meillo@23: (``new mail handler''). meillo@23: \s-1MUA\s0s provide functions to read, compose, and organize mail, meillo@23: but (ideally) not to transfer. meillo@19: In this document, the name \s-1MH\s0 will be used for both of them. meillo@19: A distinction will only be made if differences between meillo@19: them are described.
meillo@18: meillo@0: meillo@0: .NH 2 meillo@19: Historical background meillo@0: .LP meillo@19: Electronic mail was available in Unix very early. meillo@19: The first \s-1MUA\s0 on Unix was \f(CWmail\fP. meillo@19: It was a small program that either printed the user's own mailbox file meillo@19: or appended text to someone else's mailbox file, meillo@19: depending on the command line arguments. meillo@19: .[ meillo@19: %O http://cm.bell-labs.com/cm/cs/who/dmr/pdfs/man12.pdf meillo@19: .] meillo@19: It was a program that did one job well. meillo@23: This job was emailing, which was very simple then. meillo@19: .PP meillo@23: Later, emailing became more powerful, and thus more complex. meillo@19: The simple \f(CWmail\fP, which knew nothing of subjects, meillo@19: independent handling of single messages, meillo@19: and long-term storage of them, was not powerful enough anymore. meillo@19: At Berkeley, Kurt Shoens wrote \fIMail\fP (with capital `M') meillo@19: in 1978 to provide additional functions for emailing. meillo@19: Mail was still one program, but now it was large and did meillo@19: several jobs. meillo@23: Its user interface is modeled after that of \fIed\fP. meillo@19: It is designed for humans, but is still scriptable. meillo@23: \fImailx\fP is the adaptation of Berkeley Mail into System V. meillo@19: .[ meillo@19: %A Gunnar Ritter meillo@19: %O http://heirloom.sourceforge.net/mailx_history.html meillo@19: .] meillo@19: Elm, pine, mutt, and today a whole bunch of graphical \s-1MUA\s0s meillo@19: followed Mail's direction. meillo@19: They are large, monolithic programs which include all emailing functions. meillo@19: .PP meillo@23: A different way was taken by the people at the \s-1RAND\s0 Corporation. meillo@19: In the beginning, they had also used a monolithic mail system, meillo@23: called \s-1MS\s0 simply for ``mail system''.
meillo@19: But in 1977, Stockton Gaines and Norman Shapiro meillo@19: came up with a proposal for a new email system concept \(en meillo@19: one that honors the Unix Philosophy. meillo@19: The concept was implemented by Bruce Borden in 1978 and 1979. meillo@19: This was the birth of \s-1MH\s0 \(en the ``mail handler''. meillo@18: .PP meillo@18: Since then, \s-1RAND\s0, the University of California at Irvine and meillo@19: at Berkeley, and several others have contributed to the software. meillo@18: However, its core concepts remained the same. meillo@23: In the late 90s, when development of \s-1MH\s0 slowed down, meillo@19: Richard Coleman started with \fInmh\fP, the new mail handler. meillo@19: His goal was to improve \s-1MH\s0, especially with regard to meillo@23: the requirements of modern emailing. meillo@19: Today, nmh is developed by various people on the Internet. meillo@18: .[ meillo@18: %T RAND and the Information Evolution: A History in Essays and Vignettes meillo@18: %A Willis H. Ware meillo@18: %D 2008 meillo@18: %I The RAND Corporation meillo@18: %P 128\(en137 meillo@18: %O .CW \s-1http://www.rand.org/pubs/corporate_pubs/CP537/ meillo@18: .] meillo@18: .[ meillo@18: %T MH & xmh: Email for Users & Programmers meillo@18: %A Jerry Peek meillo@18: %D 1991, 1992, 1995 meillo@18: %I O'Reilly & Associates, Inc. meillo@18: %P Appendix B meillo@18: %O Also available online: \f(CW\s-2http://rand-mh.sourceforge.net/book/\fP meillo@18: .] meillo@0: meillo@0: .NH 2 meillo@20: Contrasts to monolithic mail systems meillo@0: .LP meillo@19: All \s-1MUA\s0s are monolithic, except \s-1MH\s0. meillo@20: This might not be literally true, meillo@20: but it reflects the situation pretty well. meillo@19: .PP meillo@19: While monolithic \s-1MUA\s0s gather all functions in one program, meillo@19: \s-1MH\s0 is a toolchest of many small tools \(en one for each job.
meillo@23: The following is a list of important programs of \s-1MH\s0's toolchest meillo@23: and their function: meillo@19: .IP \(bu meillo@19: .CW inc : meillo@19: incorporate new mail meillo@19: .IP \(bu meillo@19: .CW scan : meillo@19: list messages in folder meillo@19: .IP \(bu meillo@19: .CW show : meillo@19: show message meillo@19: .IP \(bu meillo@19: .CW next\fR/\fPprev : meillo@19: show next/previous message meillo@19: .IP \(bu meillo@19: .CW folder : meillo@19: change current folder meillo@19: .IP \(bu meillo@19: .CW refile : meillo@19: refile message into folder meillo@19: .IP \(bu meillo@19: .CW rmm : meillo@19: remove message meillo@19: .IP \(bu meillo@19: .CW comp : meillo@19: compose a new message meillo@19: .IP \(bu meillo@19: .CW repl : meillo@19: reply to a message meillo@19: .IP \(bu meillo@19: .CW forw : meillo@19: forward a message meillo@19: .IP \(bu meillo@19: .CW send : meillo@19: send a prepared message meillo@0: .LP meillo@19: \s-1MH\s0 has no special user interface like monolithic \s-1MUA\s0s have. meillo@19: The user does not leave the shell to run \s-1MH\s0, meillo@19: but he uses \s-1MH\s0 within the shell. meillo@23: Using a monolithic program with a captive user interface meillo@23: means ``entering'' the program, using it, and ``exiting'' the program. meillo@23: Using toolchests like \s-1MH\s0 means running programs, meillo@23: alone or in combination with others, even from other toolchests, meillo@23: without leaving the shell. meillo@23: .PP meillo@19: \s-1MH\s0's mail storage is (only little more than) a directory tree meillo@23: where mail folders are directories and mail messages are text files. meillo@19: Working with \s-1MH\s0's toolchest is much like working meillo@19: with Unix' toolchest: meillo@19: \f(CWscan\fP is like \f(CWls\fP, meillo@19: \f(CWshow\fP is like \f(CWcat\fP, meillo@19: \f(CWfolder\fP is like \f(CWcd\fP, meillo@19: \f(CWrefile\fP is like \f(CWmv\fP, meillo@19: and \f(CWrmm\fP is like \f(CWrm\fP.
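The analogy above can be tried without \s-1MH\s0 installed.
The following shell sketch imitates an \s-1MH\s0-style folder with plain
directories and numbered files, and uses only standard Unix tools.
The folder names, addresses, and message contents are invented for illustration,
and the real \s-1MH\s0 programs do more bookkeeping than these stand-ins.

```shell
#!/bin/sh
# Sketch: an MH-style mail folder is just a directory of numbered text files.
# All paths, addresses, and message contents are made up for illustration.
set -eu

mailroot=$(mktemp -d)                 # stands in for the user's MH directory
trap 'rm -rf "$mailroot"' EXIT
mkdir "$mailroot/inbox" "$mailroot/archive"

# Three tiny "messages", one file per message, named by number.
for n in 1 2 3; do
    printf 'From: user%s@example.org\nSubject: test %s\n\nbody %s\n' \
        "$n" "$n" "$n" > "$mailroot/inbox/$n"
done

# scan-like: list message numbers with their subjects, using grep and sed.
listing=$(cd "$mailroot/inbox" && grep '^Subject:' [0-9]* | sed 's/:Subject: */  /')
printf '%s\n' "$listing"

# show-like: display one message.
cat "$mailroot/inbox/2"

# refile-like: move message 3 into another folder.
mv "$mailroot/inbox/3" "$mailroot/archive/1"

# rmm-like: remove message 1.
rm "$mailroot/inbox/1"

remaining=$(ls "$mailroot/inbox")
printf 'left in inbox: %s\n' "$remaining"
```

Because every message is an ordinary file, any other tool from the Unix
toolchest can be applied to the folder in the same way.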
meillo@19: .PP meillo@23: The context of tools in Unix is mainly the current working directory, meillo@19: the user identification, and the environment variables. meillo@19: \s-1MH\s0 extends this context by two more items: meillo@23: .IP \(bu meillo@23: The current mail folder, which is similar to the current working directory. meillo@23: For mail folders, \f(CWfolder\fP provides the corresponding functionality meillo@23: of \f(CWpwd\fP and \f(CWcd\fP for directories. meillo@23: .IP \(bu meillo@23: The current message, relative to the current mail folder, meillo@20: which enables commands like \f(CWnext\fP and \f(CWprev\fP. meillo@23: .LP meillo@19: In contrast to Unix' context, which is tied to the shell session, meillo@19: \s-1MH\s0's context is meant to be tied to a mail account. meillo@20: But actually, the current message is a property of the mail folder, meillo@23: which appears to be a historical legacy. meillo@20: This will cause problems when multiple users work meillo@20: in one mail folder simultaneously. meillo@0: meillo@20: meillo@0: .NH 2 meillo@20: Discussion of the design meillo@0: .LP meillo@20: The following paragraphs discuss \s-1MH\s0 with regard to the tenets meillo@23: of the Unix Philosophy which Gancarz identified. meillo@20: meillo@20: .PP meillo@20: .I "``Small is beautiful'' meillo@20: and meillo@20: .I "``do one thing well'' meillo@20: are two design goals that are directly visible in \s-1MH\s0. meillo@20: Gancarz actually presents \s-1MH\s0 as an example under the headline meillo@20: ``Making UNIX Do One Thing Well'': meillo@20: .QP meillo@20: [\s-1MH\s0] consists of a series of programs which meillo@20: when combined give the user an enormous ability meillo@20: to manipulate electronic mail messages. meillo@20: A complex application, it shows that not only is it meillo@20: possible to build large applications from smaller meillo@20: components, but also that such designs are actually preferable.
meillo@20: .[ meillo@20: %A Mike Gancarz meillo@20: %T unix-phil meillo@20: %P 125 meillo@20: .] meillo@20: .LP meillo@20: The various small programs of \s-1MH\s0 were relatively easy meillo@23: to write, because each of them is small, limited to one function, meillo@23: and has clear boundaries. meillo@20: For the same reasons, they are also easy to maintain. meillo@20: Furthermore, the system can easily be extended. meillo@20: One only needs to put a new program into the toolchest. meillo@23: This was done, for instance, when \s-1MIME\s0 support was added meillo@20: (e.g. \f(CWmhbuild\fP). meillo@20: Also, different programs can exist to do basically the same job meillo@20: in different ways (e.g. in nmh: \f(CWshow\fP and \f(CWmhshow\fP). meillo@20: If someone needs a mail system with some additional meillo@23: functions that are not available anywhere yet, he had best take a meillo@20: toolchest system like \s-1MH\s0 where he can add the meillo@20: functionality with little work. meillo@20: meillo@20: .PP meillo@20: .I "Data storage. meillo@20: How \s-1MH\s0 stores data was already mentioned. meillo@20: Mail folders are directories (which contain a file meillo@20: \&\f(CW.mh_sequences\fP) under the user's \s-1MH\s0 directory meillo@20: (usually \f(CW$HOME/Mail\fP). meillo@23: Mail messages are text files located in mail folders. meillo@20: The files contain the messages as they were received. meillo@20: The messages are numbered in ascending order in each folder. meillo@20: This mailbox format is called ``\s-1MH\s0'' after the \s-1MUA\s0. meillo@20: Alternatives are \fImbox\fP and \fImaildir\fP. meillo@20: In the mbox format all messages are stored within one file. meillo@20: This was a good solution in the early days, when messages meillo@20: were only a few lines of text and were deleted soon. meillo@20: Today, when single messages often include several megabytes meillo@20: of attachments, it is a bad solution.
meillo@20: Another disadvantage of the mbox format is that it is meillo@20: more difficult to write tools that work on mail messages, meillo@23: because it is always necessary to first find and extract meillo@20: the relevant message in the mbox file. meillo@23: With the \s-1MH\s0 mailbox format, meillo@23: each message is a self-contained item, by definition. meillo@20: Also, the problem of concurrent access to one mailbox is meillo@20: reduced to the problem of concurrent access to one message. meillo@20: However, the issue of the shared parts of the context, meillo@20: as mentioned above, remains. meillo@20: Maildir is generally similar to \s-1MH\s0's format, meillo@20: but modified towards guaranteed reliability. meillo@20: This involves some complexity, unfortunately. meillo@20: meillo@20: .PP meillo@20: .I "``Avoid captive user interfaces.'' meillo@19: \s-1MH\s0 is perfectly suited for non-interactive use. meillo@19: It offers all functions directly and without captive user interfaces. meillo@19: If users want a graphical user interface anyway, meillo@20: they can have it with \fIxmh\fP or \fIexmh\fP. meillo@19: These are graphical frontends for the \s-1MH\s0 toolchest. meillo@19: This means that all email-related work is still done by \s-1MH\s0 tools, meillo@20: but the frontend issues the appropriate calls when the user meillo@20: clicks on a button. meillo@20: Providing easy-to-use user interfaces in the form of frontends is a good meillo@19: approach, because it does not limit the power of the backend itself. meillo@20: In any case, the frontend will only be able to make a subset of the meillo@23: backend's power and flexibility available to the user. meillo@20: But if it is a separate program, meillo@20: then the missing parts can still be accessed at the backend directly. meillo@19: If it is integrated, then this will hardly be possible.
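Returning to the storage formats compared above, the following shell sketch
contrasts how a tool gets at ``message 2'' in an mbox file and in an
\s-1MH\s0-style folder.
It is only an illustration under simplifying assumptions: the file names and
messages are invented, and the awk one-liners ignore the quoting of body lines
that start with ``From '', which a real mbox parser has to handle.

```shell
#!/bin/sh
# Sketch: extracting "message 2" from an mbox file versus an MH-style folder.
# File names and message contents are made up for illustration.
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# A two-message mbox: each message starts with a "From " separator line.
cat > "$work/mbox" <<'EOF'
From alice@example.org Fri Jan  1 00:00:00 2010
Subject: first

hello
From bob@example.org Sat Jan  2 00:00:00 2010
Subject: second

world
EOF

# mbox: the tool must scan the file and count separator lines first.
from_mbox=$(awk -v want=2 '/^From / { n++; next } n == want' "$work/mbox")

# MH-style: store the same messages as separate numbered files.
mkdir "$work/inbox"
awk -v dir="$work/inbox" '/^From / { n++; next } { print > (dir "/" n) }' "$work/mbox"

# MH-style: the message is simply the file named 2.
from_mh=$(cat "$work/inbox/2")

printf '%s\n' "$from_mh"
```

Both variables end up holding the same text, but only the mbox variant needs
knowledge of the container format to find it.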
meillo@19: meillo@19: .PP meillo@20: .I "``Choose portability over efficiency'' meillo@20: and meillo@20: .I "``use shell scripts to increase leverage and portability'' . meillo@20: These two tenets are indirectly, but nicely, demonstrated by meillo@20: Bolsky and Korn in their book about the Korn shell. meillo@20: .[ meillo@20: %T The KornShell: command and programming language meillo@20: %A Morris I. Bolsky meillo@20: %A David G. Korn meillo@20: %I Prentice Hall meillo@20: %D 1989 meillo@20: %O \s-1ISBN\s0: 0-13-516972-0 meillo@20: .] meillo@20: They demonstrated, in one chapter of the book, a basic implementation meillo@20: of a subset of \s-1MH\s0 in ksh scripts. meillo@20: Of course, this was just a demonstration, but a brilliant one. meillo@20: It shows how quickly one can implement such a prototype with shell scripts, meillo@20: and how readable they are. meillo@20: The implementation in the scripting language may not be very fast, meillo@20: but it can be fast enough, and this is all that matters. meillo@20: By having the code in an interpreted language, like the shell, meillo@20: portability becomes a minor issue, if we assume the interpreter meillo@20: to be widespread. meillo@20: This demonstration also shows how easy it is to create individual programs meillo@20: of a toolchest. meillo@23: Most of them comprise less than a hundred lines of shell code. meillo@20: Such small software is easy to write, easy to understand, meillo@20: and thus easy to maintain. meillo@23: A toolchest makes it easier to write only some parts meillo@20: and still obtain a working result. meillo@20: Expanding the toolchest without global changes will likely be meillo@20: possible, too. meillo@20: meillo@20: .PP meillo@20: .I "``Use software leverage to your advantage'' meillo@20: and the lesser tenet meillo@20: .I "``allow the user to tailor the environment'' meillo@20: are ideally followed in the design of \s-1MH\s0.
meillo@21: Tailoring the environment is heavily encouraged by the ability to meillo@21: directly define default options to programs, even different ones meillo@21: depending on the name under which the program was called. meillo@21: Software leverage is heavily encouraged by how easy it is to meillo@21: create shell scripts that run a specific command line, meillo@21: built from several \s-1MH\s0 programs. meillo@21: There is little software that encourages users to tailor their meillo@21: environment and to leverage the software as much as \s-1MH\s0 does. meillo@21: To give just one example: meillo@23: One might prefer a different listing format for the \f(CWscan\fP meillo@21: program. meillo@21: It is possible to take one of the other distributed format files meillo@21: or to write one yourself. meillo@21: To use the format as default for \f(CWscan\fP, a single line, meillo@21: reading meillo@21: .DS meillo@21: .CW meillo@21: scan: -form FORMATFILE meillo@21: .DE meillo@21: must be added to \f(CW.mh_profile\fP. meillo@21: If one wants this different format as an additional command, meillo@23: instead of changing the default, he needs to create a link to meillo@23: \f(CWscan\fP, for instance named \f(CWscan2\fP. meillo@21: The line in \f(CW.mh_profile\fP would then start with \f(CWscan2\fP, meillo@21: as the option should only be in effect when scan was called as meillo@21: \f(CWscan2\fP. meillo@20: meillo@20: .PP meillo@21: .I "``Make every program a filter'' meillo@21: is hard to find in \s-1MH\s0. meillo@21: The reason for this is that most of \s-1MH\s0's tools provide meillo@21: basic file system operations for the mailboxes. meillo@21: \f(CWls\fP, \f(CWcp\fP, \f(CWmv\fP, and \f(CWrm\fP meillo@21: aren't filters either. meillo@23: However, they form a basis on which filters can operate. meillo@23: \s-1MH\s0 does not provide many filters itself, but it is a basis meillo@23: to write filters for.
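The name-dependent defaults described above can be imitated in a few lines of shell.
The following is a sketch of the mechanism only, not \s-1MH\s0's actual code:
the program names lister and lister2, the PROFILE variable, and the profile
file are hypothetical stand-ins for \f(CWscan\fP, \f(CWscan2\fP, and
\f(CW.mh_profile\fP, and \f(CWls\fP plays the role of the real work.

```shell
#!/bin/sh
# Sketch of name-dependent default options, in the spirit of .mh_profile.
# NOT MH's implementation; "lister", "lister2", and PROFILE are hypothetical.
set -eu

work=$(mktemp -d)
trap 'rm -rf "$work"' EXIT

# Profile: "<program name>: <default options>", one entry per line.
cat > "$work/profile" <<'EOF'
lister: -1
lister2: -1 -r
EOF

# The tool looks up its defaults under the name it was invoked as.
cat > "$work/lister" <<'EOF'
#!/bin/sh
name=$(basename "$0")
defaults=$(sed -n "s/^$name: *//p" "$PROFILE")
# $defaults is left unquoted on purpose so it splits into separate options.
exec ls $defaults "$@"
EOF
chmod +x "$work/lister"

# A second name for the same program, as with scan and scan2.
ln -s lister "$work/lister2"

mkdir "$work/data"
touch "$work/data/a" "$work/data/b" "$work/data/c"

PROFILE="$work/profile"
export PROFILE
plain=$("$work/lister" "$work/data")        # uses the "lister:" entry
reversed=$("$work/lister2" "$work/data")    # uses the "lister2:" entry
printf '%s\n---\n%s\n' "$plain" "$reversed"
```

The same executable behaves differently under each name, and the user changes
that behavior by editing a text file instead of the program.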
meillo@21: meillo@21: .PP meillo@21: .I "``Build a prototype as soon as possible'' meillo@21: was again well followed by \s-1MH\s0. meillo@21: This tenet, of course, focuses on early development, which is meillo@21: a long time ago for \s-1MH\s0. meillo@21: But without following this guideline at the very beginning, meillo@23: Bruce Borden might not have convinced the management of \s-1RAND\s0 meillo@23: to ever create \s-1MH\s0. meillo@23: In Bruce's own words: meillo@21: .QP meillo@21: [...] but [Stockton Gaines and Norm Shapiro] were not able meillo@23: to convince anyone that such a system would be fast enough to be usable. meillo@21: I proposed a very short project to prove the basic concepts, meillo@21: and my management agreed. meillo@21: Looking back, I realize that I had been very lucky with my first design. meillo@21: Without nearly enough design work, meillo@21: I built a working environment and some header files meillo@21: with key structures and wrote the first few \s-1MH\s0 commands: meillo@21: inc, show/next/prev, and comp. meillo@21: [...] meillo@21: With these three, I was able to convince people that the structure was viable. meillo@21: This took about three weeks. meillo@21: .[ meillo@21: %O FIXME meillo@21: .] meillo@0: meillo@0: .NH 2 meillo@0: Problems meillo@0: .LP meillo@22: \s-1MH\s0 is certainly not without problems. meillo@22: There are two main problems: one technical, the other about human behavior. meillo@22: .PP meillo@22: \s-1MH\s0 is old and email today is very different from email at the time meillo@22: when \s-1MH\s0 was designed. meillo@22: \s-1MH\s0 adapted to the changes pretty well, but it is limited, meillo@22: for example in development resources. meillo@22: \s-1MIME\s0 support and support for different character encodings meillo@22: are available, but only on a moderate level. meillo@22: More active developers could quickly improve this. meillo@22: It is also limited by design, which is the larger problem.
meillo@22: \s-1IMAP\s0, for example, conflicts with \s-1MH\s0's design to a large extent. meillo@22: These design conflicts are not easily solvable. meillo@22: Possibly, they require a redesign. meillo@22: .PP meillo@22: The other kind of problem is human habits. meillo@22: When in this world almost all \s-1MUA\s0s are monolithic, meillo@22: it is very difficult to convince people to use a toolbox-style \s-1MUA\s0 meillo@22: like \s-1MH\s0. meillo@22: The habits are so strong that even people who understand the concept meillo@22: and advantages of \s-1MH\s0 do not like to switch. meillo@22: Unfortunately, the frontends to \s-1MH\s0, which can provide a familiar look and feel, meillo@22: are not very appealing in contrast to what monolithic \s-1MUA\s0s offer. meillo@20: meillo@20: .NH 2 meillo@20: Summary \s-1MH\s0 meillo@20: .LP meillo@20: flexibility, no redundancy, use the shell meillo@0: meillo@8: meillo@8: meillo@0: .NH 1 meillo@0: Case study: uzbl meillo@0: meillo@0: .NH 2 meillo@0: History meillo@0: .LP meillo@0: uzbl is young meillo@0: meillo@0: .NH 2 meillo@0: Contrasts to similar sw meillo@0: .LP meillo@0: like with nmh meillo@0: .LP meillo@0: addons, plugins, modules meillo@0: meillo@0: .NH 2 meillo@0: Gains of the design meillo@0: .LP meillo@0: meillo@0: .NH 2 meillo@0: Problems meillo@0: .LP meillo@0: broken web meillo@0: meillo@8: meillo@8: meillo@0: .NH 1 meillo@0: Final thoughts meillo@0: meillo@0: .NH 2 meillo@0: Quick summary meillo@0: .LP meillo@0: good design meillo@0: .LP meillo@0: unix phil meillo@0: .LP meillo@0: case studies meillo@0: meillo@0: .NH 2 meillo@0: Why people should choose meillo@0: .LP meillo@0: Make the right choice! meillo@0: meillo@0: .nr PI .5i meillo@0: .rm ]< meillo@0: .de ]< meillo@0: .LP meillo@0: .de FP meillo@0: .IP \\\\$1. meillo@0: \\.. meillo@0: .rm FS FE meillo@0: .. meillo@0: .SH meillo@0: References meillo@0: .[ meillo@0: $LIST$ meillo@0: .] meillo@0: .wh -1p