.nr PS 11
.nr VS 13
.nr lu 0
.de CW
.nr PQ \\n(.f
.if t .ft CW
.ie ^\\$1^^ .if n .ul 999
.el .if n .ul 1
.if t .if !^\\$1^^ \&\\$1\f\\n(PQ\\$2
.if n .if \\n(.$=1 \&\\$1
.if n .if \\n(.$>1 \&\\$1\c
.if n .if \\n(.$>1 \&\\$2
..
.ds [. \ [
.ds .] ]
.\"----------------------------------------
.TL
Why the Unix Philosophy still matters
.AU
markus schnalke
.AB
.ti \n(.iu
This paper explains the importance of the Unix Philosophy for software design.
Today, few software designers are aware of these concepts,
and thus a lot of modern software is more limited than necessary
and makes less use of software leverage than possible.
Knowing and following the guidelines of the Unix Philosophy makes software more valuable.
.AE

.FS
.ps -1
This paper was prepared for the ``Software Analysis'' seminar at the University of Ulm.
The mentor was Professor Schweiggert. 2010-04-05
.br
You may retrieve this document from
.CW \s-1http://marmaro.de/docs \ .
.FE

.NH 1
Introduction
.LP
The Unix Philosophy is the essence of how the Unix operating system,
especially its toolchest, was designed.
It is not a limited set of fixed rules,
but a loose set of guidelines which tell how to write software that
fits well into Unix.
Actually, the Unix Philosophy describes what is common to typical Unix software.
Wikipedia has an accurate definition:
.[
%A Wikipedia
%T Unix philosophy
%P Wikipedia, The Free Encyclopedia
%D 2010-03-21 17:20 UTC
%O .CW \s-1http://en.wikipedia.org/w/index.php?title=Unix_philosophy&oldid=351189719
.]
.QP
.ps -1
The \fIUnix philosophy\fP is a set of cultural norms and philosophical
approaches to developing software based on the experience of leading
developers of the Unix operating system.
.PP
As there is no single definition of the Unix Philosophy,
several people have stated their view on what it comprises.
Best known are:
.IP \(bu
Doug McIlroy's summary: ``Write programs that do one thing and do it well.''
.[
%A M. D. McIlroy
%A E. N. Pinson
%A B. A. Tague
%T UNIX Time-Sharing System: Foreword
%J The Bell System Technical Journal
%D 1978
%V 57
%N 6
%P 1902
.]
.IP \(bu
Mike Gancarz' book ``The UNIX Philosophy''.
.[
%A Mike Gancarz
%T The UNIX Philosophy
%D 1995
%I Digital Press
.]
.IP \(bu
Eric S. Raymond's book ``The Art of UNIX Programming''.
.[
%A Eric S. Raymond
%T The Art of UNIX Programming
%D 2003
%I Addison-Wesley
%O .CW \s-1http://www.faqs.org/docs/artu/
.]
.LP
These different views on the Unix Philosophy have much in common.
In particular, the main concepts are similar in all of them.
McIlroy's definition can surely be called the core of the Unix Philosophy,
but the fundamental idea behind it all is ``small is beautiful''.

.PP
The Unix Philosophy tells how to design and write good software for Unix.
Many concepts described here are based on facilities of Unix.
Other operating systems may not offer such facilities,
hence it may not be possible to design software in the way of the
Unix Philosophy for them.
.PP
The Unix Philosophy has an idea of how the process of software development
should look, but large parts of the philosophy are quite independent
of the development process used.
However, one will soon recognize that some development processes work well
with the ideas of the Unix Philosophy and support them, while others are
at cross-purposes.
Kent Beck's books about Extreme Programming are valuable supplemental
resources.
.PP
The questions of how to actually write code and how the code should look
internally are out of scope here.
``The Practice of Programming'' by Kernighan and Pike,
.[
%A Brian W. Kernighan
%A Rob Pike
%T The Practice of Programming
%I Addison-Wesley
%D 1999
.]
is a good book that covers this topic.
Its point of view matches that of this paper.

.NH 1
Importance of software design in general
.LP
Software design is the planning of how the internal structure
and external interfaces of software should look.
It has nothing to do with visual appearance.
If we compare a program to a car, then its color does not matter.
Its design would be the car's size, its shape, the locations of doors,
the passenger/space ratio, the luggage capacity, and so forth.
.PP
Why should software get designed at all?
It is general knowledge that even a bad plan is better than no plan.
Not designing software means programming without a plan.
This will almost certainly lead to horrible results:
software that is horrible to use and horrible to maintain.
These two aspects are the visible ones.
Often invisible are the gains that could have been had but were wasted.
Good software design can make these gains available.
.PP
The design of software deals with quality properties.
Good design leads to good quality, and quality is important.
Any car may be able to drive from A to B,
but it depends on the car's properties whether it is a good choice
for passenger transport or not.
It depends on its properties whether it is a good choice
for a rough mountain area.
And it depends on its properties whether the ride will be fun.

.PP
Requirements for software are twofold:
functional and non-functional.
.IP \(bu
Functional requirements directly define the software's functions.
They are the reason why software gets written.
Someone has a problem and needs a tool to solve it.
Being able to solve the problem is the main functional goal.
It is the driving force behind all programming effort.
Functional requirements are easier to define and to verify.
.IP \(bu
Non-functional requirements are also called \fIquality\fP requirements.
The quality of software comprises the properties that are not directly related to
the software's basic functions.
Tools of bad quality often solve the problems they were written for,
but introduce problems and difficulties for usage and development later on.
Quality aspects are often overlooked at first sight,
and they are often difficult to define clearly and to verify.
.PP
Quality matters little when the software gets built initially,
but it matters a lot for the usage and maintenance of the software.
A short-sighted person might see software development mainly as building something up.
But experience shows that building the software the first time is
only a small part of the overall work.
Bug fixing, extending, rebuilding of parts
\(en maintenance work, for short \(en
soon takes over the major part of the time spent on the software.
Not to forget the usage of the software.
These processes are highly influenced by the software's quality.
Thus, quality must not be neglected.
The problem with quality is that you hardly ``stumble over''
bad quality during the first build,
but this is the time when you should care about good quality most.
.PP
Software design is less about the basic function of software \(en
this requirement will get satisfied anyway.
Software design is more about quality aspects of the software.
Good design leads to good quality, bad design to bad quality.
The primary functions of the software will be affected modestly by bad quality,
but good quality can provide a lot of additional gain,
even at places where one never expected it.
.PP
The ISO/IEC 9126-1 standard
.[
%I International Organization for Standardization
%T ISO Standard 9126: Software Engineering \(en Product Quality, part 1
%C Geneva
%D 2001
.]
defines the quality model as consisting of:
.IP \(bu
.I Functionality
(suitability, accuracy, inter\%operability, security)
.IP \(bu
.I Reliability
(maturity, fault tolerance, recoverability)
.IP \(bu
.I Usability
(understandability, learnability, operability, attractiveness)
.IP \(bu
.I Efficiency
(time behavior, resource utilization)
.IP \(bu
.I Maintainability
(analyzability, changeability, stability, testability)
.IP \(bu
.I Portability
(adaptability, installability, co-existence, replaceability)
.LP
Good design can improve these properties of software;
badly designed software probably suffers from lacking them.
.PP
One further goal of software design is consistency.
Consistency eases understanding, working on, and using things.
Consistent internal structure and consistent interfaces to the outside
can be provided by good design.
.PP
Software should be well designed because good design avoids many
problems during the software's lifetime.
And software should be well designed because good design can offer
much additional gain.
Indeed, much effort should be spent on good design to make software more valuable.
The Unix Philosophy shows a way to design software well.
It offers guidelines to achieve good quality and high gain for the effort spent.


.NH 1
The Unix Philosophy
.LP
The origins of the Unix Philosophy were already introduced.
This chapter explains the philosophy, following Gancarz,
and shows concrete examples of its application.

.NH 2
Pipes
.LP
Following are some examples to demonstrate what applied Unix Philosophy feels like.
Knowledge of using the Unix shell is assumed.
.PP
Counting the number of files in the current directory:
.DS
.CW
.ps -1
ls | wc -l
.DE
The
.CW ls
command lists all files in the current directory, one per line,
and
.CW "wc -l
counts the number of lines.
.PP
Counting the number of files that do not contain ``foo'' in their name:
.DS
.CW
.ps -1
ls | grep -v foo | wc -l
.DE
Here, the list of files is filtered by
.CW grep
to remove all that contain ``foo''.
The rest is the same as in the previous example.
.PP
Finding the five largest entries in the current directory:
.DS
.CW
.ps -1
du -s * | sort -nr | sed 5q
.DE
.CW "du -s *
returns the recursively summed sizes of all files
\(en no matter if they are regular files or directories.
.CW "sort -nr
sorts the list numerically in reverse order.
Finally,
.CW "sed 5q
quits after it has printed the fifth line.
.PP
The presented command lines are examples of what Unix people would use
to get the desired output.
There are also other ways to get the same output.
It is the user's decision which way to go.
.PP
The examples show that many tasks on a Unix system
are accomplished by combining several small programs.
The connection between the single programs is denoted by the pipe operator `|'.
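.PP
As a further illustration of the same idea, the following is a minimal sketch
of a longer pipeline that reports the most frequent words in its input.
The function name
.CW top_words
and the default of five lines are assumptions made for this example only;
each stage is an ordinary standard tool.

```shell
# top_words [N]: print the N most frequent words read from standard input.
# (Hypothetical helper for illustration; N defaults to 5.)
top_words() {
    tr -cs 'A-Za-z' '\n' |   # split the input into one word per line
    tr 'A-Z' 'a-z' |         # fold everything to lower case
    sort |                   # bring equal words next to each other
    uniq -c |                # count each distinct word
    sort -rn |               # most frequent word first
    sed "${1:-5}q"           # quit after N lines
}

printf 'the cat and the dog and the bird\n' | top_words 2
```

None of the six stages knows anything about word counting;
the function only emerges from their combination.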
.PP
Pipes, and their extensive and easy use, are one of the great
achievements of the Unix system.
Pipes between programs have been possible in earlier operating systems,
but they have never been such a central part of the concept.
When, in the early seventies, Doug McIlroy introduced pipes for the
Unix system,
``it was this concept and notation for linking several programs together
that transformed Unix from a basic file-sharing system to an entirely new way of computing.''
.[
%T Unix: An Oral History
%O .CW \s-1http://www.princeton.edu/~hos/frs122/unixhist/finalhis.htm
.]
.PP
Being able to specify pipelines in an easy way is,
however, not enough by itself.
It is only one half.
The other is the design of the programs that are used in the pipeline.
They have to have interfaces that allow them to be used in such a way.

.NH 2
Interface design
.LP
Unix is, first of all, simple \(en everything is a file.
Files are sequences of bytes, without any special structure.
Programs should be filters, which read a stream of bytes from ``standard input'' (stdin)
and write a stream of bytes to ``standard output'' (stdout).
.PP
If the files \fIare\fP sequences of bytes,
and the programs \fIare\fP filters on byte streams,
then there is exactly one standardized data interface.
Thus it is possible to combine them in any desired way.
.PP
Even a handful of small programs will yield a large set of combinations,
and thus a large set of different functions.
This is leverage!
If the programs are orthogonal to each other \(en the best case \(en
then the set of different functions is greatest.
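.PP
The effect can be sketched with three tiny filters written as shell functions.
The names
.CW nonblank ,
.CW number ,
and
.CW first
are hypothetical and exist only for this illustration;
what matters is that each one reads standard input and writes standard output,
so they can be chained in any order.

```shell
# Three tiny filters (hypothetical names). Each reads stdin and writes stdout.
nonblank() { grep -v '^[[:space:]]*$'; }      # drop blank lines
number()   { awk '{ print NR ": " $0 }'; }    # prefix every line with its number
first()    { sed "${1:-1}q"; }                # pass through only the first N lines

# The same parts, combined differently, yield different functions.
printf 'a\n\nb\n' | nonblank | number    # number only the non-blank lines
printf 'a\n\nb\n' | number | nonblank    # number all lines; none is blank afterwards
printf 'a\n\nb\n' | nonblank | first 1   # print the first non-blank line
```

Because the filters do unrelated jobs, every ordering gives a distinct result;
with overlapping filters, many combinations would merely duplicate each other.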
.PP
Programs might also have a separate control interface,
besides their data interface.
The control interface is often called ``user interface'',
because it is usually designed to be used by humans.
The Unix Philosophy discourages assuming that the user is human.
Interactive use of software is slow use of software,
because the program waits for user input most of the time.
Interactive software requires the user to be in front of the computer
all the time.
Interactive software occupies the user's attention while it is running.
.PP
Now we come back to the idea of using several small programs, combined,
to have a more specific function.
If these single tools were all interactive,
how would the user control them?
It is not only a problem to control several programs at once if they run at the same time,
it is also very inefficient to have to control each of the single programs
that are intended to work as one large program.
Hence, the Unix Philosophy discourages programs from demanding interactive use.
The behavior of programs should be defined at invocation.
This is done by specifying arguments (``command line switches'') to the program call.
Gancarz discusses this topic as ``avoid captive user interfaces''.
.[
%A Mike Gancarz
%T The UNIX Philosophy
%I Digital Press
%D 1995
%P 88 ff.
.]
.PP
Non-interactive use is, during development, also an advantage for testing.
Testing of interactive programs is much more complicated
than testing of non-interactive programs.

.NH 2
The toolchest approach
.LP
A toolchest is a set of tools.
Instead of having one big tool for all tasks, one has many small tools,
each for one task.
Difficult tasks are solved by combining several of the small, simple tools.
.PP
The Unix toolchest \fIis\fP a set of small, (mostly) non-interactive programs
that are filters on byte streams.
They are, to a large extent, unrelated in their function.
Hence, the Unix toolchest provides a large set of functions
that can be accessed by combining the programs in the desired way.
.PP
There are also advantages for developing small toolchest programs.
It is easier and less error-prone to write small programs.
It is also easier and less error-prone to write a large set of small programs
than to write one large program with all the functionality included.
If the small programs are combinable, then they offer an even larger set
of functions than the single large program.
Hence, one gets two advantages out of writing small, combinable programs.
.PP
There are two drawbacks of the toolchest approach.
First, one simple, standardized, unidirectional interface has to be sufficient.
If one feels the need for more ``logic'' than a stream of bytes,
then a different approach might be needed.
But it is also possible that one just cannot imagine a design where
a stream of bytes is sufficient.
By becoming more familiar with the ``Unix style of thinking'',
developers will more often and more easily find simple designs where
a stream of bytes is a sufficient interface.
.PP
The second drawback of a toolchest affects the users.
A toolchest is often more difficult to use for novices.
It is necessary to become familiar with each of the tools,
to be able to use the right one in a given situation.
Additionally, one needs to combine the tools in a sensible way on one's own.
This is like a sharp knife \(en it is a powerful tool in the hands of a master,
but of little value in the hands of an unskilled person.
.PP
However, learning a single, small tool of the toolchest is easier than
learning a complex tool.
The user will have a basic understanding of a yet unknown tool,
if the several tools of the toolchest have a common style.
He will be able to transfer knowledge about one tool to another.
.PP
Moreover, the second drawback can be removed easily by adding wrappers
around the single tools.
Novice users do not need to learn several tools if a professional wraps
the single commands into a more high-level script.
Note that the wrapper script still calls the small tools;
the wrapper script is just like a skin around them.
No complexity is added this way,
but new programs can get created out of existing ones with very low effort.
.PP
A wrapper script for finding the five largest entries in the current directory
could look like this:
.DS
.CW
.ps -1
#!/bin/sh
du -s * | sort -nr | sed 5q
.DE
The script itself is just a text file that calls the command line
a professional user would type in directly.
Making the program flexible on the number of entries it prints
is easily possible:
.DS
.CW
.ps -1
#!/bin/sh
# number of entries to print; defaults to 5
num=5
[ $# -eq 1 ] && num="$1"
# plain block counts (no -h), so that sort -n orders them correctly
du -s * | sort -nr | sed "${num}q"
.DE
This script acts like the one before, when called without an argument.
But one can also specify a numerical argument to define the number of lines to print.

.NH 2
A powerful shell
.LP
It was already said that the Unix shell provides the possibility to
combine small programs into large ones easily.
A powerful shell is a great feature in other ways, too.
.PP
For instance, it includes a scripting language.
The control statements are built into the shell.
The functions, however, are the normal programs everyone can use on the system.
Thus, the programs are known, so learning to program in the shell is easy.
Using normal programs as functions in the shell programming language
is only possible because they are small and combinable tools in a toolchest style.
.PP
The Unix shell encourages writing small scripts out of other programs,
because it is so easy to do.
This is a great step towards automation.
It is wonderful if the effort to automate a task equals the effort
it takes to do it the second time by hand.
If that is the case, then the user will be happy to automate everything he does more than once.
.PP
Small programs that do one job well, standardized interfaces between them,
a mechanism to combine parts to larger parts, and an easy way to automate tasks
\(en this will inevitably produce software leverage.
Getting multiple times the benefit of an investment is a great offer.
.PP
The shell also encourages rapid prototyping.
Many well-known programs started as quickly hacked shell scripts,
and turned into ``real'' programs, written in C, later.
Building a prototype first is a way to avoid the biggest problems
in application development.
Fred Brooks writes in ``No Silver Bullet'':
.[
%A Frederick P. Brooks, Jr.
%T No Silver Bullet: Essence and Accidents of Software Engineering
%B Information Processing 1986, the Proceedings of the IFIP Tenth World Computing Conference
%E H.-J. Kugler
%D 1986
%P 1069\(en1076
%I Elsevier Science B.V.
%C Amsterdam, The Netherlands
.]
.QP
.ps -1
The hardest single part of building a software system is deciding precisely what to build.
No other part of the conceptual work is so difficult as establishing the detailed
technical requirements, [...].
No other part of the work so cripples the resulting system if done wrong.
No other part is more difficult to rectify later.
.PP
Writing a prototype is a great method to become familiar with the requirements
and to actually run into real problems.
Today, prototyping is often seen as a first step in building software.
This is, of course, good.
However, the Unix Philosophy has an \fIadditional\fP perspective on prototyping:
After having built the prototype, one might notice that the prototype is already
\fIgood enough\fP.
Hence, a reimplementation in a more sophisticated programming language
might not be needed for the moment.
Maybe later, it might be necessary to rewrite the software, but not now.
.PP
By delaying further work, one keeps the flexibility to react easily to
changing requirements.
Software parts that are not written will not miss the requirements.

.NH 2
Worse is better
.LP
The Unix Philosophy aims for the 80% solution;
others call it the ``Worse is better'' approach.
.PP
First, practical experience shows that it is almost never possible to define the
requirements completely and correctly the first time.
Hence one should not try to; it will fail anyway.
Second, practical experience shows that requirements change over time.
Hence it is best to delay requirement-based design decisions as long as possible.
Also, the software should be small and flexible as long as possible
to react to changing requirements.
Shell scripts, for example, are more easily adjusted than C programs.
Third, practical experience shows that maintenance is hard work.
Hence, one should keep the amount of software as small as possible;
it should just fulfill the \fIcurrent\fP requirements.
Software parts that will be written later do not need maintenance now.
.PP
Starting with a prototype in a scripting language has several advantages:
.IP \(bu
As the initial effort is low, one will likely start right away.
.IP \(bu
As working parts are available soon, the real requirements can get identified soon.
.IP \(bu
When software is usable, it gets used, and thus tested.
Hence problems will be found at early stages of the development.
.IP \(bu
The prototype might be enough for the moment,
thus further work on the software can be delayed to a time
when one knows better about the requirements and problems
than now.
.IP \(bu
Implementing now only the parts that are actually needed now
requires less maintenance work.
.IP \(bu
If the global situation changes so that the software is not needed anymore,
then less effort was spent on the project than would have been
if a different approach had been used.

.NH 2
Upgrowth and survival of software
.LP
So far, we have talked about \fIwriting\fP or \fIbuilding\fP software.
Although these are just verbs, they do imply a specific view on the work process
they describe.
The better verb, however, is to \fIgrow\fP.
.PP
Creating software in the sense of the Unix Philosophy is an incremental process.
It starts with a first prototype, which evolves as requirements change.
A quickly hacked shell script might become a large, sophisticated,
compiled program this way.
Its lifetime begins with the initial prototype and ends when the software is not used anymore.
While being alive it will get extended, rearranged, rebuilt (from scratch).
Growing software matches the view that ``software is never finished. It is only released.''
.[
%O FIXME
%A Mike Gancarz
%T The UNIX Philosophy
%P 26
.]
.PP
Software can be seen as being controlled by evolutionary processes.
Successful software is software that is used by many for a long time.
This implies that the software is needed, useful, and better than alternatives.
Darwin talks about: ``The survival of the fittest.''
.[
%O FIXME
%A Charles Darwin
.]
Transferred to software: the most successful software is the fittest;
it is the one that survives.
(This may be at the level of one creature, or at the level of one species.)
The fitness of software is affected mainly by four properties:
portability of code, portability of data, range of usability, and reusability of parts.
.\" .IP \(bu
.\" portability of code
.\" .IP \(bu
.\" portability of data
.\" .IP \(bu
.\" range of usability
.\" .IP \(bu
.\" reuseability of parts
.PP
(1)
.I "Portability of code
means using high-level programming languages,
sticking to the standard,
and avoiding optimizations that introduce dependencies on specific hardware.
Hardware has a much shorter lifetime than software.
By chaining software to specific hardware,
the software's lifetime gets shortened to that of this hardware.
In contrast, software should be easy to port \(en
adaptation is the key to success.
.\" cf. practice of prog: ch08
.PP
(2)
.I "Portability of data
is best achieved by avoiding binary representations
to store data, because binary representations differ from machine to machine.
Textual representation is favored.
Historically, ASCII was the charset of choice.
In the future, UTF-8 might be the better choice, however.
What matters is that it is a plain text representation in a
very common charset encoding.
Apart from being able to transfer data between machines,
readable data has the great advantage that humans are able
to directly edit it with text editors and other tools from the Unix toolchest.
.\" gancarz tenet 5
.PP
(3)
A large
.I "range of usability
ensures good adaptation, and thus good survival.
It is a special distinction if software comes to be used in fields of action
its original authors never imagined.
Software that solves problems in a general way will likely be used
for all kinds of similar problems.
Being too specific limits the range of uses.
Requirements change through time, thus use cases change or even vanish.
A good example of this is Allman's sendmail.
Allman identifies flexibility to be one major reason for sendmail's success:
.[
%O FIXME
%A Allman
%T sendmail
.]
.QP
.ps -1
Second, I limited myself to the routing function [...].
This was a departure from the dominant thought of the time, [...].
.QP
.ps -1
Third, the sendmail configuration file was flexible enough to adapt
to a rapidly changing world [...].
.LP
Successful software adapts itself to the changing world.
.PP
(4)
.I "Reuse of parts
goes even one step further.
Software may completely lose its field of action,
but the parts of which the software is built may be general and independent enough
to survive this death.
If software is built by combining small independent programs,
then there are parts readily available for reuse.
Who cares if the large program is a failure,
but parts of it become successful instead?

.NH 2
Summary
.LP
This chapter explained the central ideas of the Unix Philosophy.
For each of the ideas, it was shown which advantages it introduces.
The Unix Philosophy consists of guidelines that help to write valuable software.
meillo@14: From the viewpoint of a software developer or software designer, meillo@14: the Unix Philosophy provides answers to many software design problems. meillo@14: .PP meillo@14: The various ideas of the Unix Philosophy are closely interwoven meillo@14: and can hardly be applied independently. meillo@14: However, probably the most important messages are: meillo@14: .I "``Do one thing well!''" , meillo@14: .I "``Keep it simple!''" , meillo@14: and meillo@14: .I "``Use software leverage!'' meillo@0: meillo@8: meillo@8: meillo@0: .NH 1 meillo@19: Case study: \s-1MH\s0 meillo@18: .LP meillo@30: The previous chapter introduced and explained the Unix Philosophy meillo@18: from a general point of view. meillo@30: The guidelines were the driving force; references to meillo@18: existing software were given only sparsely. meillo@18: In this and the next chapter, concrete software will be meillo@18: the driving force in the discussion. meillo@18: .PP meillo@23: This first case study is about the mail user agents (\s-1MUA\s0) meillo@23: \s-1MH\s0 (``mail handler'') and its descendant \fInmh\fP meillo@23: (``new mail handler''). meillo@23: \s-1MUA\s0s provide functions to read, compose, and organize mail, meillo@23: but (ideally) not to transfer it. meillo@19: In this document, the name \s-1MH\s0 will be used for both of them. meillo@19: A distinction will only be made if differences between meillo@19: them are described. meillo@18: meillo@0: meillo@0: .NH 2 meillo@19: Historical background meillo@0: .LP meillo@19: Electronic mail was available in Unix very early. meillo@30: The first \s-1MUA\s0 on Unix was \f(CWmail\fP, meillo@30: which was already present in the First Edition. meillo@30: .[ meillo@30: %A Peter H. Salus meillo@30: %T A Quarter Century of UNIX meillo@30: %D 1994 meillo@30: %I Addison-Wesley meillo@30: %P 41 f. meillo@30: .]
meillo@30: It was a small program that either printed the user's mailbox file meillo@19: or appended text to someone else's mailbox file, meillo@19: depending on the command line arguments. meillo@19: .[ meillo@19: %O http://cm.bell-labs.com/cm/cs/who/dmr/pdfs/man12.pdf meillo@19: .] meillo@19: It was a program that did one job well. meillo@23: This job was emailing, which was very simple then. meillo@19: .PP meillo@23: Later, emailing became more powerful, and thus more complex. meillo@19: The simple \f(CWmail\fP, which knew nothing of subjects, meillo@19: independent handling of single messages, meillo@19: and long-term storage of them, was not powerful enough anymore. meillo@19: At Berkeley, Kurt Shoens wrote \fIMail\fP (with capital `M') meillo@19: in 1978 to provide additional functions for emailing. meillo@19: Mail was still one program, but now it was large and did meillo@19: several jobs. meillo@23: Its user interface is modeled after that of \fIed\fP. meillo@19: It is designed for humans, but is still scriptable. meillo@23: \fImailx\fP is the adaptation of Berkeley Mail into System V. meillo@19: .[ meillo@19: %A Gunnar Ritter meillo@19: %O http://heirloom.sourceforge.net/mailx_history.html meillo@19: .] meillo@30: Elm, pine, mutt, and a whole bunch of graphical \s-1MUA\s0s meillo@19: followed Mail's direction. meillo@19: They are large, monolithic programs which include all emailing functions. meillo@19: .PP meillo@23: A different way was taken by the people of \s-1RAND\s0 Corporation. meillo@38: In the beginning, they had also used a monolithic mail system, meillo@30: called \s-1MS\s0 (for ``mail system''). meillo@19: But in 1977, Stockton Gaines and Norman Shapiro meillo@19: came up with a proposal for a new email system concept \(en meillo@19: one that honors the Unix Philosophy. meillo@19: The concept was implemented by Bruce Borden in 1978 and 1979. meillo@19: This was the birth of \s-1MH\s0 \(en the ``mail handler''.
meillo@18: .PP meillo@18: Since then, \s-1RAND\s0, the University of California at Irvine and meillo@19: at Berkeley, and several others have contributed to the software. meillo@18: However, its core concepts remained the same. meillo@23: In the late 1990s, when development of \s-1MH\s0 slowed down, meillo@19: Richard Coleman started \fInmh\fP, the new mail handler. meillo@19: His goal was to improve \s-1MH\s0, especially with regard to meillo@23: the requirements of modern emailing. meillo@19: Today, nmh is developed by various people on the Internet. meillo@18: .[ meillo@18: %T RAND and the Information Evolution: A History in Essays and Vignettes meillo@18: %A Willis H. Ware meillo@18: %D 2008 meillo@18: %I The RAND Corporation meillo@18: %P 128\(en137 meillo@18: %O .CW \s-1http://www.rand.org/pubs/corporate_pubs/CP537/ meillo@18: .] meillo@18: .[ meillo@18: %T MH & xmh: Email for Users & Programmers meillo@18: %A Jerry Peek meillo@18: %D 1991, 1992, 1995 meillo@18: %I O'Reilly & Associates, Inc. meillo@18: %P Appendix B meillo@18: %O Also available online: \f(CW\s-2http://rand-mh.sourceforge.net/book/\fP meillo@18: .] meillo@0: meillo@0: .NH 2 meillo@20: Contrasts to monolithic mail systems meillo@0: .LP meillo@19: All \s-1MUA\s0s are monolithic, except \s-1MH\s0. meillo@38: Although there might actually exist further, very little known, meillo@30: toolchest \s-1MUA\s0s, this statement reflects the situation pretty well. meillo@19: .PP meillo@30: Monolithic \s-1MUA\s0s gather all their functions in one program. meillo@30: In contrast, \s-1MH\s0 is a toolchest of many small tools \(en one for each job. meillo@23: Following is a list of important programs of \s-1MH\s0's toolchest meillo@30: and their function. meillo@30: It gives a feeling of what the toolchest looks like.
meillo@19: .IP \(bu meillo@19: .CW inc : meillo@30: incorporate new mail (this is how mail enters the system) meillo@19: .IP \(bu meillo@19: .CW scan : meillo@19: list messages in folder meillo@19: .IP \(bu meillo@19: .CW show : meillo@19: show message meillo@19: .IP \(bu meillo@19: .CW next\fR/\fPprev : meillo@19: show next/previous message meillo@19: .IP \(bu meillo@19: .CW folder : meillo@19: change current folder meillo@19: .IP \(bu meillo@19: .CW refile : meillo@19: refile message into folder meillo@19: .IP \(bu meillo@19: .CW rmm : meillo@19: remove message meillo@19: .IP \(bu meillo@19: .CW comp : meillo@19: compose a new message meillo@19: .IP \(bu meillo@19: .CW repl : meillo@19: reply to a message meillo@19: .IP \(bu meillo@19: .CW forw : meillo@19: forward a message meillo@19: .IP \(bu meillo@19: .CW send : meillo@30: send a prepared message (this is how mail leaves the system) meillo@0: .LP meillo@19: \s-1MH\s0 has no special user interface like monolithic \s-1MUA\s0s have. meillo@19: The user does not leave the shell to run \s-1MH\s0, meillo@30: but he uses the various \s-1MH\s0 programs within the shell. meillo@23: Using a monolithic program with a captive user interface meillo@23: means ``entering'' the program, using it, and ``exiting'' the program. meillo@23: Using toolchests like \s-1MH\s0 means running programs, meillo@38: alone or in combination with others, even from other toolchests, meillo@23: without leaving the shell. meillo@30: meillo@30: .NH 2 meillo@30: Data storage meillo@30: .LP meillo@34: \s-1MH\s0's mail storage is a directory tree under the user's meillo@34: \s-1MH\s0 directory (usually \f(CW$HOME/Mail\fP), meillo@34: where mail folders are directories and mail messages are text files meillo@34: within them. meillo@34: Each mail folder contains a file \f(CW.mh_sequences\fP which lists meillo@34: the public message sequences of that folder, for instance new messages. meillo@34: Mail messages are text files located in a mail folder. 
meillo@34: The files contain the messages as they were received. meillo@34: They are numbered in ascending order in each folder. meillo@19: .PP meillo@30: This mailbox format is called ``\s-1MH\s0'' after the \s-1MUA\s0. meillo@30: Alternatives are \fImbox\fP and \fImaildir\fP. meillo@30: In the mbox format all messages are stored within one file. meillo@30: This was a good solution in the early days, when messages meillo@30: were only a few lines of text and were deleted soon. meillo@30: Today, when single messages often include several megabytes meillo@30: of attachments, it is a bad solution. meillo@30: Another disadvantage of the mbox format is that it is meillo@30: more difficult to write tools that work on mail messages, meillo@30: because it is always necessary to first find and extract meillo@30: the relevant message in the mbox file. meillo@30: With the \s-1MH\s0 mailbox format, meillo@30: each message is a self-contained item, by definition. meillo@30: Also, the problem of concurrent access to one mailbox is meillo@30: reduced to the problem of concurrent access to one message. meillo@30: Maildir is generally similar to \s-1MH\s0's format, meillo@30: but modified towards guaranteed reliability. meillo@30: This involves some complexity, unfortunately. meillo@34: .PP meillo@34: Working with \s-1MH\s0's toolchest on mailboxes is much like meillo@34: working with Unix' toolchest on directory trees: meillo@34: \f(CWscan\fP is like \f(CWls\fP, meillo@34: \f(CWshow\fP is like \f(CWcat\fP, meillo@34: \f(CWfolder\fP is like \f(CWcd\fP and \f(CWpwd\fP, meillo@34: \f(CWrefile\fP is like \f(CWmv\fP, meillo@34: and \f(CWrmm\fP is like \f(CWrm\fP. meillo@34: .PP meillo@34: The context of tools in Unix consists mainly of the current working directory, meillo@34: the user identification, and the environment variables.
meillo@34: \s-1MH\s0 extends this context by two more items: meillo@34: .IP \(bu meillo@34: The current mail folder, which is similar to the current working directory. meillo@34: For mail folders, \f(CWfolder\fP provides the corresponding functionality meillo@34: of \f(CWcd\fP and \f(CWpwd\fP for directories. meillo@34: .IP \(bu meillo@34: Sequences, which are named sets of messages in a mail folder. meillo@34: The current message, relative to a mail folder, is a special sequence. meillo@34: It enables commands like \f(CWnext\fP and \f(CWprev\fP. meillo@34: .LP meillo@34: In contrast to Unix' context, which is chained to the shell session, meillo@34: \s-1MH\s0's context is independent. meillo@34: Usually there is one context for each user, but a user can have many meillo@34: contexts. meillo@34: Public sequences are an exception, as they belong to the mail folder. meillo@34: .[ meillo@34: %O mh-profile(5) and mh-sequence(5) meillo@34: .] meillo@20: meillo@0: .NH 2 meillo@20: Discussion of the design meillo@0: .LP meillo@20: The following paragraphs discuss \s-1MH\s0 in regard to the tenets meillo@23: of the Unix Philosophy which Gancarz identified. meillo@20: meillo@20: .PP meillo@33: .B "Small is beautiful meillo@20: and meillo@33: .B "do one thing well meillo@20: are two design goals that are directly visible in \s-1MH\s0. meillo@20: Gancarz actually presents \s-1MH\s0 as example under the headline meillo@20: ``Making UNIX Do One Thing Well'': meillo@41: .[ meillo@41: %A Mike Gancarz meillo@41: %T unix-phil meillo@41: %P 125 meillo@41: .] meillo@20: .QP meillo@41: .ps -1 meillo@20: [\s-1MH\s0] consists of a series of programs which meillo@20: when combined give the user an enormous ability meillo@20: to manipulate electronic mail messages. meillo@20: A complex application, it shows that not only is it meillo@20: possible to build large applications from smaller meillo@20: components, but also that such designs are actually preferable. 
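.PP
To give a feeling for how small a single-purpose tool over this storage format can be, the following is a minimal sketch of a \f(CWscan\fP-like lister in POSIX shell.
It is not nmh's \f(CWscan\fP: the name \f(CWtoy_scan\fP and the output layout are made up for illustration, and the sketch only assumes what the data storage section describes, namely that a folder is a directory whose messages are text files named by number.

```shell
#!/bin/sh
# toy_scan: hypothetical illustration, NOT nmh's scan.
# Lists message number and subject for an MH-style folder, assuming
# the folder is a directory of text files named by message number.
toy_scan() {
    folder=$1
    # Only purely numeric file names are treated as messages.
    ls "$folder" | grep '^[0-9][0-9]*$' | sort -n | while read -r msg; do
        # Print the Subject header; stop at the empty line that ends the header.
        subject=$(sed -n -e '/^$/q' -e 's/^[Ss]ubject: *//p' "$folder/$msg" | head -n 1)
        printf '%4s  %s\n' "$msg" "$subject"
    done
}

# When called with a folder path, list that folder.
if [ $# -gt 0 ]; then
    toy_scan "$1"
fi
```

.LP
Run on a folder directory below the user's \s-1MH\s0 directory, the sketch prints one line per message.
The real tools add format files, sequences, and the current-folder context on top of this basic idea.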
meillo@20: .LP meillo@20: The various small programs of \s-1MH\s0 were relatively easy meillo@23: to write, because each of them is small, limited to one function, meillo@23: and has clear boundaries. meillo@20: For the same reasons, they are also easy to maintain. meillo@20: Furthermore, the system can easily be extended. meillo@20: One only needs to put a new program into the toolchest. meillo@23: This was done, for instance, when \s-1MIME\s0 support was added meillo@20: (e.g. \f(CWmhbuild\fP). meillo@20: Also, different programs can exist to do basically the same job meillo@20: in different ways (e.g. in nmh: \f(CWshow\fP and \f(CWmhshow\fP). meillo@20: If someone needs a mail system with some additional meillo@23: functions that are available nowhere yet, he is best off with a meillo@20: toolchest system like \s-1MH\s0 where he can add the meillo@20: functionality with little work. meillo@20: meillo@20: .PP meillo@34: .B "Store data in flat text files meillo@34: is followed by \s-1MH\s0. meillo@34: This is not surprising, because email messages are already plain text. meillo@34: \s-1MH\s0 stores the messages as it receives them, meillo@34: thus any other tool that works on RFC 2822 mail messages can operate meillo@34: on the messages in an \s-1MH\s0 mailbox. meillo@34: All other files \s-1MH\s0 uses are plain text too. meillo@34: It is therefore possible and encouraged to use the text processing meillo@34: tools of Unix' toolchest to extend \s-1MH\s0's toolchest. meillo@20: meillo@20: .PP meillo@33: .B "Avoid captive user interfaces" . meillo@19: \s-1MH\s0 is perfectly suited for non-interactive use. meillo@19: It offers all functions directly and without captive user interfaces. meillo@30: If, nonetheless, users want a graphical user interface, meillo@20: they can have it with \fIxmh\fP or \fIexmh\fP, too. meillo@19: These are graphical frontends for the \s-1MH\s0 toolchest.
meillo@19: This means that all email-related work is still done by \s-1MH\s0 tools, meillo@20: but the frontend issues the appropriate calls when the user meillo@30: clicks on buttons. meillo@20: Providing easy-to-use user interfaces in the form of frontends is a good meillo@19: approach, because it does not limit the power of the backend itself. meillo@20: In any case, the frontend will only be able to make a subset of the meillo@23: backend's power and flexibility available to the user. meillo@20: But if it is a separate program, meillo@20: then the missing parts can still be accessed at the backend directly. meillo@19: If it is integrated, then this will hardly be possible. meillo@30: Furthermore, it is possible to have different frontends to the same meillo@30: backend. meillo@19: meillo@19: .PP meillo@33: .B "Choose portability over efficiency meillo@20: and meillo@33: .B "use shell scripts to increase leverage and portability" . meillo@20: These two tenets are indirectly, but nicely, demonstrated by meillo@30: Bolsky and Korn in their book about the Korn Shell. meillo@20: .[ meillo@20: %T The KornShell: command and programming language meillo@20: %A Morris I. Bolsky meillo@20: %A David G. Korn meillo@20: %I Prentice Hall meillo@20: %D 1989 meillo@30: %P 254\(en290 meillo@20: %O \s-1ISBN\s0: 0-13-516972-0 meillo@20: .] meillo@30: They demonstrated, in chapter 18 of the book, a basic implementation meillo@20: of a subset of \s-1MH\s0 in ksh scripts. meillo@20: Of course, this was just a demonstration, but a brilliant one. meillo@20: It shows how quickly one can implement such a prototype with shell scripts, meillo@20: and how readable they are. meillo@20: The implementation in the scripting language may not be very fast, meillo@20: but it can be fast enough, and this is all that matters. meillo@20: By having the code in an interpreted language, like the shell, meillo@20: portability becomes a minor issue, if we assume the interpreter meillo@20: to be widespread.
meillo@20: This demonstration also shows how easy it is to create the individual programs meillo@20: of a toolchest. meillo@30: There are eight tools (two of them have multiple names) and 16 functions meillo@30: with supporting code. meillo@30: Each tool comprises between 12 and 38 lines of ksh, meillo@30: in total about 200 lines. meillo@30: The functions comprise between 3 and 78 lines of ksh, meillo@30: in total about 450 lines. meillo@20: Such small software is easy to write, easy to understand, meillo@20: and thus easy to maintain. meillo@23: A toolchest makes it possible to write only some parts meillo@20: and still get a working result. meillo@20: Expanding the toolchest without global changes will likely be meillo@20: possible, too. meillo@20: meillo@20: .PP meillo@33: .B "Use software leverage to your advantage meillo@20: and the lesser tenet meillo@33: .B "allow the user to tailor the environment meillo@20: are ideally followed in the design of \s-1MH\s0. meillo@21: Tailoring the environment is heavily encouraged by the ability to meillo@30: directly define default options to programs. meillo@30: It is even possible to define different default options meillo@21: depending on the name under which the program was called. meillo@21: Software leverage is heavily encouraged by how easy it is to meillo@21: create shell scripts that run a specific command line, meillo@30: built of several \s-1MH\s0 programs. meillo@21: There is little software that wants users to tailor their meillo@21: environment and to leverage the use of the software as much as \s-1MH\s0 does. meillo@21: Just to give one example: meillo@23: One might prefer a different listing format for the \f(CWscan\fP meillo@21: program. meillo@30: It is possible to take one of the distributed format files meillo@21: or to write one yourself.
meillo@21: To use the format as default for \f(CWscan\fP, a single line, meillo@21: reading meillo@21: .DS meillo@21: .CW meillo@21: scan: -form FORMATFILE meillo@21: .DE meillo@21: must be added to \f(CW.mh_profile\fP. meillo@21: If one wants this different format as an additional command, meillo@23: instead of changing the default, he needs to create a link to meillo@23: \f(CWscan\fP, for instance titled \f(CWscan2\fP. meillo@21: The line in \f(CW.mh_profile\fP would then start with \f(CWscan2\fP, meillo@30: as the option should only be in effect when scan is called as meillo@21: \f(CWscan2\fP. meillo@20: meillo@20: .PP meillo@33: .B "Make every program a filter meillo@21: is hard to find in \s-1MH\s0. meillo@21: The reason for this is that most of \s-1MH\s0's tools provide meillo@21: basic file system operations for the mailboxes. meillo@30: It is the same reason why meillo@21: \f(CWls\fP, \f(CWcp\fP, \f(CWmv\fP, and \f(CWrm\fP meillo@21: aren't filters either. meillo@23: However, they form a basis on which filters can operate. meillo@23: \s-1MH\s0 does not provide many filters itself, but it is a basis meillo@23: to write filters for. meillo@30: An example would be a mail message text highlighter, meillo@30: that is, a program that makes use of a color terminal to display meillo@30: header lines, quotations, and signatures in distinct colors. meillo@30: The author's version of this program, for instance, meillo@30: is a 25-line awk script. meillo@21: meillo@21: .PP meillo@33: .B "Build a prototype as soon as possible meillo@21: was again well followed by \s-1MH\s0. meillo@21: This tenet, of course, focuses on early development, which was meillo@21: a long time ago for \s-1MH\s0. meillo@21: But without following this guideline at the very beginning, meillo@23: Bruce Borden might not have convinced the management of \s-1RAND\s0 meillo@23: to ever create \s-1MH\s0. meillo@23: In Bruce's own words: meillo@41: .[ meillo@41: %O FIXME meillo@41: .]
meillo@21: .QP meillo@41: .ps -1 meillo@30: [...] but they [Stockton Gaines and Norm Shapiro] were not able meillo@23: to convince anyone that such a system would be fast enough to be usable. meillo@21: I proposed a very short project to prove the basic concepts, meillo@21: and my management agreed. meillo@21: Looking back, I realize that I had been very lucky with my first design. meillo@21: Without nearly enough design work, meillo@21: I built a working environment and some header files meillo@21: with key structures and wrote the first few \s-1MH\s0 commands: meillo@21: inc, show/next/prev, and comp. meillo@21: [...] meillo@21: With these three, I was able to convince people that the structure was viable. meillo@21: This took about three weeks. meillo@0: meillo@0: .NH 2 meillo@0: Problems meillo@0: .LP meillo@22: \s-1MH\s0 is certainly not without problems. meillo@30: There are two main problems: one is technical, the other is about human behavior. meillo@22: .PP meillo@22: \s-1MH\s0 is old and email today is very different from email at the time meillo@22: when \s-1MH\s0 was designed. meillo@22: \s-1MH\s0 adapted to the changes pretty well, but it is limited, meillo@22: for example in development resources. meillo@22: \s-1MIME\s0 support and support for different character encodings meillo@22: are available, but only on a moderate level. meillo@22: More active developers could quickly improve this. meillo@22: It is also limited by design, which is the larger problem. meillo@22: \s-1IMAP\s0, for example, conflicts with \s-1MH\s0's design to a large extent. meillo@22: These design conflicts are not easily solvable. meillo@22: Possibly, they require a redesign. meillo@30: Maybe \s-1IMAP\s0 is too different from the classic mail model which \s-1MH\s0 covers, meillo@30: hence \s-1MH\s0 may never work well with \s-1IMAP\s0. meillo@22: .PP meillo@22: The other kind of problem is human habits.
meillo@22: When in this world almost all \s-1MUA\s0s are monolithic, meillo@22: it is very difficult to convince people to use a toolbox style \s-1MUA\s0 meillo@22: like \s-1MH\s0. meillo@22: The habits are so strong that even people who understand the concept meillo@30: and advantages of \s-1MH\s0 do not like to switch, meillo@30: simply because \s-1MH\s0 is different. meillo@30: Unfortunately, the frontends to \s-1MH\s0, which could provide a familiar look and feel, meillo@30: are quite outdated and thus not very appealing compared to the modern interfaces meillo@30: which monolithic \s-1MUA\s0s offer. meillo@20: meillo@20: .NH 2 meillo@20: Summary \s-1MH\s0 meillo@20: .LP meillo@31: \s-1MH\s0 is an \s-1MUA\s0 that follows the Unix Philosophy in its design meillo@31: and implementation. meillo@31: It consists of a toolchest of small tools, each of which does one job well. meillo@31: The tools are orthogonal to each other, to a large extent. meillo@31: However, for historical reasons, there also exist distinct tools meillo@31: that cover the same task. meillo@31: .PP meillo@31: The toolchest approach offers great flexibility to the user. meillo@31: He can use the complete power of the Unix shell with \s-1MH\s0. meillo@31: This makes \s-1MH\s0 a very powerful mail system. meillo@31: Extending and customizing \s-1MH\s0 is easy and encouraged, too. meillo@31: .PP meillo@31: Apart from the user's perspective, \s-1MH\s0 is development-friendly. meillo@31: Its overall design follows clear rules. meillo@31: The single tools do only one job, thus they are easy to understand, meillo@31: easy to write, and easy to maintain. meillo@31: They are all independent and do not interfere with the others. meillo@31: Automated testing of their function is a straightforward task. meillo@31: .PP meillo@31: It is sad that \s-1MH\s0's differentness is its largest problem, meillo@31: as its differentness is also its largest advantage.
meillo@31: Unfortunately, for most people their habits are stronger meillo@31: than the attraction of the clear design and the power \s-1MH\s0 offers. meillo@0: meillo@8: meillo@8: meillo@0: .NH 1 meillo@0: Case study: uzbl meillo@32: .LP meillo@32: The last chapter took a look at the \s-1MUA\s0 \s-1MH\s0; meillo@32: this chapter is about uzbl, a web browser that adheres to the Unix Philosophy. meillo@32: ``uzbl'' is the \fIlolcat\fP's word for the English adjective ``usable''. meillo@32: It is pronounced identically. meillo@0: meillo@0: .NH 2 meillo@32: Historical background meillo@0: .LP meillo@32: Uzbl was started by Dieter Plaetinck in April 2009. meillo@32: The idea was born in a thread in the Arch Linux forum. meillo@32: .[ meillo@32: %O http://bbs.archlinux.org/viewtopic.php?id=67463 meillo@32: .] meillo@32: After some discussion about failures of well known web browsers, meillo@32: Plaetinck (alias Dieter@be) came up with a very sketchy proposal meillo@32: of what a better web browser could look like. meillo@32: When another member asked whether Plaetinck would write that program, meillo@32: because it sounded fantastic, Plaetinck replied: meillo@32: ``Maybe, if I find the time ;-)''. meillo@32: .PP meillo@32: Fortunately, he found the time. meillo@32: One day later, the first prototype was out. meillo@32: One week later, uzbl had its own website. meillo@32: One month after the first code showed up, meillo@32: a mailing list was installed to coordinate and discuss further development. meillo@32: A wiki was set up to store documentation and scripts that showed up on the meillo@32: mailing list and elsewhere. meillo@32: .PP meillo@38: In the one year of uzbl's existence so far, it was heavily developed in various branches. meillo@32: Plaetinck's task became more and more to only merge the best code from the meillo@32: different branches into his main branch, and to apply patches. meillo@32: About once a month, Plaetinck released a new version.
meillo@32: In September 2009, he presented several forks of uzbl. meillo@38: Uzbl, actually, opened the field for a whole family of web browsers with similar shape. meillo@32: .PP meillo@32: In July 2009, \fILinux Weekly News\fP published an interview with Plaetinck about uzbl. meillo@32: In September 2009, the uzbl web browser was on \fISlashdot\fP. meillo@0: meillo@0: .NH 2 meillo@32: Contrasts to other web browsers meillo@0: .LP meillo@32: Just as most \s-1MUA\s0s are monolithic, but \s-1MH\s0 is a toolchest, meillo@32: most web browsers are monolithic, but uzbl is a frontend to a toolchest. meillo@32: .PP meillo@32: Today, uzbl is divided into uzbl-core and uzbl-browser. meillo@32: Uzbl-core is, as its name already indicates, the core of uzbl. meillo@32: It handles commands and events to interface with other programs, meillo@32: and also displays webpages by using webkit as render engine. meillo@32: Uzbl-browser combines uzbl-core with a bunch of handler scripts, a status bar, meillo@32: an event manager, yanking, pasting, page searching, zooming, and more, meillo@32: to form a ``complete'' web browser. meillo@32: In the following text, the term ``uzbl'' usually stands for uzbl-browser, meillo@32: so uzbl-core is included. meillo@32: .PP meillo@32: Unlike most other web browsers, uzbl is mainly the mediator between the meillo@32: various tools that cover single jobs of web browsing. meillo@35: For this purpose, uzbl listens for commands on a named pipe (fifo), a Unix socket, meillo@35: and on stdin, and it writes events to a Unix socket and to stdout. meillo@35: The graphical rendering of the webpage is done by webkit, a web content engine. meillo@35: Uzbl-core is built around this library.
meillo@35: Loading a webpage in a running uzbl instance requires only: meillo@32: .DS meillo@32: .CW meillo@32: echo 'uri http://example.org' >/path/to/uzbl-fifo meillo@32: .DE meillo@32: .PP meillo@32: Downloads, browsing history, bookmarks, and the like are not provided meillo@32: by uzbl-core itself, as they are in other web browsers. meillo@35: Uzbl-browser, too, only provides so-called handler scripts that wrap meillo@35: external applications which provide the actual functionality. meillo@32: For instance, \fIwget\fP is used to download files and uzbl-browser meillo@32: includes a script that calls wget with appropriate options in meillo@32: a prepared environment. meillo@32: .PP meillo@32: Modern web browsers are proud to have addons, plugins, and modules, instead. meillo@32: This is their effort to achieve similar goals. meillo@35: But instead of using existing, external programs, modern web browsers meillo@35: include these functions, although they might be loaded at runtime. meillo@0: meillo@0: .NH 2 meillo@32: Discussion of the design meillo@0: .LP meillo@32: This section discusses uzbl with regard to the Unix Philosophy, meillo@32: as identified by Gancarz. meillo@32: meillo@32: .PP meillo@35: .B "Make each program do one thing well" . meillo@35: Uzbl tries to be a web browser and nothing else. meillo@36: The common definition of a web browser is, of course, highly influenced by meillo@36: existing implementations of web browsers, although these are degenerate. meillo@35: Web browsers should be programs to browse the web, and nothing more. meillo@35: This is the one thing they should do, as demanded by the Unix Philosophy. meillo@36: .PP meillo@36: Web browsers should, for instance, not manage downloads. meillo@35: This is the job download managers exist for. meillo@35: Download managers primarily care about being good at downloading files. meillo@35: Modern web browsers provide download management only as a secondary feature.
meillo@35: How could they perform this job better than programs that exist only for meillo@35: this very job? meillo@35: And how could anyone want less than the best download manager available? meillo@32: .PP meillo@35: A web browser's job is to let the user browse the web. meillo@35: This means navigating through websites by following links. meillo@36: Rendering the \s-1HTML\s0 sources is a different job, too. meillo@36: It is covered by the webkit render engine, in uzbl's case. meillo@35: Audio and video content and files like PostScript, \s-1PDF\s0, and the like, meillo@36: are also not the job of a web browser. meillo@36: They should be handled by external applications \(en meillo@36: ones whose job is to handle such data. meillo@35: Uzbl strives to do it this way. meillo@36: .PP meillo@36: Remember Doug McIlroy: meillo@35: .I meillo@35: ``Write programs that do one thing and do it well. meillo@35: Write programs to work together.'' meillo@35: .R meillo@35: .PP meillo@35: The lesser tenet meillo@35: .B "allow the user to tailor the environment meillo@35: fits well here. meillo@35: The question was how anyone could want anything less than the meillo@35: best program for the job. meillo@36: But as personal preferences matter much, meillo@36: it is probably more important to ask: meillo@35: How could anyone want something other than his preferred program for the job? meillo@36: .PP meillo@35: Usually users want one program for one job. meillo@35: Hence, whenever the task is, for instance, downloading, meillo@36: exactly one download manager should be used. meillo@35: More advanced users might want to have this download manager in this meillo@35: situation and that one in that situation. meillo@35: They should be able to configure it this way. meillo@35: With uzbl, one can use any download manager the user wants. meillo@36: To switch to a different one, only one line in a small handler script meillo@35: needs to be changed.
meillo@36: Alternatively, it would be possible to query an entry in a global file meillo@36: or an environment variable, which specifies the download manager to use, meillo@35: in the handler script. meillo@36: .PP meillo@35: As uzbl neither has its own download manager nor depends on a meillo@35: specific one, uzbl's browsing abilities will not be lowered by having meillo@35: a bad download manager. meillo@36: Uzbl's download capabilities will be just as good as the ones of the best meillo@36: download manager available on the system. meillo@38: Of course, this applies to all of the other supplementary tools, too. meillo@32: meillo@32: .PP meillo@36: .B "Use software leverage to your advantage" . meillo@36: Shell scripts are a good choice to extend uzbl. meillo@36: Uzbl is designed to be extended by external tools. meillo@36: These external tools are usually wrapped by small handler shell scripts. meillo@36: Shell scripts are the glue in this approach. meillo@36: They make the various parts fit together. meillo@36: .PP meillo@36: As an example, the history mechanism of uzbl shall be presented. meillo@36: Uzbl is configured to spawn a script to append an entry to the history meillo@36: whenever the event of a fully loaded page occurs. meillo@36: The script to append the entry to the history is not much more than: meillo@36: .DS meillo@36: .CW meillo@36: #!/bin/sh meillo@36: file=/path/to/uzbl-history meillo@36: echo "$(date +'%Y-%m-%d %H:%M:%S') $6 $7" >> "$file" meillo@36: .DE meillo@36: \f(CW$6\fP and \f(CW$7\fP expand to the \s-1URL\s0 and the page title. meillo@36: For loading an entry, a key is bound to spawn a load from history script. meillo@36: The script reverses the history to have newer entries first, meillo@36: then displays \fIdmenu\fP to select an item, meillo@36: and afterwards writes the selected \s-1URL\s0 into uzbl's command input pipe.
With error checking and corner cases removed, the script looks like this:
.DS
.CW
#!/bin/sh
file=/path/to/uzbl-history
goto=`tac $file | dmenu | cut -d' ' -f 3`
echo "uri $goto" > $4
.DE
\f(CW$4\fP expands to the path of the command input pipe of the current
uzbl instance.

.PP
.B "Avoid captive user interfaces" .
One could say that uzbl, to a large extent, actually \fIis\fP
a captive user interface.
But the difference from most other web browsers is that uzbl is only
the captive user interface frontend and the core of the backend.
Many parts of the backend are independent of uzbl.
Some are distributed with uzbl; for some external programs, handler scripts
are distributed; and arbitrary additional functionality can be added if desired.
.PP
The frontend is captive \(en that is true.
This is okay for the task of browsing the web, as this task is only relevant
for humans.
Automated programs would \fIcrawl\fP the web.
That means they read the source directly.
The source includes all the semantics.
The graphical representation is just for humans, to convey the semantics
more intuitively.

.PP
.B "Make every program a filter" .
Graphical web browsers are almost dead ends in the chain of information flow.
Thus it is difficult to see what graphical web browsers should filter.
Graphical web browsers exist almost only for interactive use by humans.
The only case in which one might want to automate the rendering function is
to generate images of rendered webpages.
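.PP
The browser itself may be close to a dead end, but the plain-text history
file written by the handler script is not: ordinary filters can process it,
as the load-from-history script already does with \fItac\fP and \fIcut\fP.
The following is only a sketch under stated assumptions, not something
shipped with uzbl.
The function name \f(CWtop_urls\fP is made up for illustration,
and the code assumes the line format produced by the append script
(date, time, \s-1URL\s0, and title, separated by single spaces).

```shell
#!/bin/sh
# Hypothetical helper, not part of uzbl.
# Assumes history lines of the form: DATE TIME URL TITLE...
# Usage: top_urls HISTORY_FILE [N]
# Prints the N (default 10) most frequently visited URLs,
# each preceded by its visit count.
top_urls() {
    cut -d' ' -f 3 "$1" |      # keep only the URL field
        sort |                 # group identical URLs
        uniq -c |              # count each group
        sort -rn |             # most visited first
        head -n "${2:-10}"     # limit the output
}
```

.PP
Called as \f(CWtop_urls /path/to/uzbl-history 5\fP, the function would list
the five most visited addresses.
Every stage is a standard filter; none of them knows anything about uzbl.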

.PP
.B "Small is beautiful"
is not easy to apply to a web browser, primarily because modern web technology
is very complex; hence the rendering task is very complex.
Modern web browsers will always consist of many thousands of lines of code,
unfortunately.
Using the toolchest approach and wrappers, the browser can be split into
several small parts, though.
.PP
Uzbl-core consists of about 3\,500 lines of C code.
The distribution includes another 3\,500 lines of Shell and Python code,
which are the handler scripts and plugins like a modal interface.
Furthermore, uzbl uses functionality of external tools like
\fIwget\fP and \fInetcat\fP.
Up to this point, uzbl looks pretty neat and small.
The ugly part of uzbl is the web content renderer, webkit.
Webkit consists of roughly 400\,000 (!) lines of code.
Unfortunately, small web render engines are not possible anymore
because of the modern web.
The problems section will explain this in more detail.

.PP
.B "Build a prototype as soon as possible" .
Plaetinck made his code public right from the beginning.
Discussion and development were, and still are, open to everyone interested.
Development versions of uzbl can be obtained very simply from the code
repository.
Within the first year of uzbl's existence, new versions were released
more often than once a month.
Different forks and branches arose.
They introduced new features, which were tested for suitability.
The experience of using prototypes influenced further development.
Actually, all development was community driven.
Plaetinck says, three months after uzbl's birth:
``Right now I hardly code anything myself for Uzbl.
I just merge in other people's code, ponder a lot, and lead the discussions.''
.[
%A FIXME
%O http://lwn.net/Articles/341245/
.]


.NH 2
Problems
.LP
Like \s-1MH\s0, uzbl suffers from being different.
It is sad, but people use what they know.
Fortunately, uzbl's user interface can look and feel very much the
same as that of the well-known web browsers,
hiding the internal differences.
But uzbl has to provide this similar look and feel to be accepted
as a ``normal'' browser by ``normal'' users.
.PP
The more important problem is the modern web.
The modern web is simply broken.
It has state in a stateless protocol,
it misuses technologies,
and it is hopelessly overloaded.
The result is web content render engines that must consist
of hundreds of thousands of lines of code.
They also must combine and integrate many different technologies,
only to make our modern web usable.
Website-to-image converters can hardly be run without
human interaction because of state in sessions, impossible
deep-linking, and technologies that cannot be automated.
.PP
The web was misused to fulfill all kinds of imaginable wishes.
Now web browsers, and ultimately the users, suffer from it.


.NH 2
Summary uzbl
.LP
``Uzbl is a browser that adheres to the Unix Philosophy'',
that is how uzbl is seen by its authors.
Indeed, uzbl follows the Unix Philosophy in many ways.
It consists of independent parts that work together,
and its core is mainly a mediator that glues the parts together.
.PP
Software leverage is clearly visible in uzbl.
It makes use of external tools, separates independent tasks
into independent parts, and glues them together with small
handler scripts, around uzbl-core.
.PP
As uzbl, more or less, consists of a set of tools and a bit
of glue, anyone can put the parts together and expand it
in any desired way.
Uzbl is very flexible and customizable.
These properties make it valuable for advanced users,
but may keep novice users from using it.
.PP
Uzbl's main problem is the modern web, which makes it hard
to design a sane web browser.
Despite this bad situation, uzbl does a fairly good job.


.NH 1
Final thoughts

.NH 2
Quick summary
.LP
good design
.LP
unix phil
.LP
case studies

.NH 2
Why people should choose
.LP
Make the right choice!

.nr PI .5i
.rm ]<
.de ]<
.LP
.de FP
.IP \\\\$1.
\\..
.rm FS FE
..
.SH
References
.[
$LIST$
.]
.wh -1p