meillo@49: .so style meillo@42: meillo@0: .TL meillo@42: .ps +4 meillo@6: Why the Unix Philosophy still matters meillo@0: .AU meillo@0: markus schnalke meillo@0: .AB meillo@1: .ti \n(.iu meillo@39: This paper explains the importance of the Unix Philosophy for software design. meillo@0: Today, few software designers are aware of these concepts, meillo@39: and thus a lot of modern software is more limited than necessary meillo@39: and makes less use of software leverage than possible. meillo@38: Knowing and following the guidelines of the Unix Philosophy makes software more valuable. meillo@0: .AE meillo@0: meillo@2: .FS meillo@2: .ps -1 meillo@39: This paper was prepared for the ``Software Analysis'' seminar at Ulm University. meillo@47: The mentor was Professor Franz Schweiggert. meillo@55: Handed in on 2010-04-16. meillo@39: You may retrieve this document from meillo@39: .CW \s-1http://marmaro.de/docs \ . meillo@2: .FE meillo@2: meillo@48: .H 1 Introduction meillo@0: .LP meillo@40: The Unix Philosophy is the essence of how the Unix operating system, meillo@40: especially its toolchest, was designed. meillo@57: It is not a limited set of fixed rules, meillo@40: but a loose set of guidelines which tell how to write software that meillo@57: suits Unix well. meillo@57: In fact, the Unix Philosophy describes what is common in typical Unix software. meillo@40: Wikipedia has an accurate definition: meillo@40: .[ meillo@44: wikipedia meillo@44: unix philosophy meillo@40: .] meillo@40: .QP meillo@40: The \fIUnix philosophy\fP is a set of cultural norms and philosophical meillo@40: approaches to developing software based on the experience of leading meillo@40: developers of the Unix operating system. meillo@1: .PP meillo@40: As there is no single definition of the Unix Philosophy, meillo@40: several people have stated their view on what it comprises.
meillo@1: Best known are: meillo@1: .IP \(bu meillo@1: Doug McIlroy's summary: ``Write programs that do one thing and do it well.'' meillo@1: .[ meillo@44: mahoney meillo@44: oral history meillo@1: .] meillo@1: .IP \(bu meillo@1: Mike Gancarz' book ``The UNIX Philosophy''. meillo@1: .[ meillo@44: gancarz meillo@44: unix philosophy meillo@1: .] meillo@1: .IP \(bu meillo@1: Eric S. Raymond's book ``The Art of UNIX Programming''. meillo@1: .[ meillo@44: raymond meillo@44: art of unix programming meillo@1: .] meillo@0: .LP meillo@1: These different views on the Unix Philosophy have much in common. meillo@40: In particular, the main concepts are similar in all of them. meillo@40: McIlroy's definition can surely be called the core of the Unix Philosophy, meillo@57: but the fundamental idea behind it all is ``small is beautiful''. meillo@40: meillo@40: .PP meillo@45: The Unix Philosophy explains how to design good software for Unix. meillo@57: Many concepts described here are based on Unix facilities. meillo@40: Other operating systems may not offer such facilities, meillo@57: hence it may not be possible to design software for such systems meillo@57: according to the Unix Philosophy. meillo@40: .PP meillo@57: The Unix Philosophy has an idea of what the process of software development meillo@41: should look like, but large parts of the philosophy are quite independent meillo@45: of a concrete development process. meillo@41: However, one will soon recognize that some development processes work well meillo@41: with the ideas of the Unix Philosophy and support them, while others are meillo@41: at cross-purposes. meillo@45: Kent Beck's books about Extreme Programming are valuable supplemental meillo@45: resources on this topic. meillo@1: .PP meillo@57: The questions of how to actually write code and how the code should look meillo@57: in detail are beyond the scope of this paper.
meillo@57: Kernighan and Pike's book ``The Practice of Programming'' meillo@41: .[ meillo@44: kernighan pike meillo@44: practice of programming meillo@41: .] meillo@57: covers this topic. meillo@57: Its point of view corresponds to the one espoused in this paper. meillo@0: meillo@48: .H 1 "Importance of software design in general meillo@0: .LP meillo@57: Software design consists of planning how the internal structure meillo@57: and external interfaces of software should look. meillo@39: It has nothing to do with visual appearance. meillo@57: If we were to compare a program to a car, then its color would not matter. meillo@39: Its design would be the car's size, its shape, the locations of doors, meillo@45: the passenger/space ratio, the available controls and instruments, meillo@45: and so forth. meillo@39: .PP meillo@57: Why should software be designed at all? meillo@57: It is generally accepted meillo@57: that even a bad plan is better than no plan. meillo@57: Not designing software means programming without a plan. meillo@57: This will surely lead to horrible results: meillo@57: software that is horrible to use and horrible to maintain. meillo@39: These two aspects are the visible ones. meillo@45: Often invisible, though, are the gains that were possible but wasted. meillo@39: Good software design can make these gains available. meillo@2: .PP meillo@57: The design of software deals with qualitative properties. meillo@39: Good design leads to good quality, and quality is important. meillo@57: Any car may be able to drive from point A to point B, meillo@57: but the qualitative decisions made in the design of the vehicle determine meillo@57: whether it is a good choice for passenger transport or not, meillo@57: whether it is a good choice for a rough mountain area, meillo@57: and whether the ride will be fun. meillo@39: meillo@2: .PP meillo@57: Requirements for a piece of software are twofold: meillo@39: functional and non-functional.
meillo@39: .IP \(bu meillo@57: Functional requirements directly define the software's functions. meillo@39: They are the reason why software gets written. meillo@39: Someone has a problem and needs a tool to solve it. meillo@39: Being able to solve the problem is the main functional goal. meillo@57: This is the driving force behind all programming effort. meillo@39: Functional requirements are easier to define and to verify. meillo@39: .IP \(bu meillo@45: Non-functional requirements are called \fIquality\fP requirements, too. meillo@57: The quality of software shows through the properties that are not directly meillo@57: related to the software's basic functions. meillo@45: Tools of bad quality often do solve the problems they were written for, meillo@57: but introduce problems and difficulties for usage and development later on. meillo@57: Qualitative aspects are often overlooked at first sight, meillo@45: and are often difficult to define clearly and to verify. meillo@2: .PP meillo@54: Quality is hardly interesting when software gets built initially, meillo@57: but it has a high impact on usability and maintenance of the software later. meillo@57: A short-sighted person might see the process of developing software as meillo@57: one mainly concerned with building something up. meillo@57: But, experience shows that building software the first time is meillo@57: only a small portion of the overall work involved. meillo@45: Bug fixing, extending, rebuilding of parts \(en maintenance work \(en meillo@57: soon take a large part of the time spent on a software project. meillo@45: And of course, the time spent actually using the software. meillo@6: These processes are highly influenced by the software's quality. meillo@39: Thus, quality must not be neglected. meillo@45: However, the problem with quality is that you hardly ``stumble over'' meillo@39: bad quality during the first build, meillo@45: although this is the time when you should care about good quality most. 
meillo@6: .PP meillo@54: Software design has little to do with the basic function of software \(en meillo@39: this requirement will get satisfied anyway. meillo@57: Software design is more about quality aspects. meillo@39: Good design leads to good quality, bad design to bad quality. meillo@54: The primary functions of software will be affected modestly by bad quality, meillo@57: but good quality can provide a lot of additional benefits, meillo@57: even at places one never expected it. meillo@6: .PP meillo@45: The ISO/IEC\|9126-1 standard, part\|1, meillo@6: .[ meillo@44: iso product quality meillo@6: .] meillo@57: defines the quality model as consisting of: meillo@6: .IP \(bu meillo@6: .I Functionality meillo@6: (suitability, accuracy, inter\%operability, security) meillo@6: .IP \(bu meillo@6: .I Reliability meillo@6: (maturity, fault tolerance, recoverability) meillo@6: .IP \(bu meillo@6: .I Usability meillo@6: (understandability, learnability, operability, attractiveness) meillo@6: .IP \(bu meillo@6: .I Efficiency meillo@9: (time behavior, resource utilization) meillo@6: .IP \(bu meillo@6: .I Maintainability meillo@23: (analyzability, changeability, stability, testability) meillo@6: .IP \(bu meillo@6: .I Portability meillo@6: (adaptability, installability, co-existence, replaceability) meillo@6: .LP meillo@57: Good design can improve these properties in software; meillo@57: poorly designed software likely suffers in these areas. meillo@7: .PP meillo@7: One further goal of software design is consistency. meillo@57: Consistency eases understanding, using, and working on things. meillo@57: Consistent internal structure and consistent external interfaces meillo@39: can be provided by good design. meillo@7: .PP meillo@39: Software should be well designed because good design avoids many meillo@57: problems during its lifetime. meillo@57: Also, because good design can offer much additional gain. 
meillo@57: Indeed, much effort should be spent on good design to make software more valuable. meillo@57: The Unix Philosophy provides a way to design software well. meillo@7: It offers guidelines to achieve good quality and high gain for the effort spent. meillo@0: meillo@0: meillo@48: .H 1 "The Unix Philosophy meillo@4: .LP meillo@61: The origins of the Unix Philosophy have already been introduced. meillo@8: This chapter explains the philosophy, oriented on Gancarz, meillo@55: .[ meillo@55: gancarz meillo@55: unix philosophy meillo@55: .] meillo@8: and shows concrete examples of its application. meillo@5: meillo@48: .H 2 Pipes meillo@4: .LP meillo@61: The following examples demonstrate how the Unix Philosophy is applied. meillo@4: Knowledge of using the Unix shell is assumed. meillo@4: .PP meillo@4: Counting the number of files in the current directory: meillo@41: .DS meillo@4: ls | wc -l meillo@4: .DE meillo@4: The meillo@4: .CW ls meillo@4: command lists all files in the current directory, one per line, meillo@4: and meillo@4: .CW "wc -l meillo@8: counts the number of lines. meillo@4: .PP meillo@8: Counting the number of files that do not contain ``foo'' in their name: meillo@41: .DS meillo@4: ls | grep -v foo | wc -l meillo@4: .DE meillo@4: Here, the list of files is filtered by meillo@4: .CW grep meillo@45: to remove all lines that contain ``foo''. meillo@45: The rest equals the previous example. meillo@4: .PP meillo@61: Finding the five largest entries in the current directory: meillo@41: .DS meillo@4: du -s * | sort -nr | sed 5q meillo@4: .DE meillo@4: .CW "du -s * meillo@45: returns the recursively summed sizes of all files in the current directory meillo@8: \(en no matter if they are regular files or directories. meillo@4: .CW "sort -nr meillo@45: sorts the list numerically in reverse order (descending). meillo@4: Finally, meillo@4: .CW "sed 5q meillo@4: quits after it has printed the fifth line. 
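The three pipelines above can be tried out without touching real data. The following is only a minimal sketch: the function names and the scratch files are made up for illustration, and it assumes that mktemp -d is available, which is common but not guaranteed on every Unix system.

```shell
#!/bin/sh
# Sketch only: the function names and the scratch files are made up
# for illustration; the pipelines themselves are the ones from the text.

# Count the entries in the current directory.
count_entries() {
    ls | wc -l
}

# Count the entries whose names do not contain the string given as $1.
count_entries_without() {
    ls | grep -v -- "$1" | wc -l
}

# Print the $1 largest entries, largest first (sizes in du's block units).
largest_entries() {
    du -s * | sort -nr | sed "${1}q"
}

# Try the pipelines in a scratch directory (mktemp -d is an assumption).
dir=$(mktemp -d) || exit 1
(
    cd "$dir" || exit 1
    touch foo.txt foobar.c notes.txt
    mkdir src
    count_entries               # 4 entries
    count_entries_without foo   # 2 entries: notes.txt and src
    largest_entries 2           # two lines of du output
)
rm -rf "$dir"
```

Each function is nothing more than a named pipeline, so the individual programs remain visible and can be rearranged freely.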
meillo@4: .PP meillo@4: The presented command lines are examples of what Unix people would use meillo@4: to get the desired output. meillo@61: There are other ways to get the same output; meillo@61: it is the user's decision which way to go. meillo@14: .PP meillo@8: The examples show that many tasks on a Unix system meillo@4: are accomplished by combining several small programs. meillo@61: The connection between the programs is denoted by the pipe operator `|'. meillo@4: .PP meillo@4: Pipes, and their extensive and easy use, are one of the great meillo@4: achievements of the Unix system. meillo@61: Pipes were possible in earlier operating systems, meillo@61: but never before have they been such a central part of the concept. meillo@61: In the early seventies when Doug McIlroy introduced pipes into the meillo@4: Unix system, meillo@4: ``it was this concept and notation for linking several programs together meillo@4: that transformed Unix from a basic file-sharing system to an entirely new way of computing.'' meillo@4: .[ meillo@44: aughenbaugh meillo@44: unix oral history meillo@45: .] meillo@4: .PP meillo@4: Being able to specify pipelines in an easy way is, meillo@61: however, not enough by itself; meillo@61: it is only one half. meillo@4: The other is the design of the programs that are used in the pipeline. meillo@61: They need interfaces that allow them to be used in this way. meillo@5: meillo@48: .H 2 "Interface design meillo@5: .LP meillo@61: Unix is, first of all, simple \(en everything is a file. meillo@5: Files are sequences of bytes, without any special structure. meillo@45: Programs should be filters, which read a stream of bytes from standard input (stdin) meillo@45: and write a stream of bytes to standard output (stdout). meillo@8: If the files \fIare\fP sequences of bytes, meillo@8: and the programs \fIare\fP filters on byte streams, meillo@45: then there is exactly one data interface. 
meillo@45: Hence it is possible to combine programs in any desired way. meillo@5: .PP meillo@45: Even a handful of small programs yields a large set of combinations, meillo@5: and thus a large set of different functions. meillo@5: This is leverage! meillo@5: If the programs are orthogonal to each other \(en the best case \(en meillo@5: then the set of different functions is greatest. meillo@5: .PP meillo@61: Programs can also have a separate control interface meillo@61: in addition to their data interface. meillo@61: The control interface is often called the ``user interface'', meillo@11: because it is usually designed to be used by humans. meillo@61: The Unix Philosophy discourages the assumption that the user will be human. meillo@11: Interactive use of software is slow use of software, meillo@11: because the program waits for user input most of the time. meillo@61: Interactive software also requires the user to be in front of the computer, meillo@61: occupying his attention during usage. meillo@11: .PP meillo@61: Now, back to the idea of combining several small programs meillo@61: to perform a more specific function: meillo@61: If these single tools were all interactive, meillo@11: how would the user control them? meillo@61: It is not only a problem to control several programs at once meillo@61: if they run at the same time; meillo@61: it is also very inefficient to have to control each program meillo@61: when they are intended to act in concert. meillo@61: Hence, the Unix Philosophy discourages designing programs which demand meillo@61: interactive use. meillo@11: The behavior of programs should be defined at invocation. meillo@45: This is done by specifying arguments to the program call meillo@45: (command line switches). meillo@61: Gancarz discusses this topic as ``avoid[ing] captive user interfaces''. meillo@46: .[ [ meillo@44: gancarz unix philosophy meillo@46: .], page 88 ff.] 
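As an illustration of these interface guidelines, here is a minimal sketch of a filter whose behavior is defined completely at invocation. The name longer_than is hypothetical and not a standard tool; the sketch assumes only a POSIX shell and awk.

```shell
#!/bin/sh
# Sketch only: longer_than is a hypothetical filter, not a standard tool.
# It reads lines from stdin and writes to stdout those that are longer
# than the number of characters given as its first argument (default 80).
# Its behavior is fixed entirely at invocation; it never prompts the user.
longer_than() {
    awk -v limit="${1:-80}" 'length($0) > limit'
}

# Because it is a plain filter, it combines with other tools in a pipeline:
printf 'ab\nabcd\nabcdef\n' | longer_than 3 | wc -l    # counts 2 lines
```

Since the only inputs are the argument and the byte stream on stdin, the filter can run unattended inside scripts and pipelines.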
meillo@11: .PP meillo@61: Non-interactive use is also an advantage for testing during development. meillo@61: Testing interactive programs is much more complicated meillo@61: than testing non-interactive counterparts. meillo@5: meillo@48: .H 2 "The toolchest approach meillo@5: .LP meillo@5: A toolchest is a set of tools. meillo@61: Instead of one big tool for all tasks, there are many small tools, meillo@5: each for one task. meillo@61: Difficult tasks are solved by combining several small, simple tools. meillo@5: .PP meillo@11: The Unix toolchest \fIis\fP a set of small, (mostly) non-interactive programs meillo@11: that are filters on byte streams. meillo@54: They are, to a large extent, unrelated in their function. meillo@11: Hence, the Unix toolchest provides a large set of functions meillo@11: that can be accessed by combining the programs in the desired way. meillo@11: .PP meillo@61: The act of software development benefits from small toolchest programs, too. meillo@61: Writing small programs is generally easier and less error-prone meillo@61: than writing large programs. meillo@61: Hence, writing a large set of small programs is still easier and meillo@61: less error-prone than writing one large program with all the meillo@61: functionality included. meillo@61: If the small programs are combinable, then they even offer a larger set meillo@61: of functions than the single monolithic program. meillo@45: Hence, one gets two advantages out of writing small, combinable programs: meillo@45: They are easier to write and they offer a greater set of functions through meillo@45: combination. meillo@5: .PP meillo@61: There are also two main drawbacks of the toolchest approach. meillo@45: First, one simple, standardized interface has to be sufficient. meillo@5: If one feels the need for more ``logic'' than a stream of bytes, meillo@61: then a different approach might be required.
meillo@61: Also, a design where a stream of bytes is sufficient meillo@61: might not be conceivable. meillo@8: By becoming more familiar with the ``Unix style of thinking'', meillo@8: developers will more often and more easily find simple designs where meillo@8: a stream of bytes is a sufficient interface. meillo@8: .PP meillo@61: The second drawback of the toolchest approach concerns the users. meillo@61: A toolchest is often more difficult to use because meillo@61: it is necessary to become familiar with each tool and meillo@61: be able to choose and use the right one in any given situation. meillo@61: Additionally, one needs to know how to combine the tools in a sensible way. meillo@61: The issue is similar to having a sharp knife \(en meillo@61: it is a powerful tool in the hand of a master, meillo@61: but of no value in the hand of an unskilled person. meillo@61: However, learning single, small tools of a toolchest is often easier than meillo@45: learning a complex tool. meillo@61: The user will already have a basic understanding of an as yet unknown tool meillo@45: if the tools of a toolchest have a common, consistent style. meillo@61: He will be able to transfer knowledge of one tool to another. meillo@5: .PP meillo@61: This second drawback can be removed to a large extent meillo@45: by adding wrappers around the basic tools. meillo@61: Novice users do not need to learn several tools if a professional wraps meillo@45: complete command lines into a higher-level script. meillo@5: Note that the wrapper script still calls the small tools; meillo@45: it is just like a skin around them. meillo@61: No complexity is added this way, meillo@61: but new programs can be created out of existing ones with very little effort.
meillo@5: .PP meillo@5: A wrapper script for finding the five largest entries in the current directory meillo@61: might look like this: meillo@41: .DS meillo@5: #!/bin/sh meillo@5: du -s * | sort -nr | sed 5q meillo@5: .DE meillo@61: The script itself is just a text file that calls the commands meillo@61: that a professional user would type in directly. meillo@61: It is probably beneficial to make the program flexible in regard to meillo@61: the number of entries it prints: meillo@41: .DS meillo@8: #!/bin/sh meillo@8: num=5 meillo@8: [ $# -eq 1 ] && num="$1" meillo@8: du -s * | sort -nr | sed "${num}q" meillo@8: .DE meillo@61: This script acts like the one before when called without an argument, meillo@61: but the user can also specify a numerical argument to define the number meillo@61: of lines to print. meillo@61: One can surely imagine even more flexible versions; meillo@61: however, they will still rely on the external programs meillo@61: which actually do the work. meillo@5: meillo@48: .H 2 "A powerful shell meillo@8: .LP meillo@61: The Unix shell provides the ability to combine small programs into large ones. meillo@61: But a powerful shell is a great feature in other ways, too; meillo@61: for instance, by being scriptable. meillo@61: Control statements are built into the shell meillo@61: and the functions are the normal programs of the system. meillo@61: As the programs are already known, meillo@45: learning to program in the shell becomes easy. meillo@8: Using normal programs as functions in the shell programming language meillo@10: is only possible because they are small and combinable tools in a toolchest style. meillo@8: .PP meillo@61: The Unix shell encourages writing small scripts meillo@61: by combining existing programs, because it is so easy to do. meillo@8: This is a great step towards automation. meillo@8: It is wonderful if the effort to automate a task equals the effort meillo@45: to do the task a second time by hand.
meillo@45: If this holds, meillo@45: then the user will be happy to automate everything he does more than once. meillo@8: .PP meillo@8: Small programs that do one job well, standardized interfaces between them, meillo@61: a mechanism to combine parts to larger parts, and an easy way to automate tasks meillo@61: will inevitably produce software leverage, meillo@61: achieving multiple times the benefit of the initial investment. meillo@10: .PP meillo@10: The shell also encourages rapid prototyping. meillo@10: Many well known programs started as quickly hacked shell scripts, meillo@61: and later turned into ``real'' programs written in C. meillo@61: Building a prototype first is a way to avoid the biggest problems meillo@10: in application development. meillo@45: Fred Brooks explains in ``No Silver Bullet'': meillo@10: .[ meillo@44: brooks meillo@44: no silver bullet meillo@10: .] meillo@10: .QP meillo@10: The hardest single part of building a software system is deciding precisely what to build. meillo@10: No other part of the conceptual work is so difficult as establishing the detailed meillo@10: technical requirements, [...]. meillo@10: No other part of the work so cripples the resulting system if done wrong. meillo@10: No other part is more difficult to rectify later. meillo@10: .PP meillo@45: Writing a prototype is a great method for becoming familiar with the requirements meillo@45: and for running into real problems early. meillo@47: .[ [ meillo@47: gancarz meillo@47: unix philosophy meillo@47: .], page 28 f.] meillo@45: .PP meillo@54: Prototyping is often seen as a first step in building software. meillo@10: This is, of course, good. meillo@10: However, the Unix Philosophy has an \fIadditional\fP perspective on prototyping: meillo@61: After having built the prototype, one might notice that the prototype is already meillo@10: \fIgood enough\fP.
meillo@61: Hence, a reimplementation in a more sophisticated programming language meillo@45: might not be needed, at least for the moment. meillo@23: Maybe later, it might be necessary to rewrite the software, but not now. meillo@45: By delaying further work, one keeps the flexibility to react to meillo@10: changing requirements. meillo@10: Software parts that are not written will not miss the requirements. meillo@61: Well known is Gordon Bell's classic saying: meillo@61: ``The cheapest, fastest, and most reliable components are those meillo@61: that aren't there.'' meillo@61: .\" FIXME: ref? meillo@10: meillo@48: .H 2 "Worse is better meillo@10: .LP meillo@45: The Unix Philosophy aims for the 90% solution; meillo@10: others call it the ``Worse is better'' approach. meillo@47: Experience from real life projects shows: meillo@10: .PP meillo@61: (1) It is almost impossible to define the meillo@10: requirements completely and correctly the first time. meillo@45: Hence one should not try to; one will fail anyway. meillo@45: .PP meillo@45: (2) Requirements change over time. meillo@10: Hence it is best to delay requirement-based design decisions as long as possible. meillo@61: Software should be small and flexible as long as possible in order meillo@61: to react to changing requirements. meillo@61: Shell scripts, for example, are more easily adjusted than C programs. meillo@45: .PP meillo@45: (3) Maintenance work is hard work. meillo@45: Hence, one should keep the amount of code as small as possible; meillo@61: it should only fulfill the \fIcurrent\fP requirements. meillo@61: Software parts that will be written in the future meillo@61: do not need maintenance until that time. meillo@10: .PP meillo@47: See Brooks' ``The Mythical Man-Month'' for reference. meillo@47: .[ [ meillo@47: brooks meillo@47: mythical man-month meillo@47: .], page 115 ff.]
meillo@47: .PP meillo@10: Starting with a prototype in a scripting language has several advantages: meillo@10: .IP \(bu meillo@10: As the initial effort is low, one will likely start right away. meillo@10: .IP \(bu meillo@61: Real requirements can be identified quickly since working parts are meillo@61: available sooner. meillo@10: .IP \(bu meillo@54: When software is usable and valuable, it gets used, and thus tested. meillo@61: This ensures that problems will be found in the early stages of development. meillo@10: .IP \(bu meillo@61: The prototype might be enough for the moment; meillo@61: thus, further work can be delayed until a time meillo@61: when one knows about the requirements and problems more thoroughly. meillo@10: .IP \(bu meillo@61: Implementing only the parts that are actually needed at the moment meillo@61: introduces less programming and maintenance work. meillo@10: .IP \(bu meillo@61: If the situation changes such that the software is not needed anymore, meillo@61: then less effort was spent on the project than it would have been meillo@61: if a different approach had been taken. meillo@10: meillo@48: .H 2 "Upgrowth and survival of software meillo@11: .LP meillo@61: So far, \fIwriting\fP or \fIbuilding\fP software has been discussed. meillo@61: Although ``writing'' and ``building'' are just verbs, meillo@61: they do imply a specific view on the work process they describe. meillo@61: A better verb would be to \fI``grow''\fP. meillo@12: Creating software in the sense of the Unix Philosophy is an incremental process. meillo@61: It starts with an initial prototype, which evolves as requirements change. meillo@12: A quickly hacked shell script might become a large, sophisticated, meillo@13: compiled program this way. meillo@13: Its lifetime begins with the initial prototype and ends when the software is not used anymore. meillo@61: While alive, it will be extended, rearranged, rebuilt. 
meillo@12: Growing software matches the view that ``software is never finished. It is only released.'' meillo@46: .[ [ meillo@44: gancarz meillo@44: unix philosophy meillo@46: .], page 26] meillo@12: .PP meillo@13: Software can be seen as being controlled by evolutionary processes. meillo@13: Successful software is software that is used by many for a long time. meillo@61: This implies that the software is necessary, useful, and better than the alternatives. meillo@61: Darwin describes ``the survival of the fittest.'' meillo@12: .[ meillo@44: darwin meillo@44: origin of species meillo@12: .] meillo@61: In relation to software, the most successful software is the fittest; meillo@61: the one that survives. meillo@13: (This may be at the level of one creature, or at the level of one species.) meillo@13: The fitness of software is affected mainly by four properties: meillo@15: portability of code, portability of data, range of usability, and reusability of parts. meillo@13: .PP meillo@15: (1) meillo@61: .I "``Portability of code'' meillo@61: means using high-level programming languages, meillo@13: sticking to the standard, meillo@47: .[ [ meillo@47: kernighan pike meillo@47: practice of programming meillo@47: .], chapter\|8] meillo@13: and avoiding optimizations that introduce dependencies on specific hardware. meillo@61: Hardware has a much shorter lifespan than software. meillo@61: By chaining software to specific hardware, meillo@61: its lifetime is limited to that of this hardware. meillo@13: In contrast, software should be easy to port \(en meillo@23: adaptation is the key to success. meillo@13: .PP meillo@15: (2) meillo@61: .I "``Portability of data'' meillo@15: is best achieved by avoiding binary representations meillo@61: to store data, since binary representations differ from machine to machine. meillo@23: Textual representation is favored. 
meillo@61: Historically, \s-1ASCII\s0 was the character set of choice; meillo@61: for the future, \s-1UTF\s0-8 might be the better way forward. meillo@13: What matters is that it is a plain text representation in a meillo@61: very common character set encoding. meillo@13: Apart from being able to transfer data between machines, meillo@61: readable data has the great advantage that humans are able to directly meillo@45: read and edit it with text editors and other tools from the Unix toolchest. meillo@47: .[ [ meillo@47: gancarz meillo@47: unix philosophy meillo@47: .], page 56 ff.] meillo@13: .PP meillo@15: (3) meillo@15: A large meillo@61: .I "``range of usability'' meillo@23: ensures good adaptation, and thus good survival. meillo@61: It is a special distinction when software comes to be used in fields of endeavor meillo@61: the original authors never imagined. meillo@13: Software that solves problems in a general way will likely be used meillo@45: for many kinds of similar problems. meillo@45: Being too specific limits the range of usability. meillo@13: Requirements change through time, thus use cases change or even vanish. meillo@61: As a good example of this point, meillo@13: Allman identifies flexibility as one major reason for sendmail's success: meillo@13: .[ meillo@44: allman meillo@44: sendmail meillo@13: .] meillo@13: .QP meillo@13: Second, I limited myself to the routing function [...]. meillo@13: This was a departure from the dominant thought of the time, [...]. meillo@13: .QP meillo@45: Third, the sendmail configuration file was flexible enough to adapt meillo@13: to a rapidly changing world [...]. meillo@12: .LP meillo@45: Successful software adapts itself to the changing world. meillo@13: .PP meillo@15: (4) meillo@61: .I "``Reusability of parts'' meillo@61: goes one step further.
meillo@61: Software may become obsolete and completely lose its field of action, meillo@61: but the constituent parts of the software may be general and independent enough meillo@13: to survive this death. meillo@54: If software is built by combining small independent programs, meillo@45: then these parts are readily available for reuse. meillo@61: Who cares that the large program is a failure, meillo@61: if parts of it become successful instead? meillo@10: meillo@48: .H 2 "Summary meillo@0: .LP meillo@61: This chapter explained ideas central to the Unix Philosophy. meillo@45: For each of the ideas, the advantages they introduce were explained. meillo@61: The Unix Philosophy is a set of guidelines that help in the design of meillo@61: more valuable software. meillo@61: From the viewpoint of a software developer or software designer, meillo@61: the Unix Philosophy provides answers to many software design problems. meillo@14: .PP meillo@61: The various ideas comprising the Unix Philosophy are closely interwoven meillo@14: and can hardly be applied independently. meillo@61: The most important messages are: meillo@45: .I "``Keep it simple!''" , meillo@14: .I "``Do one thing well!''" , meillo@14: and meillo@14: .I "``Use software leverage!'' meillo@0: meillo@8: meillo@8: meillo@48: .H 1 "Case study: \s-1MH\s0 meillo@18: .LP meillo@30: The previous chapter introduced and explained the Unix Philosophy meillo@18: from a general point of view. meillo@61: The driving force was that of the guidelines; meillo@61: references to existing software were given only sparsely. meillo@18: In this and the next chapter, concrete software will be meillo@18: the driving force in the discussion. meillo@18: .PP meillo@23: This first case study is about the mail user agents (\s-1MUA\s0) meillo@54: \s-1MH\s0 (``mail handler'') and its descendant \fInmh\fP meillo@23: (``new mail handler''). meillo@47: .[ meillo@47: nmh website meillo@47: .]
meillo@23: \s-1MUA\s0s provide functions to read, compose, and organize mail, meillo@45: but (ideally) not to transfer it. meillo@45: In this document, the name \s-1MH\s0 will be used to include nmh. meillo@19: A distinction will only be made if differences between meillo@45: \s-1MH\s0 and nmh are described. meillo@18: meillo@0: meillo@48: .H 2 "Historical background meillo@0: .LP meillo@61: Electronic mail was available in Unix from a very early stage. meillo@30: The first \s-1MUA\s0 on Unix was \f(CWmail\fP, meillo@30: which was already present in the First Edition. meillo@46: .[ [ meillo@44: salus meillo@44: quarter century of unix meillo@46: .], page 41 f.] meillo@45: It was a small program that either printed the user's mailbox file meillo@54: or appended text to someone else's mailbox file, meillo@19: depending on the command line arguments. meillo@19: .[ meillo@44: manual mail(1) meillo@19: .] meillo@19: It was a program that did one job well. meillo@23: This job was emailing, which was very simple then. meillo@19: .PP meillo@23: Later, emailing became more powerful, and thus more complex. meillo@19: The simple \f(CWmail\fP, which knew nothing of subjects, meillo@19: independent handling of single messages, meillo@61: and long-term email storage, was not powerful enough anymore. meillo@61: In 1978 at Berkeley, Kurt Shoens wrote \fIMail\fP (with a capital `M') meillo@45: to provide additional functions for emailing. meillo@61: Mail was still one program, but was large and did several jobs. meillo@61: Its user interface was modeled after \fIed\fP. meillo@61: Ed is designed for humans, but is still scriptable. meillo@61: \fImailx\fP is the adaptation of Berkeley Mail for System V. meillo@19: .[ meillo@44: ritter meillo@44: mailx history meillo@19: .] meillo@61: Elm, pine, mutt, and a slew of graphical \s-1MUA\s0s meillo@61: followed Mail's direction: meillo@61: large, monolithic programs which included all emailing functions. 
meillo@19: .PP meillo@23: A different way was taken by the people of \s-1RAND\s0 Corporation. meillo@61: Initially, they had also used a monolithic mail system meillo@30: called \s-1MS\s0 (for ``mail system''). meillo@19: But in 1977, Stockton Gaines and Norman Shapiro meillo@61: came up with a proposal for a new email system concept \(en meillo@45: one that honored the Unix Philosophy. meillo@19: The concept was implemented by Bruce Borden in 1978 and 1979. meillo@19: This was the birth of \s-1MH\s0 \(en the ``mail handler''. meillo@18: .PP meillo@18: Since then, \s-1RAND\s0, the University of California at Irvine and meillo@19: at Berkeley, and several others have contributed to the software. meillo@18: However, its core concepts remained the same. meillo@23: In the late 90s, when development of \s-1MH\s0 slowed down, meillo@19: Richard Coleman started \fInmh\fP, the new mail handler. meillo@61: His goal was to improve \s-1MH\s0 especially in regard to meillo@23: the requirements of modern emailing. meillo@19: Today, nmh is developed by various people on the Internet. meillo@18: .[ meillo@44: ware meillo@44: rand history meillo@18: .] meillo@18: .[ meillo@44: peek meillo@44: mh meillo@18: .] meillo@0: meillo@48: .H 2 "Contrasts to monolithic mail systems meillo@0: .LP meillo@19: All \s-1MUA\s0s are monolithic, except \s-1MH\s0. meillo@61: Although some little-known toolchest \s-1MUA\s0s might exist as well, meillo@61: this statement reflects the situation pretty well. meillo@19: .PP meillo@30: Monolithic \s-1MUA\s0s gather all their functions in one program. meillo@30: In contrast, \s-1MH\s0 is a toolchest of many small tools \(en one for each job. meillo@23: Following is a list of important programs of \s-1MH\s0's toolchest meillo@30: and their functions. meillo@61: It gives an indication of what the toolchest looks like.
meillo@19: .IP \(bu meillo@19: .CW inc : meillo@30: incorporate new mail (this is how mail enters the system) meillo@19: .IP \(bu meillo@19: .CW scan : meillo@19: list messages in folder meillo@19: .IP \(bu meillo@19: .CW show : meillo@19: show message meillo@19: .IP \(bu meillo@19: .CW next\fR/\fPprev : meillo@19: show next/previous message meillo@19: .IP \(bu meillo@19: .CW folder : meillo@19: change current folder meillo@19: .IP \(bu meillo@19: .CW refile : meillo@45: refile message into different folder meillo@19: .IP \(bu meillo@19: .CW rmm : meillo@19: remove message meillo@19: .IP \(bu meillo@19: .CW comp : meillo@45: compose new message meillo@19: .IP \(bu meillo@19: .CW repl : meillo@45: reply to message meillo@19: .IP \(bu meillo@19: .CW forw : meillo@45: forward message meillo@19: .IP \(bu meillo@19: .CW send : meillo@45: send prepared message (this is how mail leaves the system) meillo@0: .LP meillo@19: \s-1MH\s0 has no special user interface like monolithic \s-1MUA\s0s have. meillo@61: The user does not leave the shell to run \s-1MH\s0; meillo@45: instead he uses the various \s-1MH\s0 programs within the shell. meillo@23: Using a monolithic program with a captive user interface meillo@23: means ``entering'' the program, using it, and ``exiting'' the program. meillo@23: Using toolchests like \s-1MH\s0 means running programs, meillo@45: alone or in combination with others, also from other toolchests, meillo@23: without leaving the shell. meillo@30: meillo@48: .H 2 "Data storage meillo@30: .LP meillo@61: \s-1MH\s0's mail storage consists of a hierarchy under the user's meillo@34: \s-1MH\s0 directory (usually \f(CW$HOME/Mail\fP), meillo@34: where mail folders are directories and mail messages are text files meillo@34: within them. meillo@34: Each mail folder contains a file \f(CW.mh_sequences\fP which lists meillo@45: the public message sequences of that folder, meillo@61: for instance, the \fIunseen\fP sequence for new messages. 
meillo@34: Mail messages are text files located in a mail folder. meillo@61: The files contain the messages as they were received, meillo@61: and they are named by ascending numbers in each folder. meillo@19: .PP meillo@30: This mailbox format is called ``\s-1MH\s0'' after the \s-1MUA\s0. meillo@30: Alternatives are \fImbox\fP and \fImaildir\fP. meillo@61: In the mbox format, all messages are stored within one file. meillo@30: This was a good solution in the early days, when messages meillo@61: were only a few lines of text deleted within a short period of time. meillo@61: Today, with single messages often including several megabytes meillo@61: of attachments, this is a bad solution. meillo@30: Another disadvantage of the mbox format is that it is meillo@30: more difficult to write tools that work on mail messages, meillo@30: because it is always necessary to first find and extract meillo@30: the relevant message in the mbox file. meillo@45: With the \s-1MH\s0 mailbox format, each message is a separate file. meillo@30: Also, the problem of concurrent access to one mailbox is meillo@30: reduced to the problem of concurrent access to one message. meillo@45: The maildir format is generally similar to the \s-1MH\s0 format, meillo@30: but modified towards guaranteed reliability. meillo@30: This involves some complexity, unfortunately. meillo@34: .PP meillo@34: Working with \s-1MH\s0's toolchest on mailboxes is much like meillo@34: working with Unix' toolchest on directory trees: meillo@34: \f(CWscan\fP is like \f(CWls\fP, meillo@34: \f(CWshow\fP is like \f(CWcat\fP, meillo@34: \f(CWfolder\fP is like \f(CWcd\fP and \f(CWpwd\fP, meillo@34: \f(CWrefile\fP is like \f(CWmv\fP, meillo@34: and \f(CWrmm\fP is like \f(CWrm\fP. meillo@34: .PP meillo@61: \s-1MH\s0 extends the context of processes in Unix by two more items meillo@45: for its tools: meillo@34: .IP \(bu meillo@34: The current mail folder, which is similar to the current working directory. 
meillo@34: For mail folders, \f(CWfolder\fP provides the corresponding functionality meillo@34: of \f(CWcd\fP and \f(CWpwd\fP for directories. meillo@34: .IP \(bu meillo@34: Sequences, which are named sets of messages in a mail folder. meillo@34: The current message, relative to a mail folder, is a special sequence. meillo@34: It enables commands like \f(CWnext\fP and \f(CWprev\fP. meillo@34: .LP meillo@61: In contrast to the general process context in Unix, meillo@61: which is maintained by the kernel, meillo@45: \s-1MH\s0's context must be maintained by the tools themselves. meillo@45: Usually there is one context per user, which resides in his meillo@45: \f(CWcontext\fP file in the \s-1MH\s0 directory, meillo@45: but a user can have several contexts, too. meillo@45: Public sequences are an exception, as they belong to a mail folder, meillo@45: and reside in the \f(CW.mh_sequences\fP file there. meillo@34: .[ meillo@44: man page mh-profile mh-sequence meillo@34: .] meillo@20: meillo@48: .H 2 "Discussion of the design meillo@0: .LP meillo@45: This section discusses \s-1MH\s0 in regard to the tenets meillo@45: of the Unix Philosophy that Gancarz identified. meillo@20: meillo@20: .PP meillo@33: .B "Small is beautiful meillo@20: and meillo@33: .B "do one thing well meillo@20: are two design goals that are directly visible in \s-1MH\s0. meillo@61: Gancarz actually uses \s-1MH\s0 in his book as example under the meillo@45: headline ``Making \s-1UNIX\s0 Do One Thing Well'': meillo@46: .[ [ meillo@44: gancarz meillo@44: unix philosophy meillo@46: .], page 125 ff.] meillo@20: .QP meillo@20: [\s-1MH\s0] consists of a series of programs which meillo@20: when combined give the user an enormous ability meillo@20: to manipulate electronic mail messages. meillo@20: A complex application, it shows that not only is it meillo@20: possible to build large applications from smaller meillo@20: components, but also that such designs are actually preferable. 
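.PP
What such a combination of small programs looks like can be sketched with plain shell, because a folder in the \s-1MH\s0 mailbox format is just a directory of numbered text files. The following is only an illustration under stated assumptions: the temporary folder, the two sample messages, and the helper names scan_subjects and pick_from are hypothetical stand-ins invented for this sketch; they are not the real \s-1MH\s0 programs and do not reproduce their options or output formats.

```shell
#!/bin/sh
# Sketch only: an MH-style folder modeled as a directory of numbered
# text files. Folder, messages, and function names are hypothetical.

folder=$(mktemp -d) || exit 1
trap 'rm -rf "$folder"' EXIT

# Two sample messages, one per file, named by ascending numbers.
printf 'From: alice@example.org\nSubject: lunch\n\nNoon?\n' > "$folder/1"
printf 'From: bob@example.org\nSubject: patch\n\nSee attachment.\n' > "$folder/2"

# One small job: print "number subject" for each message.
# sed stops at the first empty line, so only the header is inspected.
scan_subjects() {
	for msg in "$1"/[0-9]*; do
		subject=$(sed -n '/^$/q; s/^Subject: //p' "$msg")
		printf '%s %s\n' "${msg##*/}" "$subject"
	done
}

# Another small job: print the numbers of messages whose From line
# matches a pattern. (Simplification: grep looks at the whole file.)
pick_from() {
	grep -l "^From: .*$2" "$1"/[0-9]* | sed 's|.*/||'
}

scan_subjects "$folder"
pick_from "$folder" bob
```

Each function does one job and writes plain text, so its output can be fed to any other tool of the Unix toolchest, for example by piping the listing through sort or grep.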
meillo@20: .LP meillo@45: The various programs of \s-1MH\s0 were relatively easy to write, meillo@61: because each one was small, limited to one function, meillo@61: and had clear boundaries. meillo@61: For the same reasons, they are also easy to maintain. meillo@61: Furthermore, the system can easily be extended: meillo@61: one only needs to place a new program into the toolchest. meillo@61: This was done when \s-1MIME\s0 support was added meillo@20: (e.g. \f(CWmhbuild\fP). meillo@61: Also, different programs can exist to do basically the same job meillo@20: in different ways (e.g. in nmh: \f(CWshow\fP and \f(CWmhshow\fP). meillo@45: .PP meillo@61: If someone needs a mail system with some additional meillo@61: functionality that is not available anywhere yet, meillo@61: it is beneficial to expand a toolchest system like \s-1MH\s0. meillo@45: There he can add new functionality by simply adding additional meillo@61: programs to the toolchest; meillo@61: he does not risk breaking existing functionality by doing so. meillo@20: meillo@20: .PP meillo@61: .B "Store data in flat text files" ; meillo@61: this principle was followed by \s-1MH\s0. meillo@34: This is not surprising, because email messages are already plain text. meillo@34: \s-1MH\s0 stores the messages as it receives them, meillo@61: thus any other tool that works on \s-1RFC\s0\|2822 compliant mail meillo@61: messages can operate meillo@34: on the messages in an \s-1MH\s0 mailbox. meillo@61: All other files \s-1MH\s0 uses are plain text as well. meillo@34: It is therefore possible and encouraged to use the text processing meillo@34: tools of Unix' toolchest to extend \s-1MH\s0's toolchest. meillo@20: meillo@20: .PP meillo@33: .B "Avoid captive user interfaces" . meillo@19: \s-1MH\s0 is perfectly suited for non-interactive use. meillo@61: It offers all functions directly, without captive user interfaces.
meillo@61: If users want a graphical user interface, meillo@53: they can have it with \fIxmh\fP, \fIexmh\fP, meillo@53: or with the Emacs interface \fImh-e\fP. meillo@53: These are frontends for the \s-1MH\s0 toolchest. meillo@61: This means all email-related work is still done by \s-1MH\s0 tools, meillo@45: but the frontend calls the appropriate commands when the user meillo@53: clicks on buttons or pushes a key. meillo@45: .PP meillo@61: Providing additional user interfaces in the form of frontends is a good meillo@19: approach, because it does not limit the power of the backend itself. meillo@61: The frontend will only be able to make a subset of the meillo@61: backend's power and flexibility available to the user, meillo@61: but if it is a separate program, meillo@20: then the missing parts can still be accessed at the backend directly. meillo@61: If it is integrated, then this will be much more difficult. meillo@61: An additional advantage is the ability to have different frontends meillo@45: to the same backend. meillo@19: meillo@19: .PP meillo@33: .B "Choose portability over efficiency meillo@20: and meillo@33: .B "use shell scripts to increase leverage and portability" . meillo@20: These two tenets are indirectly, but nicely, demonstrated by meillo@30: Bolsky and Korn in their book about the Korn Shell. meillo@20: .[ meillo@44: bolsky korn meillo@44: korn shell meillo@20: .] meillo@45: Chapter\|18 of the book shows a basic implementation meillo@20: of a subset of \s-1MH\s0 in ksh scripts. meillo@61: This is just a demonstration, but a brilliant one. meillo@20: It shows how quickly one can implement such a prototype with shell scripts, meillo@20: and how readable they are. meillo@61: The implementation in a scripting language may not be very fast, meillo@61: but it can be fast enough, and this is all that matters.
meillo@20: By having the code in an interpreted language, like the shell, meillo@61: portability becomes a minor issue if we assume the interpreter meillo@20: to be widespread. meillo@45: .PP meillo@20: This demonstration also shows how easy it is to create single programs meillo@61: of toolchest software. meillo@61: Eight tools (two of them having multiple names) and 16 functions meillo@45: with supporting code are presented to the reader. meillo@45: The tools comprise less than 40 lines of ksh each, meillo@30: in total about 200 lines. meillo@45: The functions comprise less than 80 lines of ksh each, meillo@30: in total about 450 lines. meillo@20: Such small software is easy to write, easy to understand, meillo@20: and thus easy to maintain. meillo@61: A toolchest improves one's ability to write only some parts of a meillo@61: program while still creating a working result. meillo@45: Expanding the toolchest, even without global changes, meillo@45: will likely be possible. meillo@20: meillo@20: .PP meillo@33: .B "Use software leverage to your advantage meillo@20: and the lesser tenet meillo@33: .B "allow the user to tailor the environment meillo@20: are ideally followed in the design of \s-1MH\s0. meillo@21: Tailoring the environment is heavily encouraged by the ability to meillo@30: directly define default options to programs. meillo@30: It is even possible to define different default options meillo@45: depending on the name under which a program is called. meillo@45: Software leverage is heavily encouraged by the ease of meillo@45: creating shell scripts that run a specific command line, meillo@30: built of several \s-1MH\s0 programs. meillo@61: There are few pieces of software that encourage users to tailor their meillo@61: environment and to leverage the use of the software like \s-1MH\s0. meillo@45: .PP meillo@61: Just to cite one example: meillo@23: One might prefer a different listing format for the \f(CWscan\fP meillo@21: program.
meillo@30: It is possible to take one of the distributed format files meillo@21: or to write one yourself. meillo@21: To use the format as default for \f(CWscan\fP, a single line, meillo@21: reading meillo@21: .DS meillo@21: scan: -form FORMATFILE meillo@21: .DE meillo@21: must be added to \f(CW.mh_profile\fP. meillo@61: If one wants this alternative format available as an additional command, meillo@61: instead of changing the default, one just needs to create a link to meillo@23: \f(CWscan\fP, for instance named \f(CWscan2\fP. meillo@21: The line in \f(CW.mh_profile\fP would then start with \f(CWscan2\fP, meillo@61: as the option should only be in effect for a program that is invoked as meillo@21: \f(CWscan2\fP. meillo@20: meillo@20: .PP meillo@33: .B "Make every program a filter meillo@61: is hard to find implemented in \s-1MH\s0. meillo@61: The reason is that most of \s-1MH\s0's tools provide meillo@45: basic file system operations for mailboxes. meillo@61: It is for the same reason that \f(CWls\fP, \f(CWcp\fP, \f(CWmv\fP, meillo@45: and \f(CWrm\fP aren't filters either. meillo@61: \s-1MH\s0 does not provide many filters itself, meillo@61: but it provides a basis upon which to write filters. meillo@45: An example would be a mail text highlighter, meillo@61: a program that makes use of a color terminal to display header lines, meillo@61: quotations, and signatures in distinct colors. meillo@45: The author's version of such a program is an awk script with 25 lines. meillo@21: meillo@21: .PP meillo@33: .B "Build a prototype as soon as possible meillo@21: was again well followed by \s-1MH\s0. meillo@61: This tenet, of course, focuses on early development, which is a meillo@21: long time ago for \s-1MH\s0. meillo@21: But without following this guideline at the very beginning, meillo@23: Bruce Borden might not have convinced the management of \s-1RAND\s0 meillo@23: to ever create \s-1MH\s0.
meillo@23: In Bruce's own words: meillo@46: .[ [ meillo@44: ware rand history meillo@46: .], page 132] meillo@21: .QP meillo@45: [...] but [Stockton Gaines and Norm Shapiro] were not able meillo@23: to convince anyone that such a system would be fast enough to be usable. meillo@21: I proposed a very short project to prove the basic concepts, meillo@21: and my management agreed. meillo@21: Looking back, I realize that I had been very lucky with my first design. meillo@21: Without nearly enough design work, meillo@21: I built a working environment and some header files meillo@21: with key structures and wrote the first few \s-1MH\s0 commands: meillo@21: inc, show/next/prev, and comp. meillo@21: [...] meillo@21: With these three, I was able to convince people that the structure was viable. meillo@21: This took about three weeks. meillo@0: meillo@48: .H 2 "Problems meillo@0: .LP meillo@61: \s-1MH\s0 is not without its problems. meillo@61: There are two main problems: one is technical, the other pertains to human behavior. meillo@22: .PP meillo@61: \s-1MH\s0 is old and email today is quite different from what it was at the time meillo@22: when \s-1MH\s0 was designed. meillo@61: \s-1MH\s0 adapted to the changes fairly well, but it has its limitations. meillo@22: \s-1MIME\s0 support and support for different character encodings meillo@22: are available, but only on a moderate level. meillo@45: This comes from limited development resources. meillo@61: A larger and more active developer base could quickly remedy this. meillo@45: But \s-1MH\s0 is also limited by design, which is the larger problem. meillo@54: \s-1IMAP\s0, for example, conflicts with \s-1MH\s0's design to a large extent. meillo@61: These design conflicts are not easily solvable meillo@61: and may require a redesign. meillo@61: \s-1IMAP\s0 may be too incompatible with the classic mail model, meillo@61: which \s-1MH\s0 covers, so \s-1MH\s0 may never support it well.
meillo@61: (Using \s-1IMAP\s0 and a filesystem abstraction layer only to map meillo@61: a remote directory into the local filesystem is a different topic. meillo@61: \s-1IMAP\s0 support is understood here as being able to access the special meillo@61: mail features of the protocol.) meillo@22: .PP meillo@61: The other kind of problem relates to human habits. meillo@45: In this world, where almost all \s-1MUA\s0s are monolithic, meillo@61: it is very difficult to convince people to use a toolchest-style \s-1MUA\s0 meillo@22: like \s-1MH\s0. meillo@61: These habits are so strong that even people who understand the concept meillo@61: and advantages of \s-1MH\s0 are reluctant to switch, meillo@30: simply because \s-1MH\s0 is different. meillo@61: Unfortunately, the frontends to \s-1MH\s0, which could provide familiar meillo@61: look and feel, are quite outdated and thus not very appealing in comparison meillo@61: to the modern interfaces of many monolithic \s-1MUA\s0s. meillo@53: One notable exception is \fImh-e\fP which provides an Emacs interface meillo@53: to \s-1MH\s0. meillo@53: \fIMh-e\fP looks much like \fImutt\fP or \fIpine\fP, meillo@53: but it has buttons, menus, and graphical display capabilities. meillo@20: meillo@53: .H 2 "Summary meillo@20: .LP meillo@45: \s-1MH\s0 is an \s-1MUA\s0 that follows the Unix Philosophy in its design. meillo@61: It consists of a toolchest of small tools, each of which does one job well. meillo@31: The toolchest approach offers great flexibility to the user. meillo@45: It is possible to utilize the complete power of the Unix shell with \s-1MH\s0. meillo@61: This makes \s-1MH\s0 a very powerful mail system, meillo@61: and extending and customizing \s-1MH\s0 is easy and encouraged. meillo@31: .PP meillo@31: Apart from the user's perspective, \s-1MH\s0 is development-friendly. meillo@31: Its overall design follows clear rules. meillo@61: The single tools do only one job; thus they are easy to understand, meillo@61: write, and maintain.
meillo@31: They are all independent and do not interfere with the others. meillo@61: Automated testing of their function is a straightforward task. meillo@31: .PP meillo@61: It is sad that \s-1MH\s0's dissimilarity to other \s-1MUA\s0s is its meillo@61: largest problem, as this dissimilarity is also its largest advantage. meillo@61: Unfortunately, most people's habits are stronger meillo@61: than the attraction of the clear design and the power \s-1MH\s0 offers. meillo@0: meillo@8: meillo@8: meillo@48: .H 1 "Case study: uzbl meillo@32: .LP meillo@61: The last chapter focused on the \s-1MUA\s0 \s-1MH\s0, meillo@61: which is an old and established piece of software. meillo@45: This chapter covers uzbl, a fresh new project. meillo@45: Uzbl is a web browser that adheres to the Unix Philosophy. meillo@45: Its name comes from the \fILolspeak\fP word for ``usable''; meillo@61: both are pronounced in the same way. meillo@0: meillo@48: .H 2 "Historical background meillo@0: .LP meillo@32: Uzbl was started by Dieter Plaetinck in April 2009. meillo@61: The idea was born in a thread on the Arch Linux forums. meillo@32: .[ meillo@44: arch linux forums meillo@44: browser meillo@32: .] meillo@61: After some discussion about the failures of well-known web browsers, meillo@61: Plaetinck (alias Dieter@be) came up with a rough proposal meillo@61: of how a better web browser could look. meillo@61: In response to another member who asked if Plaetinck would write this meillo@61: program because it sounded fantastic, Plaetinck replied: meillo@32: ``Maybe, if I find the time ;-)''. meillo@32: .PP meillo@32: Fortunately, he found the time. meillo@32: One day later, the first prototype was out. meillo@61: One week later, uzbl had its own website. meillo@47: .[ meillo@47: uzbl website meillo@47: .]
meillo@61: One month after the initial code was presented, meillo@61: a mailing list was set up to coordinate and discuss further development, meillo@61: and a wiki was added to store documentation and scripts that cropped up meillo@61: on the mailing list and elsewhere. meillo@32: .PP meillo@61: In the first year of uzbl's existence, it was heavily developed on various branches. meillo@32: Plaetinck's task increasingly became merging the best code from the meillo@32: different branches into his main branch, and applying patches. meillo@47: .[ meillo@47: lwn uzbl meillo@47: .] meillo@32: About once a month, Plaetinck released a new version. meillo@32: In September 2009, he presented several forks of uzbl. meillo@47: .[ [ meillo@47: uzbl website meillo@47: .], news archive] meillo@61: Uzbl actually opened the field for a whole family of web browsers with meillo@61: a similar design. meillo@32: .PP meillo@61: In July 2009, \fILinux Weekly News\fP published an interview with meillo@61: Plaetinck about uzbl. meillo@47: .[ meillo@47: lwn uzbl meillo@47: .] meillo@32: In September 2009, the uzbl web browser was on \fISlashdot\fP. meillo@47: .[ meillo@47: slashdot uzbl meillo@47: .] meillo@0: meillo@48: .H 2 "Contrasts to other web browsers meillo@0: .LP meillo@32: Just as most \s-1MUA\s0s are monolithic while \s-1MH\s0 is a toolchest, meillo@32: most web browsers are monolithic while uzbl is a frontend to a toolchest. meillo@32: .PP meillo@32: Today, uzbl is divided into uzbl-core and uzbl-browser. meillo@61: Uzbl-core is, as its name indicates, the core of uzbl. meillo@61: It handles commands and events to interface with other programs, meillo@61: and displays webpages by using \fIwebkit\fP as its rendering engine. meillo@61: Uzbl-browser combines uzbl-core with a selection of handler scripts, meillo@61: a status bar, an event manager, yanking, pasting, page searching, meillo@61: zooming, and much more functionality, to form a ``complete'' web browser.
meillo@61: In the following text, the term ``uzbl'' usually refers to uzbl-browser, meillo@32: so uzbl-core is included. meillo@32: .PP meillo@61: Unlike most other web browsers, uzbl is mainly the mediator between meillo@45: various tools that cover single jobs. meillo@61: Uzbl listens for commands on a named pipe (fifo), a Unix socket, meillo@35: and on stdin, and it writes events to a Unix socket and to stdout. meillo@35: Loading a webpage in a running uzbl instance requires only: meillo@32: .DS meillo@32: echo 'uri http://example.org' >/path/to/uzbl-fifo meillo@32: .DE meillo@61: The rendering of the webpage is done by libwebkit, meillo@61: around which uzbl-core is built. meillo@32: .PP meillo@45: Downloads, browsing history, bookmarks, and the like are not provided meillo@61: by the core itself like they are in other web browsers. meillo@61: Uzbl-browser also only provides ``handler scripts'' which wrap meillo@61: external applications to provide the actual functionality. meillo@32: For instance, \fIwget\fP is used to download files and uzbl-browser meillo@32: includes a script that calls wget with appropriate options in meillo@32: a prepared environment. meillo@32: .PP meillo@61: Modern web browsers are proud to have addons, plugins, modules, meillo@61: and so forth. meillo@32: This is their effort to achieve similar goals. meillo@61: But instead of using existing external programs, modern web browsers meillo@45: include these functions. meillo@0: meillo@48: .H 2 "Discussion of the design meillo@0: .LP meillo@61: This section discusses uzbl in regard to the Unix Philosophy, meillo@32: as identified by Gancarz. meillo@32: meillo@32: .PP meillo@35: .B "Make each program do one thing well" . meillo@35: Uzbl tries to be a web browser and nothing else. meillo@61: The common definition of a web browser is highly influenced by meillo@61: existing implementations of web browsers. meillo@61: But a web browser should be a program to browse the web, and nothing more. 
meillo@61: This is the one thing it should do. meillo@36: .PP meillo@61: Web browsers should not, for instance, manage downloads; meillo@61: this is the job of download managers. meillo@61: A download manager is primarily concerned with downloading files. meillo@35: Modern web browsers provide download management only as a secondary feature. meillo@61: How could they do this job better than programs that exist only for meillo@35: this very job? meillo@61: And why would anyone want less than the best download manager available? meillo@32: .PP meillo@35: A web browser's job is to let the user browse the web. meillo@35: This means navigating through websites by following links. meillo@36: Rendering the \s-1HTML\s0 sources is a different job, too. meillo@61: In uzbl's case, this is covered by the webkit rendering engine. meillo@61: Handling audio and video content, PostScript, \s-1PDF\s0, meillo@61: and other such files is also not the job of a web browser. meillo@61: Such content should be handled by external programs meillo@61: that were written to handle such data. meillo@35: Uzbl strives to do it this way. meillo@36: .PP meillo@61: Remember Doug McIlroy's words: meillo@35: .I meillo@35: ``Write programs that do one thing and do it well. meillo@35: Write programs to work together.'' meillo@35: .R meillo@35: .PP meillo@35: The lesser tenet meillo@35: .B "allow the user to tailor the environment meillo@61: applies here as well. meillo@61: Previously, the question, ``Why would anyone want anything less than the meillo@61: best program for the job?'' was put forward. meillo@61: But as personal preferences matter, it might be more important to ask: meillo@61: ``Why would anyone want something other than his preferred program for meillo@61: the job?'' meillo@36: .PP meillo@61: Users typically want one program for a specific job. meillo@61: Hence, whenever one wishes to download something, meillo@45: the same download manager should be used.
meillo@61: More advanced users might want to use one download manager in a certain meillo@61: situation and another in a different situation; meillo@61: they should be able to configure it this way. meillo@61: With uzbl, any download manager can be used. meillo@61: To switch to a different one, a single line in a small handler script meillo@35: needs to be changed. meillo@61: Alternatively, it would be possible to query which download manager to use by meillo@61: reading a global file or an environment variable in the handler script. meillo@61: Of course, uzbl can use a different handler script as well. meillo@61: This simply requires a one line change in uzbl's configuration file. meillo@36: .PP meillo@61: Uzbl neither has its own download manager nor depends on a specific one; meillo@61: hence, uzbl's browsing abilities will not be crippled by having meillo@35: a bad download manager. meillo@61: Uzbl's download capabilities will be as good as the best meillo@36: download manager available on the system. meillo@38: Of course, this applies to all of the other supplementary tools, too. meillo@32: meillo@32: .PP meillo@36: .B "Use software leverage to your advantage" . meillo@36: Uzbl is designed to be extended by external tools. meillo@36: These external tools are usually wrapped by small handler shell scripts. meillo@61: Shell scripts form the basis for the glue which holds the various meillo@61: parts together. meillo@36: .PP meillo@45: The history mechanism of uzbl shall be presented as an example. meillo@36: Uzbl is configured to spawn a script to append an entry to the history meillo@36: whenever the event of a fully loaded page occurs. 
meillo@45: The script to append the entry to the history is not much more than: meillo@36: .DS meillo@36: #!/bin/sh meillo@36: file=/path/to/uzbl-history meillo@36: echo `date +'%Y-%m-%d %H:%M:%S'`" $6 $7" >> $file meillo@36: .DE meillo@61: \f(CW$6\fP and \f(CW$7\fP expand to the \s-1URL\s0 and the page title, meillo@61: respectively. meillo@45: .PP meillo@45: For loading an entry, a key is bound to spawn a load-from-history script. meillo@36: The script reverses the history to have newer entries first, meillo@61: displays \fIdmenu\fP to let the user select an item, meillo@61: and then writes the selected \s-1URL\s0 into uzbl's command input pipe. meillo@45: With error checking and corner case handling removed, meillo@45: the script looks like this: meillo@36: .DS meillo@36: #!/bin/sh meillo@36: file=/path/to/uzbl-history meillo@36: goto=`tac $file | dmenu | cut -d' ' -f 3` meillo@36: echo "uri $goto" > $4 meillo@36: .DE meillo@36: \f(CW$4\fP expands to the path of the command input pipe of the current meillo@36: uzbl instance. meillo@32: meillo@32: .PP meillo@33: .B "Avoid captive user interfaces" . meillo@61: One could say that uzbl, to a large extent, actually \fIis\fP meillo@36: a captive user interface. meillo@61: But the difference from other web browsers is that uzbl is only meillo@45: the captive user interface frontend (and the core of the backend). meillo@38: Many parts of the backend are independent of uzbl. meillo@61: For some external programs, handler scripts are distributed with uzbl; meillo@61: but arbitrary additional functionality can always be added if desired. meillo@37: .PP meillo@37: The frontend is captive \(en that is true. meillo@37: This is okay for the task of browsing the web, as this task is only relevant meillo@61: to humans. meillo@61: Automated programs would \fIcrawl\fP the web, that means, meillo@61: read the source directly, including all semantics. 
meillo@61: The graphical representation is just for humans to understand the semantics meillo@37: more intuitively. meillo@32: meillo@32: .PP meillo@33: .B "Make every program a filter" . meillo@37: Graphical web browsers are almost dead ends in the chain of information flow. meillo@37: Thus, it is difficult to see what graphical web browsers should filter. meillo@61: Graphical web browsers exist almost exclusively to be used interactively meillo@61: by humans. meillo@61: The only case in which one might want to automate the rendering function is meillo@37: to generate images of rendered webpages. meillo@37: meillo@37: .PP meillo@37: .B "Small is beautiful" meillo@61: is not easy to apply to a web browser because modern web technology meillo@61: is very complex; hence, the rendering task is very complex. meillo@61: Unfortunately, modern web browsers ``have'' to consist of many thousands of meillo@61: lines of code. meillo@61: Using the toolchest approach and wrappers can help to split the browser meillo@61: into several small parts, though. meillo@37: .PP meillo@45: As of March 2010, uzbl-core consists of about 3\,500 lines of C code. meillo@37: The distribution includes another 3\,500 lines of Shell and Python code, meillo@61: which are the handler scripts and plugins, such as one that provides a modal meillo@61: interface. meillo@61: Furthermore, uzbl makes use of external tools like meillo@54: \fIwget\fP and \fIsocat\fP. meillo@37: Up to this point, uzbl looks pretty neat and small. meillo@61: The ugly part of uzbl is the rendering engine, webkit. meillo@37: Webkit consists of roughly 400\,000 (!) lines of code. meillo@61: Unfortunately, small rendering engines are not feasible anymore meillo@61: due to the nature of the modern web. meillo@35: meillo@35: .PP meillo@35: .B "Build a prototype as soon as possible" . meillo@61: Plaetinck made his code public right from the beginning.
meillo@61: Discussion and development were, and still are, open to everyone interested, meillo@61: and development versions of uzbl can be obtained very easily from the meillo@61: code repository. meillo@38: Within the first year of uzbl's existence, a new version was released meillo@35: more often than once a month. meillo@61: Different forks and branches arose, introducing new features which were meillo@61: then considered for merging into the main branch. meillo@61: The experiences with using prototypes influenced further development. meillo@35: Actually, all development was community driven. meillo@38: Three months after uzbl's birth, Plaetinck said: meillo@35: ``Right now I hardly code anything myself for Uzbl. meillo@35: I just merge in other people's code, ponder a lot, and lead the discussions.'' meillo@35: .[ meillo@44: lwn meillo@44: uzbl meillo@35: .] meillo@32: meillo@0: meillo@48: .H 2 "Problems meillo@0: .LP meillo@61: Similar to \s-1MH\s0, uzbl suffers from being different. meillo@38: It is sad, but people use what they know. meillo@61: Fortunately, uzbl's user interface can be made to look and feel very similar meillo@61: to that of the well-known web browsers, meillo@38: hiding the internal differences. meillo@38: But uzbl has to provide this similar look and feel to be accepted meillo@38: as a ``normal'' browser by ``normal'' users. meillo@37: .PP meillo@61: The more important problem here is the modern web. meillo@38: The modern web is simply broken. meillo@61: It has state in a stateless protocol, misuses technologies, meillo@61: and is hopelessly overloaded. meillo@61: This results in rendering engines that ``must'' consist meillo@61: of hundreds of thousands of lines of code. meillo@61: They also must combine and integrate many different technologies meillo@61: to make our modern web accessible. meillo@61: This results, however, in a failing attempt to provide good usability.
meillo@61: Website-to-image converters are almost impossible to run without meillo@38: human interaction because of state in sessions, impossible meillo@61: deep-linking, and ``unautomatable'' technologies. meillo@37: .PP meillo@61: The web was misused in an attempt to fulfill all kinds of wishes. meillo@61: Now web browsers, and ultimately users, suffer from it. meillo@37: meillo@8: meillo@51: .H 2 "Summary meillo@32: .LP meillo@38: ``Uzbl is a browser that adheres to the Unix Philosophy'' meillo@61: is how uzbl is seen by its authors. meillo@38: Indeed, uzbl follows the Unix Philosophy in many ways. meillo@38: It consists of independent parts that work together, meillo@45: while its core is mainly a mediator which glues the parts together. meillo@38: .PP meillo@61: Software leverage is put to excellent use. meillo@61: External tools are used, and independent tasks are separated out meillo@61: into independent parts and glued together with small handler scripts. meillo@38: .PP meillo@61: Since uzbl roughly consists of a set of tools and a bit of glue, meillo@61: anyone can put the parts together and expand it in any desired way. meillo@61: Flexibility and customization are properties that make it valuable meillo@61: for advanced users, but may keep novice users from understanding meillo@61: and using it. meillo@38: .PP meillo@61: But uzbl's main problem is the modern web, which makes it very difficult meillo@38: to design a sane web browser. meillo@38: Despite this bad situation, uzbl does a fairly good job. meillo@32: meillo@8: meillo@48: .H 1 "Final thoughts meillo@0: meillo@0: .LP meillo@50: This paper explained why good design is important. meillo@61: It introduced the Unix Philosophy as a set of guidelines that encourage meillo@61: good design in order to create good quality software. meillo@61: Then, real world software that was designed with the Unix Philosophy meillo@61: in mind was discussed.
meillo@50: .PP meillo@61: Throughout this paper, the aim was to explain \fIwhy\fP something meillo@50: should be done the Unix way. meillo@61: Reasons were given to substantiate the claim that the Unix Philosophy meillo@61: is a preferable way of designing software. meillo@50: .PP meillo@50: The Unix Philosophy is close to the software developer's point of view. meillo@61: Its main goal is taming the beast known as ``software complexity''. meillo@61: Hence, it strives first and foremost for simplicity of software. meillo@61: It might appear that usability for humans is a minor goal, meillo@61: but actually, the Unix Philosophy sees usability as a result of sound design. meillo@61: Sound design does not need to be ultimately intuitive, meillo@50: but it will provide a consistent way to access the enormous power meillo@50: of software leverage. meillo@50: .PP meillo@61: Being able to solve some specific concrete problem becomes less and less meillo@61: important as there is software available for nearly every possible task meillo@61: today. meillo@50: But the quality of software matters. meillo@50: It is important that we have \fIgood\fP software. meillo@50: .sp meillo@0: .LP meillo@50: .B "But why the Unix Philosophy? meillo@50: .PP meillo@50: The largest problem of software development is the complexity involved. meillo@50: It is the only part of the job that computers cannot take over. meillo@61: The Unix Philosophy fights complexity because complexity is the main enemy. meillo@50: .PP meillo@50: On the other hand, meillo@61: the unique advantage of software is its ability to leverage. meillo@50: Current software still fails to make the best possible use of this ability. meillo@61: The Unix Philosophy concentrates on exploiting this great opportunity.
meillo@0: meillo@47: meillo@47: .bp meillo@47: .TL meillo@47: References meillo@47: .LP meillo@47: .XS meillo@47: .sp .5v meillo@47: .B meillo@47: References meillo@47: .XE meillo@47: .ev r meillo@42: .nr PS -1 meillo@42: .nr VS -1 meillo@0: .[ meillo@0: $LIST$ meillo@0: .] meillo@47: .nr PS +1 meillo@47: .nr VS +1 meillo@47: .ev meillo@47: meillo@42: .bp meillo@47: .TL meillo@47: Table of Contents meillo@47: .LP meillo@47: .PX no