meillo@49: .so style
meillo@42: 
meillo@0: .TL
meillo@42: .ps +4
meillo@6: Why the Unix Philosophy still matters
meillo@0: .AU
meillo@0: markus schnalke <meillo@marmaro.de>
meillo@0: .AB
meillo@1: .ti \n(.iu
meillo@39: This paper explains the importance of the Unix Philosophy for software design.
meillo@0: Today, few software designers are aware of these concepts,
meillo@39: and thus a lot of modern software is more limited than necessary
meillo@39: and makes less use of software leverage than possible.
meillo@38: Knowing and following the guidelines of the Unix Philosophy makes software more valuable.
meillo@0: .AE
meillo@0: 
meillo@2: .FS
meillo@2: .ps -1
meillo@39: This paper was prepared for the ``Software Analysis'' seminar at the University of Ulm.
meillo@47: The mentor was Professor Franz Schweiggert.
meillo@55: Handed in on 2010-04-16.
meillo@39: You may retrieve this document from
meillo@39: .CW \s-1http://marmaro.de/docs \ .
meillo@2: .FE
meillo@2: 
meillo@48: .H 1 Introduction
meillo@0: .LP
meillo@40: The Unix Philosophy is the essence of how the Unix operating system,
meillo@40: especially its toolchest, was designed.
meillo@57: It is not a limited set of fixed rules,
meillo@40: but a loose set of guidelines which tell how to write software that
meillo@57: suits Unix well.
meillo@57: Actually, the Unix Philosophy describes what is common in typical Unix software.
meillo@40: Wikipedia has an accurate definition:
meillo@40: .[
meillo@44: wikipedia
meillo@44: unix philosophy
meillo@40: .]
meillo@40: .QP
meillo@40: The \fIUnix philosophy\fP is a set of cultural norms and philosophical
meillo@40: approaches to developing software based on the experience of leading
meillo@40: developers of the Unix operating system.
meillo@1: .PP
meillo@40: As there is no single definition of the Unix Philosophy,
meillo@40: several people have stated their view on what it comprises.
meillo@1: Best known are:
meillo@1: .IP \(bu
meillo@1: Doug McIlroy's summary: ``Write programs that do one thing and do it well.''
meillo@1: .[
meillo@44: mahoney
meillo@44: oral history
meillo@1: .]
meillo@1: .IP \(bu
meillo@1: Mike Gancarz' book ``The UNIX Philosophy''.
meillo@1: .[
meillo@44: gancarz
meillo@44: unix philosophy
meillo@1: .]
meillo@1: .IP \(bu
meillo@1: Eric S. Raymond's book ``The Art of UNIX Programming''.
meillo@1: .[
meillo@44: raymond
meillo@44: art of unix programming
meillo@1: .]
meillo@0: .LP
meillo@1: These different views on the Unix Philosophy have much in common.
meillo@40: In particular, the main concepts are similar in all of them.
meillo@40: McIlroy's definition can surely be called the core of the Unix Philosophy,
meillo@57: but the fundamental idea behind it all is ``small is beautiful''.
meillo@40: 
meillo@40: .PP
meillo@45: The Unix Philosophy explains how to design good software for Unix.
meillo@57: Many concepts described here are based on Unix facilities.
meillo@40: Other operating systems may not offer such facilities,
meillo@57: hence it may not be possible to design software for such systems
meillo@57: according to the Unix Philosophy.
meillo@40: .PP
meillo@57: The Unix Philosophy has an idea of what the process of software development
meillo@41: should look like, but large parts of the philosophy are quite independent
meillo@45: from a concrete development process.
meillo@41: However, one will soon recognize that some development processes work well
meillo@41: with the ideas of the Unix Philosophy and support them, while others are
meillo@41: at cross-purposes.
meillo@45: Kent Beck's books about Extreme Programming are valuable supplemental
meillo@45: resources on this topic.
meillo@1: .PP
meillo@57: The questions of how to actually write code and how the code should look
meillo@57: in detail are beyond the scope of this paper.
meillo@57: Kernighan and Pike's book ``The Practice of Programming''
meillo@41: .[
meillo@44: kernighan pike
meillo@44: practice of programming
meillo@41: .]
meillo@57: covers this topic.
meillo@57: Its point of view corresponds to the one espoused in this paper.
meillo@0: 
meillo@48: .H 1 "Importance of software design in general
meillo@0: .LP
meillo@57: Software design consists of planning how the internal structure
meillo@57: and external interfaces of software should look.
meillo@39: It has nothing to do with visual appearance.
meillo@57: If we were to compare a program to a car, then its color would not matter.
meillo@39: Its design would be the car's size, its shape, the locations of doors,
meillo@45: the passenger/space ratio, the available controls and instruments,
meillo@45: and so forth.
meillo@39: .PP
meillo@57: Why should software be designed at all?
meillo@57: It is generally accepted
meillo@57: that even a bad plan is better than no plan.
meillo@57: Not designing software means programming without a plan.
meillo@57: This will surely lead to horrible results:
meillo@57: software that is horrible to use and horrible to maintain.
meillo@39: These two aspects are the visible ones.
meillo@45: Often invisible though, are the wasted possible gains.
meillo@39: Good software design can make these gains available.
meillo@2: .PP
meillo@57: The design of software deals with qualitative properties.
meillo@39: Good design leads to good quality, and quality is important.
meillo@57: Any car may be able to drive from point A to point B,
meillo@57: but it depends on the qualitative decisions made in the design of the vehicle,
meillo@57: whether it is a good choice for passenger transport or not,
meillo@57: whether it is a good choice for a rough mountain area,
meillo@57: and whether the ride will be fun.
meillo@39: 
meillo@2: .PP
meillo@57: Requirements for a piece of software are twofold:
meillo@39: functional and non-functional.
meillo@39: .IP \(bu
meillo@57: Functional requirements directly define the software's functions.
meillo@39: They are the reason why software gets written.
meillo@39: Someone has a problem and needs a tool to solve it.
meillo@39: Being able to solve the problem is the main functional goal.
meillo@57: This is the driving force behind all programming effort.
meillo@39: Functional requirements are easier to define and to verify.
meillo@39: .IP \(bu
meillo@45: Non-functional requirements are called \fIquality\fP requirements, too.
meillo@57: The quality of software shows through the properties that are not directly
meillo@57: related to the software's basic functions.
meillo@45: Tools of bad quality often do solve the problems they were written for,
meillo@57: but introduce problems and difficulties for usage and development later on.
meillo@57: Qualitative aspects are often overlooked at first sight,
meillo@45: and are often difficult to define clearly and to verify.
meillo@2: .PP
meillo@54: Quality is hardly interesting when software gets built initially,
meillo@57: but it has a high impact on usability and maintenance of the software later.
meillo@57: A short-sighted person might see the process of developing software as
meillo@57: one mainly concerned with building something up.
meillo@57: But experience shows that building software the first time is
meillo@57: only a small portion of the overall work involved.
meillo@45: Bug fixing, extending, rebuilding of parts \(en maintenance work \(en
meillo@57: soon take a large part of the time spent on a software project.
meillo@45: And then, of course, there is the time spent actually using the software.
meillo@6: These processes are highly influenced by the software's quality.
meillo@39: Thus, quality must not be neglected.
meillo@45: However, the problem with quality is that you hardly ``stumble over''
meillo@39: bad quality during the first build,
meillo@45: although this is the time when you should care about good quality most.
meillo@6: .PP
meillo@54: Software design has little to do with the basic function of software \(en
meillo@39: this requirement will get satisfied anyway.
meillo@57: Software design is more about quality aspects.
meillo@39: Good design leads to good quality, bad design to bad quality.
meillo@54: The primary functions of software will be affected modestly by bad quality,
meillo@57: but good quality can provide a lot of additional benefits,
meillo@57: even in places where one never expected them.
meillo@6: .PP
meillo@45: The ISO/IEC\|9126 standard, part\|1,
meillo@6: .[
meillo@44: iso product quality
meillo@6: .]
meillo@57: defines the quality model as consisting of:
meillo@6: .IP \(bu
meillo@6: .I Functionality
meillo@6: (suitability, accuracy, inter\%operability, security)
meillo@6: .IP \(bu
meillo@6: .I Reliability
meillo@6: (maturity, fault tolerance, recoverability)
meillo@6: .IP \(bu
meillo@6: .I Usability
meillo@6: (understandability, learnability, operability, attractiveness)
meillo@6: .IP \(bu
meillo@6: .I Efficiency
meillo@9: (time behavior, resource utilization)
meillo@6: .IP \(bu
meillo@6: .I Maintainability
meillo@23: (analyzability, changeability, stability, testability)
meillo@6: .IP \(bu
meillo@6: .I Portability
meillo@6: (adaptability, installability, co-existence, replaceability)
meillo@6: .LP
meillo@57: Good design can improve these properties in software;
meillo@57: poorly designed software likely suffers in these areas.
meillo@7: .PP
meillo@7: One further goal of software design is consistency.
meillo@57: Consistency eases understanding, using, and working on things.
meillo@57: Consistent internal structure and consistent external interfaces
meillo@39: can be provided by good design.
meillo@7: .PP
meillo@39: Software should be well designed because good design avoids many
meillo@57: problems during its lifetime,
meillo@57: and because good design can offer much additional gain.
meillo@57: Indeed, much effort should be spent on good design to make software more valuable.
meillo@57: The Unix Philosophy provides a way to design software well.
meillo@7: It offers guidelines to achieve good quality and high gain for the effort spent.
meillo@0: 
meillo@0: 
meillo@48: .H 1 "The Unix Philosophy
meillo@4: .LP
meillo@61: The origins of the Unix Philosophy have already been introduced.
meillo@8: This chapter explains the philosophy, following Gancarz,
meillo@55: .[
meillo@55: gancarz
meillo@55: unix philosophy
meillo@55: .]
meillo@8: and shows concrete examples of its application.
meillo@5: 
meillo@48: .H 2 Pipes
meillo@4: .LP
meillo@61: The following examples demonstrate how the Unix Philosophy is applied.
meillo@4: Knowledge of using the Unix shell is assumed.
meillo@4: .PP
meillo@4: Counting the number of files in the current directory:
meillo@41: .DS
meillo@4: ls | wc -l
meillo@4: .DE
meillo@4: The
meillo@4: .CW ls
meillo@4: command lists all files in the current directory, one per line,
meillo@4: and
meillo@4: .CW "wc -l
meillo@8: counts the number of lines.
meillo@4: .PP
meillo@8: Counting the number of files that do not contain ``foo'' in their name:
meillo@41: .DS
meillo@4: ls | grep -v foo | wc -l
meillo@4: .DE
meillo@4: Here, the list of files is filtered by
meillo@4: .CW grep
meillo@45: to remove all lines that contain ``foo''.
meillo@45: The rest equals the previous example.
meillo@4: .PP
meillo@61: Finding the five largest entries in the current directory:
meillo@41: .DS
meillo@4: du -s * | sort -nr | sed 5q
meillo@4: .DE
meillo@4: .CW "du -s *
meillo@45: returns the recursively summed sizes of all files in the current directory
meillo@8: \(en no matter if they are regular files or directories.
meillo@4: .CW "sort -nr
meillo@45: sorts the list numerically in reverse order (descending).
meillo@4: Finally,
meillo@4: .CW "sed 5q
meillo@4: quits after it has printed the fifth line.
meillo@4: .PP
meillo@4: The presented command lines are examples of what Unix people would use
meillo@4: to get the desired output.
meillo@61: There are other ways to get the same output;
meillo@61: it is the user's decision which way to go.
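.PP
As a hedged illustration of such alternatives, the last stage of the third
pipeline can be exchanged for \f(CWhead\fP without changing the result.
The function \f(CWsample\fP and its data are made up for this sketch;
they merely stand in for the output of \f(CWdu -s *\fP.

```shell
#!/bin/sh
# Hypothetical sample data standing in for the output of `du -s *`:
# a size, a tab, and a name on each line.
sample() {
    printf '40\tdocs\n8\tnotes.txt\n120\tsrc\n4\tREADME\n64\tbuild\n16\ttests\n'
}

# Two interchangeable ways to keep the five largest entries.
with_sed=$(sample | sort -nr | sed 5q)
with_head=$(sample | sort -nr | head -n 5)

# Both variables now hold the same five lines, largest entry first.
printf '%s\n' "$with_sed"
```

Which variant to prefer is a matter of taste;
both stop reading after the fifth line.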
meillo@14: .PP
meillo@8: The examples show that many tasks on a Unix system
meillo@4: are accomplished by combining several small programs.
meillo@61: The connection between the programs is denoted by the pipe operator `|'.
meillo@4: .PP
meillo@4: Pipes, and their extensive and easy use, are one of the great
meillo@4: achievements of the Unix system.
meillo@61: Pipes were possible in earlier operating systems,
meillo@61: but never before have they been such a central part of the concept.
meillo@61: In the early seventies when Doug McIlroy introduced pipes into the
meillo@4: Unix system,
meillo@4: ``it was this concept and notation for linking several programs together
meillo@4: that transformed Unix from a basic file-sharing system to an entirely new way of computing.''
meillo@4: .[
meillo@44: aughenbaugh
meillo@44: unix oral history
meillo@45: .]
meillo@4: .PP
meillo@4: Being able to specify pipelines in an easy way is,
meillo@61: however, not enough by itself;
meillo@61: it is only one half.
meillo@4: The other is the design of the programs that are used in the pipeline.
meillo@61: They need interfaces that allow them to be used in this way.
meillo@5: 
meillo@48: .H 2 "Interface design
meillo@5: .LP
meillo@61: Unix is, first of all, simple \(en everything is a file.
meillo@5: Files are sequences of bytes, without any special structure.
meillo@45: Programs should be filters, which read a stream of bytes from standard input (stdin)
meillo@45: and write a stream of bytes to standard output (stdout).
meillo@8: If the files \fIare\fP sequences of bytes,
meillo@8: and the programs \fIare\fP filters on byte streams,
meillo@45: then there is exactly one data interface.
meillo@45: Hence it is possible to combine programs in any desired way.
meillo@5: .PP
meillo@45: Even a handful of small programs yields a large set of combinations,
meillo@5: and thus a large set of different functions.
meillo@5: This is leverage!
meillo@5: If the programs are orthogonal to each other \(en the best case \(en
meillo@5: then the set of different functions is greatest.
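.PP
The following sketch tries to make this leverage tangible:
a handful of standard filters is recombined into three different functions.
The function \f(CWwords\fP and its input are assumptions made for the example only.

```shell
#!/bin/sh
# Hypothetical input: one word per line.
words() { printf '%s\n' pipe file pipe shell file pipe; }

# Function 1: how many distinct words are there?
distinct=$(words | sort -u | wc -l | tr -d ' ')

# Function 2: which word occurs most often?
top=$(words | sort | uniq -c | sort -nr | sed 1q | awk '{ print $2 }')

# Function 3: how many lines do not contain "pipe"?
others=$(words | grep -v pipe | wc -l | tr -d ' ')

echo "$distinct $top $others"
```

None of the programs involved was written with these particular questions in mind;
the functions arise only from the combination.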
meillo@5: .PP
meillo@61: Programs can also have a separate control interface
meillo@61: in addition to their data interface.
meillo@61: The control interface is often called the ``user interface'',
meillo@11: because it is usually designed to be used by humans.
meillo@61: The Unix Philosophy discourages the assumption that the user will be human.
meillo@11: Interactive use of software is slow use of software,
meillo@11: because the program waits for user input most of the time.
meillo@61: Interactive software also requires the user to be in front of the computer,
meillo@61: occupying his attention during usage.
meillo@11: .PP
meillo@61: Now, back to the idea of combining several small programs
meillo@61: to perform a more specific function:
meillo@61: If these single tools were all interactive,
meillo@11: how would the user control them?
meillo@61: It is not only a problem to control several programs at once
meillo@61: if they run at the same time;
meillo@61: it is also very inefficient to have to control each program
meillo@61: when they are intended to act in concert.
meillo@61: Hence, the Unix Philosophy discourages designing programs which demand
meillo@61: interactive use.
meillo@11: The behavior of programs should be defined at invocation.
meillo@45: This is done by specifying arguments to the program call
meillo@45: (command line switches).
meillo@61: Gancarz discusses this topic as ``avoid[ing] captive user interfaces''.
meillo@46: .[ [
meillo@44: gancarz unix philosophy
meillo@46: .], page 88 ff.]
meillo@11: .PP
meillo@61: Non-interactive use is also an advantage for testing during development.
meillo@61: Testing interactive programs is much more complicated
meillo@61: than testing non-interactive counterparts.
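.PP
A minimal sketch of such a test, assuming a made-up filter named \f(CWno_foo\fP:
fixed input is fed to the filter and its output is compared with the expected text.
No human needs to be present while the test runs.

```shell
#!/bin/sh
# Hypothetical filter under test: drops every line that contains "foo".
no_foo() { grep -v foo; }

# Fixed input and the output expected for it.
input=$(printf '%s\n' foo.c bar.c foobar.h main.c)
expected=$(printf '%s\n' bar.c main.c)

# Run the filter non-interactively and compare the result.
actual=$(printf '%s\n' "$input" | no_foo)
if [ "$actual" = "$expected" ]; then
    result=pass
else
    result=fail
fi
echo "$result"
```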
meillo@5: 
meillo@48: .H 2 "The toolchest approach
meillo@5: .LP
meillo@5: A toolchest is a set of tools.
meillo@61: Instead of one big tool for all tasks, there are many small tools,
meillo@5: each for one task.
meillo@61: Difficult tasks are solved by combining several small, simple tools.
meillo@5: .PP
meillo@11: The Unix toolchest \fIis\fP a set of small, (mostly) non-interactive programs
meillo@11: that are filters on byte streams.
meillo@54: They are, to a large extent, unrelated in their function.
meillo@11: Hence, the Unix toolchest provides a large set of functions
meillo@11: that can be accessed by combining the programs in the desired way.
meillo@11: .PP
meillo@61: The act of software development benefits from small toolchest programs, too.
meillo@61: Writing small programs is generally easier and less error-prone
meillo@61: than writing large programs.
meillo@61: Hence, writing a large set of small programs is still easier and
meillo@61: less error-prone than writing one large program with all the
meillo@61: functionality included.
meillo@61: If the small programs are combinable, then they offer an even larger set
meillo@61: of functions than the single monolithic program.
meillo@45: Hence, one gets two advantages out of writing small, combinable programs:
meillo@45: They are easier to write and they offer a greater set of functions through
meillo@45: combination.
meillo@5: .PP
meillo@61: There are also two main drawbacks of the toolchest approach.
meillo@45: First, one simple, standardized interface has to be sufficient.
meillo@5: If one feels the need for more ``logic'' than a stream of bytes,
meillo@61: then a different approach might be required.
meillo@61: Also, a design where a stream of bytes is sufficient
meillo@61: might not be conceivable.
meillo@8: By becoming more familiar with the ``Unix style of thinking'',
meillo@8: developers will more often and more easily find simple designs where
meillo@8: a stream of bytes is a sufficient interface.
meillo@8: .PP
meillo@61: The second drawback of the toolchest approach concerns the users.
meillo@61: A toolchest is often more difficult to use because
meillo@61: it is necessary to become familiar with each tool and
meillo@61: be able to choose and use the right one in any given situation.
meillo@61: Additionally, one needs to know how to combine the tools in a sensible way.
meillo@61: The issue is similar to having a sharp knife \(en
meillo@61: it is a powerful tool in the hand of a master,
meillo@61: but of no value in the hand of an unskilled person.
meillo@61: However, learning single, small tools of a toolchest is often easier than
meillo@45: learning a complex tool.
meillo@61: The user will already have a basic understanding of an as yet unknown tool
meillo@45: if the tools of a toolchest have a common, consistent style.
meillo@61: He will be able to transfer knowledge of one tool to another.
meillo@5: .PP
meillo@61: This second drawback can be removed to a large extent
meillo@45: by adding wrappers around the basic tools.
meillo@61: Novice users do not need to learn several tools if a professional wraps
meillo@45: complete command lines into a higher-level script.
meillo@5: Note that the wrapper script still calls the small tools;
meillo@45: it is just like a skin around them.
meillo@61: No complexity is added this way,
meillo@61: but new programs can be created out of existing ones with very little effort.
meillo@5: .PP
meillo@5: A wrapper script for finding the five largest entries in the current directory
meillo@61: might look like this:
meillo@41: .DS
meillo@5: #!/bin/sh
meillo@5: du -s * | sort -nr | sed 5q
meillo@5: .DE
meillo@61: The script itself is just a text file that calls the commands
meillo@61: that a professional user would type in directly.
meillo@61: It is probably beneficial to make the program flexible in regard to
meillo@61: the number of entries it prints:
meillo@41: .DS
meillo@8: #!/bin/sh
meillo@8: num=5
meillo@8: [ $# -eq 1 ] && num="$1"
meillo@8: du -s * | sort -nr | sed "${num}q"
meillo@8: .DE
meillo@61: This script acts like the one before when called without an argument,
meillo@61: but the user can also specify a numerical argument to define the number
meillo@61: of lines to print.
meillo@61: One can surely imagine even more flexible versions;
meillo@61: however, they will still rely on the external programs
meillo@61: which actually do the work.
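.PP
One such version might, for instance, accept the directory as a second argument.
The following is only a sketch under assumptions:
the name \f(CWlargest\fP and the argument order are invented for illustration,
and the wrapper is written as a shell function so that it can be tried out directly.

```shell
#!/bin/sh
# largest: hypothetical wrapper. Usage: largest [num [dir]]
# The body runs in a subshell, so the caller's working directory
# and variables stay untouched.
largest() (
    num=${1:-5}     # number of entries to print, default 5
    dir=${2:-.}     # directory to inspect, default current directory
    cd "$dir" || exit 1
    # The real work is still done by the small tools.
    du -s * | sort -nr | sed "${num}q"
)
```

Calling \f(CWlargest 3 /tmp\fP would then print the three largest entries below
\f(CW/tmp\fP.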
meillo@5: 
meillo@48: .H 2 "A powerful shell
meillo@8: .LP
meillo@61: The Unix shell provides the ability to combine small programs into large ones.
meillo@61: But a powerful shell is a great feature in other ways, too;
meillo@61: for instance, by being scriptable.
meillo@61: Control statements are built into the shell
meillo@61: and the functions are the normal programs of the system.
meillo@61: As the programs are already known,
meillo@45: learning to program in the shell becomes easy.
meillo@8: Using normal programs as functions in the shell programming language
meillo@10: is only possible because they are small and combinable tools in a toolchest style.
meillo@8: .PP
meillo@61: The Unix shell encourages writing small scripts
meillo@61: by combining existing programs, because it is so easy to do.
meillo@8: This is a great step towards automation.
meillo@8: It is wonderful if the effort to automate a task equals the effort
meillo@45: to do the task a second time by hand.
meillo@45: If this holds,
meillo@45: then the user will be happy to automate everything he does more than once.
meillo@8: .PP
meillo@8: Small programs that do one job well, standardized interfaces between them,
meillo@61: a mechanism to combine parts into larger parts, and an easy way to automate tasks
meillo@61: will inevitably produce software leverage,
meillo@61: achieving multiple times the benefit of the initial investment.
meillo@10: .PP
meillo@10: The shell also encourages rapid prototyping.
meillo@10: Many well-known programs started as quickly hacked shell scripts
meillo@61: and were later turned into ``real'' programs written in C.
meillo@61: Building a prototype first is a way to avoid the biggest problems
meillo@10: in application development.
meillo@45: Fred Brooks explains in ``No Silver Bullet'':
meillo@10: .[
meillo@44: brooks
meillo@44: no silver bullet
meillo@10: .]
meillo@10: .QP
meillo@10: The hardest single part of building a software system is deciding precisely what to build.
meillo@10: No other part of the conceptual work is so difficult as establishing the detailed
meillo@10: technical requirements, [...].
meillo@10: No other part of the work so cripples the resulting system if done wrong.
meillo@10: No other part is more difficult to rectify later.
meillo@10: .PP
meillo@45: Writing a prototype is a great method for becoming familiar with the requirements
meillo@45: and for running into real problems early.
meillo@47: .[ [
meillo@47: gancarz
meillo@47: unix philosophy
meillo@47: .], page 28 f.]
meillo@45: .PP
meillo@54: Prototyping is often seen as a first step in building software.
meillo@10: This is, of course, good.
meillo@10: However, the Unix Philosophy has an \fIadditional\fP perspective on prototyping:
meillo@61: After having built the prototype, one might notice that the prototype is already
meillo@10: \fIgood enough\fP.
meillo@61: Hence, a reimplementation in a more sophisticated programming language
meillo@45: might not be needed, at least for the moment.
meillo@23: Maybe later, it might be necessary to rewrite the software, but not now.
meillo@45: By delaying further work, one keeps the flexibility to react to
meillo@10: changing requirements.
meillo@10: Software parts that are not written will not miss the requirements.
meillo@61: Well known is Gordon Bell's classic saying:
meillo@61: ``The cheapest, fastest, and most reliable components are those
meillo@61: that aren't there.''
meillo@61: .\" FIXME: ref?
meillo@10: 
meillo@48: .H 2 "Worse is better
meillo@10: .LP
meillo@45: The Unix Philosophy aims for the 90% solution;
meillo@10: others call it the ``Worse is better'' approach.
meillo@47: Experience from real life projects shows:
meillo@10: .PP
meillo@61: (1) It is almost impossible to define the
meillo@10: requirements completely and correctly the first time.
meillo@45: Hence one should not try to; one will fail anyway.
meillo@45: .PP
meillo@45: (2) Requirements change over time.
meillo@10: Hence it is best to delay requirement-based design decisions as long as possible.
meillo@61: Software should be small and flexible as long as possible in order
meillo@61: to react to changing requirements.
meillo@61: Shell scripts, for example, are more easily adjusted than C programs.
meillo@45: .PP
meillo@45: (3) Maintenance work is hard work.
meillo@45: Hence, one should keep the amount of code as small as possible;
meillo@61: it should only fulfill the \fIcurrent\fP requirements.
meillo@61: Software parts that will be written in the future
meillo@61: do not need maintenance until that time.
meillo@10: .PP
meillo@47: See Brooks' ``The Mythical Man-Month'' for reference.
meillo@47: .[ [
meillo@47: brooks
meillo@47: mythical man-month
meillo@47: .], page 115 ff.]
meillo@47: .PP
meillo@10: Starting with a prototype in a scripting language has several advantages:
meillo@10: .IP \(bu
meillo@10: As the initial effort is low, one will likely start right away.
meillo@10: .IP \(bu
meillo@61: Real requirements can be identified quickly since working parts are
meillo@61: available sooner.
meillo@10: .IP \(bu
meillo@54: When software is usable and valuable, it gets used, and thus tested.
meillo@61: This ensures that problems will be found in the early stages of development.
meillo@10: .IP \(bu
meillo@61: The prototype might be enough for the moment;
meillo@61: thus, further work can be delayed until a time
meillo@61: when one knows about the requirements and problems more thoroughly.
meillo@10: .IP \(bu
meillo@61: Implementing only the parts that are actually needed at the moment
meillo@61: introduces less programming and maintenance work.
meillo@10: .IP \(bu
meillo@61: If the situation changes such that the software is not needed anymore,
meillo@61: then less effort was spent on the project than it would have been
meillo@61: if a different approach had been taken.
meillo@10: 
meillo@48: .H 2 "Upgrowth and survival of software
meillo@11: .LP
meillo@61: So far, \fIwriting\fP or \fIbuilding\fP software has been discussed.
meillo@61: Although ``writing'' and ``building'' are just verbs,
meillo@61: they do imply a specific view on the work process they describe.
meillo@61: A better verb would be \fI``grow''\fP.
meillo@12: Creating software in the sense of the Unix Philosophy is an incremental process.
meillo@61: It starts with an initial prototype, which evolves as requirements change.
meillo@12: A quickly hacked shell script might become a large, sophisticated,
meillo@13: compiled program this way.
meillo@13: Its lifetime begins with the initial prototype and ends when the software is not used anymore.
meillo@61: While alive, it will be extended, rearranged, rebuilt.
meillo@12: Growing software matches the view that ``software is never finished. It is only released.''
meillo@46: .[ [
meillo@44: gancarz
meillo@44: unix philosophy
meillo@46: .], page 26]
meillo@12: .PP
meillo@13: Software can be seen as being controlled by evolutionary processes.
meillo@13: Successful software is software that is used by many for a long time.
meillo@61: This implies that the software is necessary, useful, and better than the alternatives.
meillo@61: Darwin describes ``the survival of the fittest.''
meillo@12: .[
meillo@44: darwin
meillo@44: origin of species
meillo@12: .]
meillo@61: In relation to software, the most successful software is the fittest;
meillo@61: the one that survives.
meillo@13: (This may be at the level of one creature, or at the level of one species.)
meillo@13: The fitness of software is affected mainly by four properties:
meillo@15: portability of code, portability of data, range of usability, and reusability of parts.
meillo@13: .PP
meillo@15: (1)
meillo@61: .I "``Portability of code''
meillo@61: means using high-level programming languages,
meillo@13: sticking to the standard,
meillo@47: .[ [
meillo@47: kernighan pike
meillo@47: practice of programming
meillo@47: .], chapter\|8]
meillo@13: and avoiding optimizations that introduce dependencies on specific hardware.
meillo@61: Hardware has a much shorter lifespan than software.
meillo@61: By chaining software to specific hardware,
meillo@61: its lifetime is limited to that of this hardware.
meillo@13: In contrast, software should be easy to port \(en
meillo@23: adaptation is the key to success.
meillo@13: .PP
meillo@15: (2)
meillo@61: .I "``Portability of data''
meillo@15: is best achieved by avoiding binary representations
meillo@61: to store data, since binary representations differ from machine to machine.
meillo@23: Textual representation is favored.
meillo@61: Historically, \s-1ASCII\s0 was the character set of choice;
meillo@61: for the future, \s-1UTF\s0-8 might be the better way forward.
meillo@13: What matters is that it is a plain text representation in a
meillo@61: very common character set encoding.
meillo@13: Apart from being able to transfer data between machines,
meillo@61: readable data has the great advantage that humans are able to directly
meillo@45: read and edit it with text editors and other tools from the Unix toolchest.
meillo@47: .[ [
meillo@47: gancarz
meillo@47: unix philosophy
meillo@47: .], page 56 ff.]
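.PP
A small sketch may illustrate this advantage.
The record format (\f(CWname:quota\fP) and the function \f(CWrecords\fP
are assumptions made up for the example;
the point is merely that ordinary toolchest programs can query and edit such data.

```shell
#!/bin/sh
# Hypothetical plain-text data: one "name:quota" record per line.
records() { printf '%s\n' alice:100 bob:250 carol:75; }

# Query with awk: sum the second, colon-separated field.
total=$(records | awk -F: '{ sum += $2 } END { print sum }')

# Edit with sed: change one record and leave the others untouched.
updated=$(records | sed 's/^bob:.*/bob:300/')

echo "$total"
printf '%s\n' "$updated"
```

A binary format would need a dedicated program for each of these steps.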
meillo@13: .PP
meillo@15: (3)
meillo@15: A large
meillo@61: .I "``range of usability''
meillo@23: ensures good adaptation, and thus good survival.
meillo@61: It is a special distinction when software comes to be used in fields of endeavor
meillo@61: the original authors never imagined.
meillo@13: Software that solves problems in a general way will likely be used
meillo@45: for many kinds of similar problems.
meillo@45: Being too specific limits the range of usability.
meillo@13: Requirements change over time, thus use cases change or even vanish.
meillo@61: As a good example of this point,
meillo@13: Allman identifies flexibility as one major reason for sendmail's success:
meillo@13: .[
meillo@44: allman
meillo@44: sendmail
meillo@13: .]
meillo@13: .QP
meillo@13: Second, I limited myself to the routing function [...].
meillo@13: This was a departure from the dominant thought of the time, [...].
meillo@13: .QP
meillo@45: Third, the sendmail configuration file was flexible enough to adapt
meillo@13: to a rapidly changing world [...].
meillo@12: .LP
meillo@45: Successful software adapts itself to the changing world.
meillo@13: .PP
meillo@15: (4)
meillo@61: .I "``Reusability of parts''
meillo@61: goes one step further.
meillo@61: Software may become obsolete and completely lose its field of action,
meillo@61: but the constituent parts of the software may be general and independent enough
meillo@13: to survive this death.
meillo@54: If software is built by combining small independent programs,
meillo@45: then these parts are readily available for reuse.
meillo@61: Who cares that the large program is a failure,
meillo@61: if parts of it become successful instead?
meillo@10: 
meillo@48: .H 2 "Summary
meillo@0: .LP
meillo@61: This chapter explained ideas central to the Unix Philosophy.
meillo@45: For each idea, the advantages it introduces were explained.
meillo@61: The Unix Philosophy is a set of guidelines that help in the design of
meillo@61: more valuable software.
meillo@61: From the viewpoint of a software developer or software designer,
meillo@61: the Unix Philosophy provides answers to many software design problems.
meillo@14: .PP
meillo@61: The various ideas comprising the Unix Philosophy are closely interwoven
meillo@14: and can hardly be applied independently.
meillo@61: The most important messages are:
meillo@45: .I "``Keep it simple!''" ,
meillo@14: .I "``Do one thing well!''" ,
meillo@14: and
meillo@14: .I "``Use software leverage!''
meillo@0: 
meillo@8: 
meillo@8: 
meillo@48: .H 1 "Case study: \s-1MH\s0
meillo@18: .LP
meillo@30: The previous chapter introduced and explained the Unix Philosophy
meillo@18: from a general point of view.
meillo@61: The driving force was that of the guidelines;
meillo@61: references to existing software were given only sparsely.
meillo@18: In this and the next chapter, concrete software will be
meillo@18: the driving force in the discussion.
meillo@18: .PP
meillo@23: This first case study is about the mail user agents (\s-1MUA\s0)
meillo@54: \s-1MH\s0 (``mail handler'') and its descendant \fInmh\fP
meillo@23: (``new mail handler'').
meillo@47: .[
meillo@47: nmh website
meillo@47: .]
meillo@23: \s-1MUA\s0s provide functions to read, compose, and organize mail,
meillo@45: but (ideally) not to transfer it.
meillo@45: In this document, the name \s-1MH\s0 will be used to include nmh.
meillo@19: A distinction will only be made if differences between
meillo@45: \s-1MH\s0 and nmh are described.
meillo@18: 
meillo@0: 
meillo@48: .H 2 "Historical background
meillo@0: .LP
meillo@61: Electronic mail was available in Unix from a very early stage.
meillo@30: The first \s-1MUA\s0 on Unix was \f(CWmail\fP,
meillo@30: which was already present in the First Edition.
meillo@46: .[ [
meillo@44: salus
meillo@44: quarter century of unix
meillo@46: .], page 41 f.]
meillo@45: It was a small program that either printed the user's mailbox file
meillo@54: or appended text to someone else's mailbox file,
meillo@19: depending on the command line arguments.
meillo@19: .[
meillo@44: manual mail(1)
meillo@19: .]
meillo@19: It was a program that did one job well.
meillo@23: This job was emailing, which was very simple then.
meillo@19: .PP
meillo@23: Later, emailing became more powerful, and thus more complex.
meillo@19: The simple \f(CWmail\fP, which knew nothing of subjects,
meillo@19: independent handling of single messages,
meillo@61: and long-term email storage, was not powerful enough anymore.
meillo@61: In 1978 at Berkeley, Kurt Shoens wrote \fIMail\fP (with a capital `M')
meillo@45: to provide additional functions for emailing.
meillo@61: Mail was still one program, but was large and did several jobs.
meillo@61: Its user interface was modeled after \fIed\fP.
meillo@61: Ed is designed for humans, but is still scriptable.
meillo@61: \fImailx\fP is the adaptation of Berkeley Mail for System V.
meillo@19: .[
meillo@44: ritter
meillo@44: mailx history
meillo@19: .]
meillo@61: Elm, pine, mutt, and a slew of graphical \s-1MUA\s0s
meillo@61: followed Mail's direction:
meillo@61: large, monolithic programs which included all emailing functions.
meillo@19: .PP
meillo@23: A different way was taken by the people of \s-1RAND\s0 Corporation.
meillo@61: Initially, they also had used a monolithic mail system
meillo@30: called \s-1MS\s0 (for ``mail system'').
meillo@19: But in 1977, Stockton Gaines and Norman Shapiro
meillo@61: came up with a proposal for a new email system concept \(en
meillo@45: one that honored the Unix Philosophy.
meillo@19: The concept was implemented by Bruce Borden in 1978 and 1979.
meillo@19: This was the birth of \s-1MH\s0 \(en the ``mail handler''.
meillo@18: .PP
meillo@18: Since then, \s-1RAND\s0, the University of California at Irvine and
meillo@19: at Berkeley, and several others have contributed to the software.
meillo@18: However, its core concepts remained the same.
meillo@23: In the late 1990s, when development of \s-1MH\s0 slowed down,
meillo@19: Richard Coleman started \fInmh\fP, the new mail handler.
meillo@61: His goal was to improve \s-1MH\s0 especially in regard to
meillo@23: the requirements of modern emailing.
meillo@19: Today, nmh is developed by various people on the Internet.
meillo@18: .[
meillo@44: ware
meillo@44: rand history
meillo@18: .]
meillo@18: .[
meillo@44: peek
meillo@44: mh
meillo@18: .]
meillo@0: 
meillo@48: .H 2 "Contrasts to monolithic mail systems
meillo@0: .LP
meillo@19: All \s-1MUA\s0s are monolithic, except \s-1MH\s0.
meillo@61: Although a few little-known toolchest \s-1MUA\s0s might also exist,
meillo@61: this statement reflects the situation pretty well.
meillo@19: .PP
meillo@30: Monolithic \s-1MUA\s0s gather all their functions in one program.
meillo@30: In contrast, \s-1MH\s0 is a toolchest of many small tools \(en one for each job.
meillo@23: Following is a list of important programs of \s-1MH\s0's toolchest
meillo@30: and their function.
meillo@61: It gives an indication of what the toolchest looks like.
meillo@19: .IP \(bu
meillo@19: .CW inc :
meillo@30: incorporate new mail (this is how mail enters the system)
meillo@19: .IP \(bu
meillo@19: .CW scan :
meillo@19: list messages in folder
meillo@19: .IP \(bu
meillo@19: .CW show :
meillo@19: show message
meillo@19: .IP \(bu
meillo@19: .CW next\fR/\fPprev :
meillo@19: show next/previous message
meillo@19: .IP \(bu
meillo@19: .CW folder :
meillo@19: change current folder
meillo@19: .IP \(bu
meillo@19: .CW refile :
meillo@45: refile message into different folder
meillo@19: .IP \(bu
meillo@19: .CW rmm :
meillo@19: remove message
meillo@19: .IP \(bu
meillo@19: .CW comp : 
meillo@45: compose new message
meillo@19: .IP \(bu
meillo@19: .CW repl :
meillo@45: reply to message
meillo@19: .IP \(bu
meillo@19: .CW forw : 
meillo@45: forward message
meillo@19: .IP \(bu
meillo@19: .CW send : 
meillo@45: send prepared message (this is how mail leaves the system)
meillo@0: .LP
meillo@19: \s-1MH\s0 has no special user interface like monolithic \s-1MUA\s0s have.
meillo@61: The user does not leave the shell to run \s-1MH\s0;
meillo@45: instead he uses the various \s-1MH\s0 programs within the shell.
meillo@23: Using a monolithic program with a captive user interface
meillo@23: means ``entering'' the program, using it, and ``exiting'' the program.
meillo@23: Using toolchests like \s-1MH\s0 means running programs,
meillo@45: alone or in combination with others, also from other toolchests,
meillo@23: without leaving the shell.
meillo@30: 
meillo@48: .H 2 "Data storage
meillo@30: .LP
meillo@61: \s-1MH\s0's mail storage consists of a hierarchy under the user's
meillo@34: \s-1MH\s0 directory (usually \f(CW$HOME/Mail\fP),
meillo@34: where mail folders are directories and mail messages are text files
meillo@34: within them.
meillo@34: Each mail folder contains a file \f(CW.mh_sequences\fP which lists
meillo@45: the public message sequences of that folder,
meillo@61: for instance, the \fIunseen\fP sequence for new messages.
meillo@61: The message files contain the mail exactly as it was received,
meillo@61: and they are named by ascending numbers in each folder.
meillo@19: .PP
meillo@30: This mailbox format is called ``\s-1MH\s0'' after the \s-1MUA\s0.
meillo@30: Alternatives are \fImbox\fP and \fImaildir\fP.
meillo@61: In the mbox format, all messages are stored within one file.
meillo@30: This was a good solution in the early days, when messages
meillo@61: were only a few lines of text and were deleted within a short period of time.
meillo@61: Today, with single messages often including several megabytes
meillo@61: of attachments, this is a bad solution.
meillo@30: Another disadvantage of the mbox format is that it is
meillo@30: more difficult to write tools that work on mail messages,
meillo@30: because it is always necessary to first find and extract
meillo@30: the relevant message in the mbox file.
meillo@45: With the \s-1MH\s0 mailbox format, each message is a separate file.
meillo@30: Also, the problem of concurrent access to one mailbox is
meillo@30: reduced to the problem of concurrent access to one message.
meillo@45: The maildir format is generally similar to the \s-1MH\s0 format,
meillo@30: but modified towards guaranteed reliability.
meillo@30: This involves some complexity, unfortunately.
meillo@34: .PP
meillo@34: Working with \s-1MH\s0's toolchest on mailboxes is much like
meillo@34: working with Unix' toolchest on directory trees:
meillo@34: \f(CWscan\fP is like \f(CWls\fP,
meillo@34: \f(CWshow\fP is like \f(CWcat\fP,
meillo@34: \f(CWfolder\fP is like \f(CWcd\fP and \f(CWpwd\fP,
meillo@34: \f(CWrefile\fP is like \f(CWmv\fP,
meillo@34: and \f(CWrmm\fP is like \f(CWrm\fP.
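.PP
The analogy can be tried out without \s-1MH\s0 being installed.
The following shell sketch builds a mock mailbox in a temporary directory
and works on it with plain Unix tools.
It is an illustration only: the function names \f(CWmock_mailbox\fP and
\f(CWlist_subjects\fP, the folder names, and the sample messages are
made up for this example, and \f(CWlist_subjects\fP is merely a rough
stand-in for what \f(CWscan\fP reports.

```shell
#!/bin/sh
# Sketch only: a mock mailbox in MH layout (folders are directories,
# messages are numbered text files). No MH program is used here.

# Create a throwaway mailbox with two folders and two messages;
# print its root directory.
mock_mailbox() {
    root=$(mktemp -d) || return 1
    mkdir -p "$root/inbox" "$root/archive"
    printf 'From: alice@example.org\nSubject: hello\n\nfirst message\n' \
        > "$root/inbox/1"
    printf 'From: bob@example.org\nSubject: lunch\n\nsecond message\n' \
        > "$root/inbox/2"
    echo "$root"
}

# Print "NUMBER SUBJECT" for every message file in the folder "$1".
# Only the header (everything up to the first empty line) is read.
# Note: the glob sorts names as text, so 10 would be listed before 2.
list_subjects() {
    for f in "$1"/[0-9]*; do
        printf '%s %s\n' "${f##*/}" \
            "$(sed -n 's/^Subject: //p;/^$/q' "$f")"
    done
}
```

.LP
Within such a mailbox, \f(CWcat inbox/1\fP displays a message and
\f(CWmv inbox/2 archive/1\fP moves one to another folder.
The real \s-1MH\s0 tools additionally keep track of the context and
the sequences, which this sketch ignores.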
meillo@34: .PP
meillo@61: \s-1MH\s0 extends the context of processes in Unix by two more items
meillo@45: for its tools:
meillo@34: .IP \(bu
meillo@34: The current mail folder, which is similar to the current working directory.
meillo@34: For mail folders, \f(CWfolder\fP provides the corresponding functionality
meillo@34: of \f(CWcd\fP and \f(CWpwd\fP for directories.
meillo@34: .IP \(bu
meillo@34: Sequences, which are named sets of messages in a mail folder.
meillo@34: The current message, relative to a mail folder, is a special sequence.
meillo@34: It enables commands like \f(CWnext\fP and \f(CWprev\fP.
meillo@34: .LP
meillo@61: In contrast to the general process context in Unix,
meillo@61: which is maintained by the kernel,
meillo@45: \s-1MH\s0's context must be maintained by the tools themselves.
meillo@45: Usually there is one context per user, which resides in his
meillo@45: \f(CWcontext\fP file in the \s-1MH\s0 directory,
meillo@45: but a user can have several contexts, too.
meillo@45: Public sequences are an exception, as they belong to a mail folder,
meillo@45: and reside in the \f(CW.mh_sequences\fP file there.
meillo@34: .[
meillo@44: man page mh-profile mh-sequence
meillo@34: .]
meillo@20: 
meillo@48: .H 2 "Discussion of the design
meillo@0: .LP
meillo@45: This section discusses \s-1MH\s0 in regard to the tenets
meillo@45: of the Unix Philosophy that Gancarz identified.
meillo@20: 
meillo@20: .PP
meillo@33: .B "Small is beautiful
meillo@20: and
meillo@33: .B "do one thing well
meillo@20: are two design goals that are directly visible in \s-1MH\s0.
meillo@61: Gancarz actually uses \s-1MH\s0 in his book as example under the
meillo@45: headline ``Making \s-1UNIX\s0 Do One Thing Well'':
meillo@46: .[ [
meillo@44: gancarz
meillo@44: unix philosophy
meillo@46: .], page 125 ff.]
meillo@20: .QP
meillo@20: [\s-1MH\s0] consists of a series of programs which
meillo@20: when combined give the user an enormous ability
meillo@20: to manipulate electronic mail messages.
meillo@20: A complex application, it shows that not only is it
meillo@20: possible to build large applications from smaller
meillo@20: components, but also that such designs are actually preferable.
meillo@20: .LP
meillo@45: The various programs of \s-1MH\s0 were relatively easy to write,
meillo@61: because each one was small, limited to one function,
meillo@61: and had clear boundaries.
meillo@61: For the same reasons, they are also easy to maintain.
meillo@61: Furthermore, the system can easily be extended:
meillo@61: one only needs to place a new program into the toolchest.
meillo@61: This was done when \s-1MIME\s0 support was added
meillo@20: (e.g. \f(CWmhbuild\fP).
meillo@61: Also, different programs can exist to do basically the same job
meillo@20: in different ways (e.g. in nmh: \f(CWshow\fP and \f(CWmhshow\fP).
meillo@45: .PP
meillo@61: If someone needs a mail system with some additional
meillo@61: functionality that is not available anywhere yet,
meillo@61: it is beneficial to expand a toolchest system like \s-1MH\s0.
meillo@45: There he can add new functionality by simply adding additional
meillo@61: programs to the toolchest;
meillo@61: he does not risk breaking existing functionality by doing so.
meillo@20: 
meillo@20: .PP
meillo@61: .B "Store data in flat text files" ;
meillo@61: this principle was followed by \s-1MH\s0.
meillo@34: This is not surprising, because email messages are already plain text.
meillo@34: \s-1MH\s0 stores the messages as it receives them,
meillo@61: thus any other tool that works on \s-1RFC\s0\|2822 compliant mail
meillo@61: messages can operate
meillo@34: on the messages in an \s-1MH\s0 mailbox.
meillo@61: All other files \s-1MH\s0 uses are plain text as well.
meillo@34: It is therefore possible and encouraged to use the text processing
meillo@34: tools of Unix' toolchest to extend \s-1MH\s0's toolchest.
meillo@20: 
meillo@20: .PP
meillo@33: .B "Avoid captive user interfaces" .
meillo@19: \s-1MH\s0 is perfectly suited for non-interactive use.
meillo@61: It offers all functions directly, without captive user interfaces.
meillo@61: If users want a graphical user interface,
meillo@53: they can have it with \fIxmh\fP, \fIexmh\fP,
meillo@53: or with the Emacs interface \fImh-e\fP.
meillo@53: These are frontends for the \s-1MH\s0 toolchest.
meillo@61: This means all email-related work is still done by \s-1MH\s0 tools,
meillo@45: but the frontend calls the appropriate commands when the user
meillo@53: clicks on buttons or pushes a key.
meillo@45: .PP
meillo@61: Providing additional user interfaces in the form of frontends is a good
meillo@19: approach, because it does not limit the power of the backend itself.
meillo@61: The frontend will only be able to make a subset of the
meillo@61: backend's power and flexibility available to the user,
meillo@61: but if it is a separate program,
meillo@20: then the missing parts can still be accessed at the backend directly.
meillo@61: If it is integrated, then this will be much more difficult.
meillo@61: An additional advantage is the ability to have different frontends
meillo@45: to the same backend.
meillo@19: 
meillo@19: .PP
meillo@33: .B "Choose portability over efficiency
meillo@20: and
meillo@33: .B "use shell scripts to increase leverage and portability" .
meillo@20: These two tenets are indirectly, but nicely, demonstrated by
meillo@30: Bolsky and Korn in their book about the Korn Shell.
meillo@20: .[
meillo@44: bolsky korn
meillo@44: korn shell
meillo@20: .]
meillo@45: Chapter\|18 of the book shows a basic implementation
meillo@20: of a subset of \s-1MH\s0 in ksh scripts.
meillo@61: This is just a demonstration, but a brilliant one.
meillo@20: It shows how quickly one can implement such a prototype with shell scripts,
meillo@20: and how readable they are.
meillo@61: The implementation in a scripting language may not be very fast,
meillo@61: but it can be fast enough, and this is all that matters.
meillo@20: By having the code in an interpreted language, like the shell,
meillo@61: portability becomes a minor issue if we assume the interpreter
meillo@20: to be widespread.
meillo@45: .PP
meillo@20: This demonstration also shows how easy it is to create single programs
meillo@61: of toolchest software.
meillo@61: Eight tools (two of them having multiple names) and 16 functions
meillo@45: with supporting code are presented to the reader.
meillo@45: The tools comprise less than 40 lines of ksh each,
meillo@30: in total about 200 lines.
meillo@45: The functions comprise less than 80 lines of ksh each,
meillo@30: in total about 450 lines.
meillo@20: Such small software is easy to write, easy to understand,
meillo@20: and thus easy to maintain.
meillo@61: A toolchest improves one's ability to only write some parts of a
meillo@61: program while still creating a working result.
meillo@45: Expanding the toolchest, even without global changes,
meillo@45: will likely be possible.
meillo@20: 
meillo@20: .PP
meillo@33: .B "Use software leverage to your advantage
meillo@20: and the lesser tenet
meillo@33: .B "allow the user to tailor the environment
meillo@20: are ideally followed in the design of \s-1MH\s0.
meillo@21: Tailoring the environment is heavily encouraged by the ability to
meillo@30: directly define default options to programs.
meillo@30: It is even possible to define different default options
meillo@45: depending on the name under which a program is called.
meillo@45: Software leverage is heavily encouraged by the ease of
meillo@45: creating shell scripts that run a specific command line,
meillo@30: built of several \s-1MH\s0 programs.
meillo@61: There are few pieces of software that encourage users to tailor their
meillo@61: environment and to leverage the use of the software like \s-1MH\s0.
meillo@45: .PP
meillo@61: Just to cite one example:
meillo@23: One might prefer a different listing format for the \f(CWscan\fP
meillo@21: program.
meillo@30: It is possible to take one of the distributed format files
meillo@21: or to write one yourself.
meillo@21: To use the format as default for \f(CWscan\fP, a single line,
meillo@21: reading
meillo@21: .DS
meillo@21: scan: -form FORMATFILE
meillo@21: .DE
meillo@21: must be added to \f(CW.mh_profile\fP.
meillo@61: If one wants this alternative format available as an additional command,
meillo@61: instead of changing the default, one just needs to create a link to
meillo@23: \f(CWscan\fP, for instance titled \f(CWscan2\fP.
meillo@21: The line in \f(CW.mh_profile\fP would then start with \f(CWscan2\fP,
meillo@61: as the option should only be in effect for a program that is invoked as
meillo@21: \f(CWscan2\fP.
meillo@20: 
meillo@20: .PP
meillo@33: .B "Make every program a filter
meillo@61: is hard to find implemented in \s-1MH\s0.
meillo@61: The reason is that most of \s-1MH\s0's tools provide
meillo@45: basic file system operations for mailboxes.
meillo@61: It is for the same reason that \f(CWls\fP, \f(CWcp\fP, \f(CWmv\fP,
meillo@45: and \f(CWrm\fP are not filters either.
meillo@61: \s-1MH\s0 does not provide many filters itself,
meillo@61: but it provides a basis upon which to write filters.
meillo@45: An example would be a mail text highlighter,
meillo@61: a program that makes use of a color terminal to display header lines,
meillo@61: quotations, and signatures in distinct colors.
meillo@45: The author's version of such a program is an awk script with 25 lines.
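.PP
The author's script is not reproduced here.
To give a rough idea of what such a filter could look like,
the following is a minimal sketch under simple assumptions:
it reads exactly one message on standard input,
treats everything up to the first empty line as the header,
lines starting with \f(CW>\fP as quotations,
and everything from a line consisting of \f(CW--\fP and a space onward
as the signature.
The function name \f(CWmailhl\fP and the chosen colors are arbitrary.

```shell
#!/bin/sh
# Sketch of a mail text highlighter as a filter (stdin to stdout).
# Assumption: the terminal understands ANSI color escape sequences.
mailhl() {
    awk '
    BEGIN {
        esc = sprintf("%c", 27)       # the escape character
        hdr = esc "[1;34m"            # header lines: bold blue
        quo = esc "[32m"              # quotations: green
        sig = esc "[33m"              # signature: yellow
        off = esc "[0m"               # reset attributes
        inhdr = 1                     # a message starts with its header
    }
    inhdr && /^$/ { inhdr = 0; print; next }   # empty line ends header
    inhdr         { print hdr $0 off; next }
    /^-- $/       { insig = 1 }                # signature delimiter
    insig         { print sig $0 off; next }
    /^>/          { print quo $0 off; next }
                  { print }                    # ordinary body text
    '
}
```

.LP
Because the messages are plain text files, such a filter can be applied
to a message file directly, for instance with
\f(CWmailhl <$HOME/Mail/inbox/1\fP,
or placed at the end of any pipeline that emits one message.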
meillo@21: 
meillo@21: .PP
meillo@33: .B "Build a prototype as soon as possible
meillo@21: was again well followed by \s-1MH\s0.
meillo@61: This tenet, of course, focuses on early development, which is a
meillo@21: long time ago for \s-1MH\s0.
meillo@21: But without following this guideline at the very beginning,
meillo@23: Bruce Borden might not have convinced the management of \s-1RAND\s0
meillo@23: to ever create \s-1MH\s0.
meillo@23: In Borden's own words:
meillo@46: .[ [
meillo@44: ware rand history
meillo@46: .], page 132]
meillo@21: .QP
meillo@45: [...] but [Stockton Gaines and Norm Shapiro] were not able
meillo@23: to convince anyone that such a system would be fast enough to be usable.
meillo@21: I proposed a very short project to prove the basic concepts,
meillo@21: and my management agreed.
meillo@21: Looking back, I realize that I had been very lucky with my first design.
meillo@21: Without nearly enough design work,
meillo@21: I built a working environment and some header files
meillo@21: with key structures and wrote the first few \s-1MH\s0 commands:
meillo@21: inc, show/next/prev, and comp.
meillo@21: [...]
meillo@21: With these three, I was able to convince people that the structure was viable.
meillo@21: This took about three weeks.
meillo@0: 
meillo@48: .H 2 "Problems
meillo@0: .LP
meillo@61: \s-1MH\s0 is not without its problems.
meillo@61: There are two main problems: one is technical, the other pertains to human behavior.
meillo@22: .PP
meillo@61: \s-1MH\s0 is old, and email today is quite different from what it was at the time
meillo@22: when \s-1MH\s0 was designed.
meillo@61: \s-1MH\s0 adapted to the changes fairly well, but it has its limitations.
meillo@22: \s-1MIME\s0 support and support for different character encodings
meillo@22: are available, but only at a moderate level.
meillo@45: This comes from limited development resources.
meillo@61: A larger and more active developer base could quickly remedy this.
meillo@45: But \s-1MH\s0 is also limited by design, which is the larger problem.
meillo@54: \s-1IMAP\s0, for example, conflicts with \s-1MH\s0's design to a large extent.
meillo@61: These design conflicts are not easily solvable
meillo@61: and may require a redesign.
meillo@61: \s-1IMAP\s0 may be too incompatible with the classic mail model,
meillo@61: which \s-1MH\s0 covers, so \s-1MH\s0 may never support it well.
meillo@61: (Using \s-1IMAP\s0 and a filesystem abstraction layer merely to map
meillo@61: a remote directory into the local filesystem is a different topic.
meillo@61: \s-1IMAP\s0 support here means being able to access the special
meillo@61: mail features of the protocol.)
meillo@22: .PP
meillo@61: The other kind of problem relates to human habits.
meillo@45: In this world, where almost all \s-1MUA\s0s are monolithic,
meillo@61: it is very difficult to convince people to use a toolchest-style \s-1MUA\s0
meillo@22: like \s-1MH\s0.
meillo@61: These habits are so strong that even people who understand the concept
meillo@61: and advantages of \s-1MH\s0 are reluctant to switch,
meillo@30: simply because \s-1MH\s0 is different.
meillo@61: Unfortunately, the frontends to \s-1MH\s0, which could provide familiar
meillo@61: look and feel, are quite outdated and thus not very appealing in comparison
meillo@61: to the modern interfaces of many monolithic \s-1MUA\s0s.
meillo@53: One notable exception is \fImh-e\fP which provides an Emacs interface
meillo@53: to \s-1MH\s0.
meillo@53: \fIMh-e\fP looks much like \fImutt\fP or \fIpine\fP, 
meillo@53: but it has buttons, menus, and graphical display capabilities.
meillo@20: 
meillo@53: .H 2 "Summary
meillo@20: .LP
meillo@45: \s-1MH\s0 is an \s-1MUA\s0 that follows the Unix Philosophy in its design.
meillo@61: It consists of a toolchest of small tools, each of which does one job well.
meillo@31: The toolchest approach offers great flexibility to the user.
meillo@45: It is possible to utilize the complete power of the Unix shell with \s-1MH\s0.
meillo@61: This makes \s-1MH\s0 a very powerful mail system,
meillo@61: and extending and customizing \s-1MH\s0 is easy and encouraged.
meillo@31: .PP
meillo@31: Apart from the user's perspective, \s-1MH\s0 is development-friendly.
meillo@31: Its overall design follows clear rules.
meillo@61: The single tools do only one job; thus they are easy to understand,
meillo@61: write, and maintain.
meillo@31: They are all independent and do not interfere with the others.
meillo@61: Automated testing of their function is a straightforward task.
meillo@31: .PP
meillo@61: It is sad that \s-1MH\s0's dissimilarity to other \s-1MUA\s0s is its
meillo@61: largest problem, as this dissimilarity is also its largest advantage.
meillo@61: Unfortunately, most people's habits are stronger
meillo@61: than the attraction of the clear design and the power \s-1MH\s0 offers.
meillo@0: 
meillo@8: 
meillo@8: 
meillo@48: .H 1 "Case study: uzbl
meillo@32: .LP
meillo@61: The last chapter focused on the \s-1MUA\s0 \s-1MH\s0,
meillo@61: which is an old and established piece of software.
meillo@45: This chapter covers uzbl, a fresh new project.
meillo@45: Uzbl is a web browser that adheres to the Unix Philosophy.
meillo@45: Its name comes from the \fILolspeak\fP word for ``usable'';
meillo@61: both are pronounced in the same way.
meillo@0: 
meillo@48: .H 2 "Historical background
meillo@0: .LP
meillo@32: Uzbl was started by Dieter Plaetinck in April 2009.
meillo@61: The idea was born in a thread on the Arch Linux forums.
meillo@32: .[
meillo@44: arch linux forums
meillo@44: browser
meillo@32: .]
meillo@61: After some discussion about the failures of well-known web browsers,
meillo@61: Plaetinck (alias Dieter@be) came up with a rough proposal
meillo@61: of how a better web browser could look.
meillo@61: In response to another member who asked if Plaetinck would write this
meillo@61: program because it sounded fantastic, Plaetinck replied:
meillo@32: ``Maybe, if I find the time ;-)''.
meillo@32: .PP
meillo@32: Fortunately, he found the time.
meillo@32: One day later, the first prototype was out.
meillo@61: One week later, uzbl had its own website.
meillo@47: .[
meillo@47: uzbl website
meillo@47: .]
meillo@61: One month after the initial code was presented,
meillo@61: a mailing list was set up to coordinate and discuss further development,
meillo@61: and a wiki was added to store documentation and scripts that cropped up
meillo@61: on the mailing list and elsewhere.
meillo@32: .PP
meillo@61: In the first year of uzbl's existence, it was heavily developed on various branches.
meillo@32: Plaetinck's task became more and more to only merge the best code from the
meillo@32: different branches into his main branch, and to apply patches.
meillo@47: .[
meillo@47: lwn uzbl
meillo@47: .]
meillo@32: About once a month, Plaetinck released a new version.
meillo@32: In September 2009, he presented several forks of uzbl.
meillo@47: .[ [
meillo@47: uzbl website
meillo@47: .], news archive]
meillo@61: Uzbl actually opened the field for a whole family of web browsers with
meillo@61: a similar design.
meillo@32: .PP
meillo@61: In July 2009, \fILinux Weekly News\fP published an interview with
meillo@61: Plaetinck about uzbl.
meillo@47: .[
meillo@47: lwn uzbl
meillo@47: .]
meillo@32: In September 2009, the uzbl web browser was on \fISlashdot\fP.
meillo@47: .[
meillo@47: slashdot uzbl
meillo@47: .]
meillo@0: 
meillo@48: .H 2 "Contrasts to other web browsers
meillo@0: .LP
meillo@32: Just as most \s-1MUA\s0s are monolithic but \s-1MH\s0 is a toolchest,
meillo@32: most web browsers are monolithic but uzbl is a frontend to a toolchest.
meillo@32: .PP
meillo@32: Today, uzbl is divided into uzbl-core and uzbl-browser.
meillo@61: Uzbl-core is, as its name indicates, the core of uzbl.
meillo@61: It handles commands and events to interface with other programs,
meillo@61: and displays webpages by using \fIwebkit\fP as its rendering engine.
meillo@61: Uzbl-browser combines uzbl-core with a selection of handler scripts,
meillo@61: a status bar, an event manager, yanking, pasting, page searching,
meillo@61: zooming, and much more functionality, to form a ``complete'' web browser.
meillo@61: In the following text, the term ``uzbl'' usually refers to uzbl-browser,
meillo@32: so uzbl-core is included.
meillo@32: .PP
meillo@61: Unlike most other web browsers, uzbl is mainly the mediator between
meillo@45: various tools that cover single jobs.
meillo@61: Uzbl listens for commands on a named pipe (fifo), a Unix socket,
meillo@35: and on stdin, and it writes events to a Unix socket and to stdout.
meillo@35: Loading a webpage in a running uzbl instance requires only:
meillo@32: .DS
meillo@32: echo 'uri http://example.org' >/path/to/uzbl-fifo
meillo@32: .DE
meillo@61: The rendering of the webpage is done by libwebkit,
meillo@61: around which uzbl-core is built.
meillo@32: .PP
meillo@45: Downloads, browsing history, bookmarks, and the like are not provided
meillo@61: by the core itself as they are in other web browsers.
meillo@61: Uzbl-browser also only provides ``handler scripts'' which wrap
meillo@61: external applications to provide the actual functionality.
meillo@32: For instance, \fIwget\fP is used to download files and uzbl-browser
meillo@32: includes a script that calls wget with appropriate options in
meillo@32: a prepared environment.
meillo@32: .PP
meillo@61: Modern web browsers are proud to have addons, plugins, modules,
meillo@61: and so forth.
meillo@32: This is their effort to achieve similar goals.
meillo@61: But instead of using existing external programs, modern web browsers
meillo@45: include these functions.
meillo@0: 
meillo@48: .H 2 "Discussion of the design
meillo@0: .LP
meillo@61: This section discusses uzbl in regard to the Unix Philosophy,
meillo@32: as identified by Gancarz.
meillo@32: 
meillo@32: .PP
meillo@35: .B "Make each program do one thing well" .
meillo@35: Uzbl tries to be a web browser and nothing else.
meillo@61: The common definition of a web browser is highly influenced by
meillo@61: existing implementations of web browsers.
meillo@61: But a web browser should be a program to browse the web, and nothing more.
meillo@61: This is the one thing it should do.
meillo@36: .PP
meillo@61: Web browsers should not, for instance, manage downloads;
meillo@61: this is the job of download managers.
meillo@61: A download manager is primarily concerned with downloading files.
meillo@35: Modern web browsers provide download management only as a secondary feature.
meillo@61: How could they do this job better than programs that exist only for
meillo@35: this very job?
meillo@61: And why would anyone want less than the best download manager available?
meillo@32: .PP
meillo@35: A web browser's job is to let the user browse the web.
meillo@35: This means navigating through websites by following links.
meillo@36: Rendering the \s-1HTML\s0 sources is a different job, too.
meillo@61: In uzbl's case, this is covered by the webkit rendering engine.
meillo@61: Handling audio and video content, PostScript, \s-1PDF\s0,
meillo@61: and other such files is also not the job of a web browser.
meillo@61: Such content should be handled by external programs
meillo@61: that were written to handle such data.
meillo@35: Uzbl strives to do it this way.
meillo@36: .PP
meillo@61: Remember Doug McIlroy's words:
meillo@35: .I
meillo@35: ``Write programs that do one thing and do it well.
meillo@35: Write programs to work together.''
meillo@35: .R
meillo@35: .PP
meillo@35: The lesser tenet
meillo@35: .B "allow the user to tailor the environment
meillo@61: applies here as well.
meillo@61: Previously, the question, ``Why would anyone want anything less than the
meillo@61: best program for the job?'' was put forward.
meillo@61: But as personal preferences matter, it might be more important to ask:
meillo@61: ``Why would anyone want something other than his preferred program for
meillo@61: the job?''
meillo@36: .PP
meillo@61: Users typically want one program for a specific job.
meillo@61: Hence, whenever one wishes to download something,
meillo@45: the same download manager should be used.
meillo@61: More advanced users might want to use one download manager in a certain
meillo@61: situation and another in a different situation;
meillo@61: they should be able to configure it this way.
meillo@61: With uzbl, any download manager can be used.
meillo@61: To switch to a different one, a single line in a small handler script
meillo@35: needs to be changed.
meillo@61: Alternatively, it would be possible to query which download manager to use by
meillo@61: reading a global file or an environment variable in the handler script.
meillo@61: Of course, uzbl can use a different handler script as well.
meillo@61: This simply requires a one-line change in uzbl's configuration file.
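.PP
The environment variable approach could look roughly like the following
sketch.
It is not taken from the scripts distributed with uzbl:
the variable name \f(CWUZBL_DOWNLOADER\fP and the function
\f(CWdownload\fP are hypothetical, and the arguments that uzbl actually
passes to its handler scripts are not modeled here.
The function simply takes a \s-1URL\s0 and an optional target directory.

```shell
#!/bin/sh
# Sketch of a download handler that lets the user pick the download
# manager. UZBL_DOWNLOADER is a hypothetical variable name; if it is
# unset or empty, wget is used.
download() {
    url=$1
    dir=${2:-$HOME}                    # target directory, default: $HOME
    # Deliberately unquoted, so that a value like "wget -c" is split
    # into the command and its options.
    set -- ${UZBL_DOWNLOADER:-wget}
    # Run the chosen program in the target directory, in a subshell,
    # so the caller's working directory stays untouched.
    (cd "$dir" && "$@" "$url")
}
```

.LP
With this in place, a call like
\f(CWUZBL_DOWNLOADER='wget -c' download http://example.org/file\fP
would select the program and its options for a single download,
without any change to the script.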
meillo@36: .PP
meillo@61: Uzbl neither has its own download manager nor depends on a specific one;
meillo@61: hence, uzbl's browsing abilities will not be crippled by having
meillo@35: a bad download manager.
meillo@61: Uzbl's download capabilities will be as good as the best
meillo@36: download manager available on the system.
meillo@38: Of course, this applies to all of the other supplementary tools, too.
meillo@32: 
meillo@32: .PP
meillo@36: .B "Use software leverage to your advantage" .
meillo@36: Uzbl is designed to be extended by external tools.
meillo@36: These external tools are usually wrapped by small handler shell scripts.
meillo@61: Shell scripts form the basis for the glue which holds the various
meillo@61: parts together.
meillo@36: .PP
meillo@45: The history mechanism of uzbl shall be presented as an example.
meillo@36: Uzbl is configured to spawn a script to append an entry to the history
meillo@36: whenever the event of a fully loaded page occurs.
meillo@45: The script to append the entry to the history is not much more than:
meillo@36: .DS
meillo@36: #!/bin/sh
meillo@36: file=/path/to/uzbl-history
meillo@36: echo `date +'%Y-%m-%d %H:%M:%S'`" $6 $7" >> $file
meillo@36: .DE
meillo@61: \f(CW$6\fP and \f(CW$7\fP expand to the \s-1URL\s0 and the page title,
meillo@61: respectively.
meillo@45: .PP
meillo@45: For loading an entry, a key is bound to spawn a load-from-history script.
meillo@36: The script reverses the history to have newer entries first,
meillo@61: displays \fIdmenu\fP to let the user select an item,
meillo@61: and then writes the selected \s-1URL\s0 into uzbl's command input pipe.
meillo@45: With error checking and corner case handling removed,
meillo@45: the script looks like this:
meillo@36: .DS
meillo@36: #!/bin/sh
meillo@36: file=/path/to/uzbl-history
meillo@36: goto=`tac $file | dmenu | cut -d' ' -f 3`
meillo@36: echo "uri $goto" > $4
meillo@36: .DE
meillo@36: \f(CW$4\fP expands to the path of the command input pipe of the current
meillo@36: uzbl instance.
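.PP
A slightly more defensive variant of the two scripts could look like the
following sketch.
It is not the code distributed with uzbl: the function names,
the \f(CWMENU\fP variable, and the filtering of duplicate \s-1URL\s0s
are assumptions made for illustration.
Only the history line format and the use of \fIdmenu\fP are taken from
the scripts shown above.

```shell
#!/bin/sh
# Sketch only: history helpers with basic error handling.
# Function names and the MENU variable are hypothetical, not part of uzbl.

# append_history FILE URL TITLE
# Append one "date time url title" line; reject an empty URL.
append_history() {
    [ -n "$2" ] || return 1
    printf '%s %s %s\n' "$(date +'%Y-%m-%d %H:%M:%S')" "$2" "$3" >> "$1"
}

# newest_first_unique FILE
# Print the history newest first, keeping only the newest entry per URL.
newest_first_unique() {
    awk '{ line[NR] = $0 }
         END { for (i = NR; i > 0; i--) print line[i] }' "$1" |
        awk '!seen[$3]++'
}

# load_from_history FILE FIFO
# Let the user pick an entry and write a "uri" command to FIFO.
# MENU defaults to dmenu; it is deliberately left unquoted so that
# a command with arguments can be substituted.
load_from_history() {
    [ -r "$1" ] || return 1
    goto=$(newest_first_unique "$1" | ${MENU:-dmenu} | cut -d' ' -f 3)
    [ -n "$goto" ] || return 1    # nothing selected
    echo "uri $goto" > "$2"
}
```

A handler script would then call
\f(CWappend_history /path/to/uzbl-history "$6" "$7"\fP
or
\f(CWload_from_history /path/to/uzbl-history "$4"\fP,
with the same positional arguments as in the scripts above.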
meillo@32: 
meillo@32: .PP
meillo@33: .B "Avoid captive user interfaces" .
meillo@61: One could say that uzbl, to a large extent, actually \fIis\fP
meillo@36: a captive user interface.
meillo@61: But the difference from other web browsers is that uzbl is only
meillo@45: the captive user interface frontend (and the core of the backend).
meillo@38: Many parts of the backend are independent of uzbl.
meillo@61: For some external programs, handler scripts are distributed with uzbl;
meillo@61: but arbitrary additional functionality can always be added if desired.
meillo@37: .PP
meillo@37: The frontend is captive \(en that is true.
meillo@37: This is okay for the task of browsing the web, as this task is only relevant
meillo@61: to humans.
meillo@61: Automated programs would \fIcrawl\fP the web, that is,
meillo@61: read the source directly, including all semantics.
meillo@61: The graphical representation is just for humans to understand the semantics
meillo@37: more intuitively.
meillo@32: 
meillo@32: .PP
meillo@33: .B "Make every program a filter" .
meillo@37: Graphical web browsers are almost dead ends in the chain of information flow.
meillo@37: Thus it is difficult to see what graphical web browsers should filter.
meillo@61: Graphical web browsers exist almost exclusively to be interactively used
meillo@61: by humans.
meillo@61: The only case in which one might want to automate the rendering function is
meillo@37: to generate images of rendered webpages.
meillo@37: 
meillo@37: .PP
meillo@37: .B "Small is beautiful"
meillo@61: is not easy to apply to a web browser because modern web technology
meillo@61: is very complex; hence, the rendering task is very complex.
meillo@61: Unfortunately, modern web browsers ``have'' to consist of many thousands
meillo@61: of lines of code.
meillo@61: Using the toolchest approach and wrappers can help to split the browser
meillo@61: into several small parts, though.
meillo@37: .PP
meillo@45: As of March 2010, uzbl-core consists of about 3\,500 lines of C code.
meillo@37: The distribution includes another 3\,500 lines of Shell and Python code,
meillo@61: which are the handler scripts and plugins like one to provide a modal
meillo@61: interface.
meillo@61: Furthermore, uzbl makes use of external tools like
meillo@54: \fIwget\fP and \fIsocat\fP.
meillo@37: Up to this point, uzbl looks pretty neat and small.
meillo@61: The ugly part of uzbl is the rendering engine, webkit.
meillo@37: Webkit consists of roughly 400\,000 (!) lines of code.
meillo@61: Unfortunately, small rendering engines are not feasible anymore
meillo@61: due to the nature of the modern web.
meillo@35: 
meillo@35: .PP
meillo@35: .B "Build a prototype as soon as possible" .
meillo@61: Plaetinck made his code public right from the beginning.
meillo@61: Discussion and development were, and still are, open to everyone interested,
meillo@61: and development versions of uzbl can be obtained very easily from the
meillo@61: code repository.
meillo@38: Within the first year of uzbl's existence, a new version was released
meillo@35: more often than once a month.
meillo@61: Different forks and branches arose introducing new features which were
meillo@61: then considered for merging into the main branch.
meillo@61: The experiences with using prototypes influenced further development.
meillo@35: Actually, all development was community driven.
meillo@38: Plaetinck says, three months after uzbl's birth:
meillo@35: ``Right now I hardly code anything myself for Uzbl.
meillo@35: I just merge in other people's code, ponder a lot, and lead the discussions.''
meillo@35: .[
meillo@44: lwn
meillo@44: uzbl
meillo@35: .]
meillo@32: 
meillo@0: 
meillo@48: .H 2 "Problems
meillo@0: .LP
meillo@61: Similar to \s-1MH\s0, uzbl suffers from being different.
meillo@38: It is sad, but people use what they know.
meillo@61: Fortunately, uzbl's user interface can be made to look and feel very similar
meillo@61: to that of the well-known web browsers,
meillo@38: hiding the internal differences.
meillo@38: But uzbl has to provide this similar look and feel to be accepted
meillo@38: as a ``normal'' browser by ``normal'' users.
meillo@37: .PP
meillo@61: The more important problem here is the modern web.
meillo@38: The modern web is simply broken.
meillo@61: It has state in a state-less protocol, misuses technologies,
meillo@61: and is hopelessly overloaded.
meillo@61: This results in rendering engines that ``must'' consist
meillo@61: of hundreds of thousands of lines of code.
meillo@61: They also must combine and integrate many different technologies
meillo@61: to make our modern web accessible.
meillo@61: Despite all this effort, the attempt to provide good usability fails.
meillo@61: Website-to-image converters are almost impossible to run without
meillo@38: human interaction because of state in sessions, impossible
meillo@61: deep-linking, and ``unautomatable'' technologies.
meillo@37: .PP
meillo@61: The web was misused in order to attempt to fulfill all kinds of wishes.
meillo@61: Now web browsers, and ultimately users, suffer from it.
meillo@37: 
meillo@8: 
meillo@51: .H 2 "Summary
meillo@32: .LP
meillo@38: ``Uzbl is a browser that adheres to the Unix Philosophy'',
meillo@61: is how uzbl is seen by its authors.
meillo@38: Indeed, uzbl follows the Unix Philosophy in many ways.
meillo@38: It consists of independent parts that work together,
meillo@45: while its core is mainly a mediator which glues the parts together.
meillo@38: .PP
meillo@61: Software leverage is put to excellent use.
meillo@61: External tools are used, and independent tasks are separated out
meillo@61: into independent parts that are glued together with small handler scripts.
meillo@38: .PP
meillo@61: Since uzbl roughly consists of a set of tools and a bit of glue,
meillo@61: anyone can put the parts together and expand it in any desired way.
meillo@61: Flexibility and customization are properties that make it valuable
meillo@61: for advanced users, but may keep novice users from understanding
meillo@61: and using it.
meillo@38: .PP
meillo@61: But uzbl's main problem is the modern web, which makes it very difficult
meillo@38: to design a sane web browser.
meillo@38: Despite this bad situation, uzbl does a fairly good job.
meillo@32: 
meillo@8: 
meillo@48: .H 1 "Final thoughts
meillo@0: 
meillo@0: .LP
meillo@50: This paper explained why good design is important.
meillo@61: It introduced the Unix Philosophy as a set of guidelines that encourage
meillo@61: good design in order to create good quality software.
meillo@61: Then, real world software that was designed with the Unix Philosophy
meillo@61: in mind was discussed.
meillo@50: .PP
meillo@61: Throughout this paper, the aim was to explain \fIwhy\fP something
meillo@50: should be done the Unix way.
meillo@61: Reasons were given to substantiate the claim that the Unix Philosophy
meillo@61: is a preferable way of designing software.
meillo@50: .PP
meillo@50: The Unix Philosophy is close to the software developer's point of view.
meillo@61: Its main goal is taming the beast known as ``software complexity''.
meillo@61: Hence it strives first and foremost for simplicity of software.
meillo@61: It might appear that usability for humans is a minor goal,
meillo@61: but actually, the Unix Philosophy sees usability as a result of sound design.
meillo@61: Sound design does not need to be ultimately intuitive,
meillo@50: but it will provide a consistent way to access the enormous power
meillo@50: of software leverage.
meillo@50: .PP
meillo@61: Being able to solve some specific concrete problem becomes less and less
meillo@61: important as there is software available for nearly every possible task
meillo@61: today.
meillo@50: But the quality of software matters.
meillo@50: It is important that we have \fIgood\fP software.
meillo@50: .sp
meillo@0: .LP
meillo@50: .B "But why the Unix Philosophy?
meillo@50: .PP
meillo@50: The largest problem of software development is the complexity involved.
meillo@50: It is the only part of the job that computers cannot take over.
meillo@61: The Unix Philosophy fights complexity because complexity is the main enemy.
meillo@50: .PP
meillo@50: On the other hand,
meillo@61: the most distinctive advantage of software is its ability to leverage.
meillo@50: Current software still fails to make the best possible use of this ability.
meillo@61: The Unix Philosophy concentrates on exploiting this great opportunity.
meillo@0: 
meillo@47: 
meillo@47: .bp
meillo@47: .TL
meillo@47: References
meillo@47: .LP
meillo@47: .XS
meillo@47: .sp .5v
meillo@47: .B
meillo@47: References
meillo@47: .XE
meillo@47: .ev r
meillo@42: .nr PS -1
meillo@42: .nr VS -1
meillo@0: .[
meillo@0: $LIST$
meillo@0: .]
meillo@47: .nr PS +1
meillo@47: .nr VS +1
meillo@47: .ev
meillo@47: 
meillo@42: .bp
meillo@47: .TL
meillo@47: Table of Contents
meillo@47: .LP
meillo@47: .PX no