annotate cut.en.ms @ 31:106609b64dc4

minor corrections and improvements in the text
author markus schnalke <meillo@marmaro.de>
date Tue, 15 Sep 2015 17:20:20 +0200
parents 6977e2ee5dc5
children 5f78bcd34eeb
.so macros
.lc_ctype en_US.utf8
.pl -4v

.TL
Cut out selected fields of each line of a file
.AU
markus schnalke <meillo@marmaro.de>
..
.FS
2015-05.
This text is in the public domain (CC0).
It is available online:
.I http://marmaro.de/docs/
.FE

.LP
Cut is a classic program in the Unix toolchest.
It is present in most tutorials on shell programming, because it
is such a nice and useful tool with good explanatory value.
This text takes a look beneath its surface.
.SH
Usage
.LP
Initially, cut had two operation modes, which were later supplemented
by a third: The cut program may cut specified characters or bytes
out of the input lines or it may cut out specified fields, which
are defined by a delimiting character.
.PP
The character mode is well suited for slicing fixed-width input
formats into parts. One might, for instance, extract the access
rights from the output of \f(CWls -l\fP, as shown here with the
rights of a file's owner:
.CS
$ ls -l foo
-rw-rw-r-- 1 meillo users 0 May 12 07:32 foo
.sp .3
$ ls -l foo | cut -c 2-4
rw-
.CE
.LP
Or the write permission for the owner, the group, and the
world:
.CS
$ ls -l foo | cut -c 3,6,9
ww-
.CE
.LP
Cut can also be used to shorten strings:
.CS
$ long=12345678901234567890
.sp .3
$ echo "$long" | cut -c -10
1234567890
.CE
.LP
This command outputs no more than the first 10 characters of
\f(CW$long\fP. (Alternatively, one could use \f(CWprintf
"%.10s\\n" "$long"\fP for this task.)
.PP
However, if it's not about displaying characters, but rather about
storing them, then \f(CW-c\fP is only partly suited. In former times,
when US-ASCII was the omnipresent character encoding, each
character was stored as exactly one byte. Therefore, \f(CWcut
-c\fP selected output characters and bytes alike. With
the rise of multi-byte encodings (like UTF-8), this assumption
became obsolete. Consequently, a byte mode (option \f(CW-b\fP)
was added to cut with POSIX.2-1992. To select up to 500 bytes
from the beginning of each line (and ignore the rest), one can use:
.CS
$ cut -b -500
.CE
.LP
The remainder can be caught with \f(CWcut -b 501-\fP. This
use of cut is important for POSIX, because it provides a
transformation of text files with arbitrary line lengths to text
files with limited line length
.[[ http://pubs.opengroup.org/onlinepubs/9699919799/utilities/cut.html#tag_20_28_17 .
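.LP
The two invocations complement each other. A minimal sketch, with a
limit of 4 bytes instead of 500 so that the effect is visible on a
short line (the variable names are made up for this illustration):

```shell
line='abcdefghij'

# Up to the first 4 bytes of each line.
head_part=$(printf '%s\n' "$line" | cut -b -4)

# Everything from byte 5 onward.
tail_part=$(printf '%s\n' "$line" | cut -b 5-)

printf '%s\n' "$head_part"    # abcd
printf '%s\n' "$tail_part"    # efghij
```

Concatenating both parts yields the original line again.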
.PP
The new byte mode essentially provided the same
functionality as the old character mode. The character mode,
however, required a new, different implementation. In consequence,
the problem was not the support of the byte mode, but rather the
correct support of the new character mode.
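.LP
The difference between the modes shows as soon as a character
occupies more than one byte. The following sketch assumes UTF-8,
where the letter a-umlaut is encoded as the two bytes 303 244 (octal).
The input is written with octal escapes, so the example does not
depend on the encoding of this file. The variable names are
chosen for this illustration only.

```shell
# One two-byte character (a-umlaut in UTF-8) followed by "bc".
input=$(printf '\303\244bc')

# The line consists of 4 bytes, although it shows only 3 characters.
nbytes=$(printf '%s' "$input" | wc -c | tr -d ' ')

# The byte mode does not care about character boundaries:
# bytes 1-2 form the complete character, byte 1 alone is half of it.
whole=$(printf '%s\n' "$input" | cut -b 1-2)
half_len=$(printf '%s\n' "$input" | cut -b 1 | wc -c | tr -d ' ')

printf '%s bytes in total\n' "$nbytes"
printf '%s bytes (incl. newline) for byte 1 alone\n' "$half_len"
```

What \f(CWcut -c 1\fP prints for this input depends on the
implementation and the locale: One that is aware of multi-byte
characters prints the whole character, whereas one that treats
\f(CW-c\fP like \f(CW-b\fP prints only its first byte.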
.PP
Besides the character and byte modes, cut also offers a field
mode, which is activated by \f(CW-f\fP. It selects fields from
the input. The field-delimiter character for the input as well
as for the output (by default the tab) may be changed using
\f(CW-d\fP.
.PP
The typical example for the use of cut's field mode is the
selection of information from the password file. Here, for
instance, the usernames and their uids:
.CS
$ cut -d: -f1,3 /etc/passwd
root:0
bin:1
daemon:2
mail:8
...
.CE
.LP
(The values of the command line switches may be appended directly
to them or separated by whitespace.)
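.LP
Both spellings can be compared on a small sample. The sketch below
uses two made-up lines in the format of the password file instead of
the real file; the variable names are hypothetical as well.

```shell
# Two lines in the format of the password file (sample data).
sample='root:x:0:0:root:/root:/bin/sh
bin:x:1:1:bin:/bin:/bin/false'

# Option values attached to the switches ...
attached=$(printf '%s\n' "$sample" | cut -d: -f1,3)

# ... or separated from them by whitespace.
separated=$(printf '%s\n' "$sample" | cut -d : -f 1,3)

printf '%s\n' "$attached"
```

Both variables hold the same two lines, \f(CWroot:0\fP and
\f(CWbin:1\fP. The colon given with \f(CW-d\fP separates the
fields in the output, too.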
.PP
The field mode is suited for simple tabular data, like the
password file. Beyond that, it soon reaches its limits. The typical
case of whitespace-separated fields, in particular, is covered
poorly by it. Cut's delimiter is exactly one character;
therefore, one cannot split at both space and tab characters.
Furthermore, multiple adjacent delimiter characters lead to
empty fields. This is not the expected behavior for
the processing of whitespace-separated fields. Some
implementations, e.g. the one of FreeBSD, have extensions that
handle this case in the expected way. On other systems or
to stay portable, awk comes to the rescue.
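.LP
A small sketch of the difference, using a made-up input line whose
fields are separated by two spaces and by a tab (the variable names
are chosen for this illustration only):

```shell
# Fields separated by two spaces and by a tab.
line=$(printf 'one  two\tthree')

# cut splits at single spaces only: field 2 is the empty string
# between the two adjacent spaces.
cut_field=$(printf '%s\n' "$line" | cut -d ' ' -f 2)

# awk treats any run of spaces and tabs as one separator by default.
awk_field=$(printf '%s\n' "$line" | awk '{ print $2 }')

printf 'cut: "%s"  awk: "%s"\n' "$cut_field" "$awk_field"
```

Cut delivers an empty second field here, whereas awk delivers
\f(CWtwo\fP.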
.PP
Awk provides another functionality that cut lacks: changing the order
of the fields in the output. For cut, the order of the field
selection specification is irrelevant; it doesn't even matter if
fields occur multiple times. Thus, the invocation
\f(CWcut -c 5-8,1,4-6\fP outputs characters number
1, 4, 5, 6, 7, and 8 in exactly this order. The
selection specification resembles mathematical set theory: Each
specified field is part of the solution set. The fields in the
solution set are always in the same order as in the input. In
the words of the man page in Version 8 Unix:
``In data base parlance, it projects a relation.''
.[[ http://man.cat-v.org/unix_8th/1/cut
This means that cut applies the \fIprojection\fP database operation
to the text input. Wikipedia explains it in the following way:
``In practical terms, it can be roughly thought of as picking a
sub-set of all available columns.''
.[[ https://en.wikipedia.org/wiki/Projection_(relational_algebra)
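.LP
The set-like behavior and the awk alternative can be tried out
with a short sketch. The input strings and variable names are made
up for this illustration.

```shell
# The order in the list does not matter to cut, and overlapping
# ranges select each character only once.
cut_chars=$(printf 'abcdefghij\n' | cut -c 5-8,1,4-6)

# The same holds for fields: 3,1 selects the same as 1,3.
cut_fields=$(printf 'a:b:c\n' | cut -d: -f 3,1)

# awk prints the fields in whatever order they are written down.
awk_fields=$(printf 'a:b:c\n' | awk -F: '{ print $3 ":" $1 }')

printf '%s %s %s\n' "$cut_chars" "$cut_fields" "$awk_fields"
```

This prints \f(CWadefgh a:c c:a\fP: Cut keeps the input order in
both cases, only awk swaps the fields.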

.SH
Historical Background
.LP
Cut came to public life in 1982 with the release of UNIX System
III. Browsing through the sources of System III, one finds cut.c
with the timestamp 1980-04-11
.[[ http://minnie.tuhs.org/cgi-bin/utree.pl?file=SysIII/usr/src/cmd .
This is the oldest implementation of the program I was able to
discover. However, the SCCS-ID in the source code contains the
version number 1.5, which suggests earlier revisions. According
to Doug McIlroy
.[[ http://minnie.tuhs.org/pipermail/tuhs/2015-May/004083.html ,
the earlier history likely lies in PWB/UNIX, which was the
basis for System III. In the available sources of PWB 1.0 (1977)
.[[ http://minnie.tuhs.org/Archive/PDP-11/Distributions/usdl/ ,
no cut is present. Of PWB 2.0, no sources or useful documentation
seem to be available. PWB 3.0 was later renamed to System III
for marketing purposes only; it is otherwise identical to it. A
branch of PWB was CB UNIX, which was used only internally at
Bell Labs. The manual of CB UNIX Edition 2.1 of November 1979
contains the earliest mention of cut that my research brought
to light, in the form of a man page
.[[ ftp://sunsite.icm.edu.pl/pub/unix/UnixArchive/PDP-11/Distributions/other/CB_Unix/cbunix_man1_02.pdf .
.PP
A look at BSD: There, my earliest discovery is a cut.c with
the file modification date of 1986-11-07
.[[ http://minnie.tuhs.org/cgi-bin/utree.pl?file=4.3BSD-UWisc/src/usr.bin/cut
as part of the special version 4.3BSD-UWisc
.[[ http://gunkies.org/wiki/4.3_BSD_NFS_Wisconsin_Unix ,
which was released in January 1987.
This implementation is mostly identical to the one in System
III. The better-known 4.3BSD-Tahoe (1988) does not contain cut.
The subsequent 4.3BSD-Reno (1990) does include cut. It is a freshly
written one by Adam S. Moskowitz and Marciano Pitargue, which was
included in BSD in 1989
.[[ http://minnie.tuhs.org/cgi-bin/utree.pl?file=4.3BSD-Reno/src/usr.bin/cut .
Its man page
.[[ http://minnie.tuhs.org/cgi-bin/utree.pl?file=4.3BSD-Reno/src/usr.bin/cut/cut.1
already mentions the expected compliance with POSIX.2.
One should note that POSIX.2 was first published in
September 1992, about two years after the man page and the
program were written. Hence, the program must have been
implemented based on a draft version of the standard. A look into
the code confirms this assumption. The function to parse the field
selection includes the following comment:
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
181 .QP
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
182 This parser is less restrictive than the Draft 9 POSIX spec.
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
183 POSIX doesn't allow lists that aren't in increasing order or
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
184 overlapping lists.
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
185 .LP
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
186 Draft 11.2 of POSIX (1991-09) requires this flexibility already:
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
187 .QP
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
188 The elements in list can be repeated, can overlap, and can
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
189 be specified in any order.
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
190 .LP
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
191 The same draft additionally includes all three operation modes,
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
192 whereas this early BSD cut only implemented the original two.
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
193 Draft 9 might not have included the byte mode. Without access to
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
194 Draft 9 or 10, it wasn't possible to verify this guess.
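.PP
The flexibility quoted from Draft 11.2 can be tried out directly.
The following is only a small sketch, assuming a cut that accepts
such lists (current GNU and BSD versions should). The sample line
and the helper name \f(CWpick\fP are made up for illustration.
It relies on the rule of the current POSIX specification that the
selected positions are written in input order and only once, no
matter how the list is arranged.

```shell
# pick LIST: select the character positions LIST from a fixed sample line.
# The sample line "abcdef" is arbitrary.
pick() {
    printf 'abcdef\n' | cut -c "$1"
}

pick 1-3          # increasing, non-overlapping list
pick 3,1-2        # same positions, not in increasing order
pick 2-3,1-2,2    # overlapping ranges with a repeated element
```

.LP
All three calls print \f(CWabc\fP: The list only marks positions, so
reordering or repeating its elements neither reorders nor duplicates
the output.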
.PP
The version numbers and change dates of the older BSD
implementations are recorded in the SCCS IDs, which the
version control system of that time inserted. For instance,
in 4.3BSD-Reno: ``5.3 (Berkeley) 6/24/90''.
.PP
The cut implementation of the GNU coreutils contains the
following copyright notice:
.CS
Copyright (C) 1997-2015 Free Software Foundation, Inc.
Copyright (C) 1984 David M. Ihnat
.CE
.LP
This code does have old origins. Further comments show that
the source code was reworked by David MacKenzie first and later
by Jim Meyering, who put it into the version control system in
1992. It is unclear why the years before 1997, at least from
1992 onwards, don't show up in the copyright notice.
.PP
Despite all those year numbers from the 80s, cut is a rather
young tool, at least in relation to early Unix. Although cut is
about a decade older than Linux (the kernel), Unix had already
existed for more than ten years when cut first appeared.
Most notably, cut wasn't part of Version 7 Unix, which
became the basis for all modern Unix systems. The more complex
tools sed and awk were part of it already. Hence, the
question arises why cut was written at all, as two
programs already existed that were able to cover its use
cases. One reason for cut surely was its compactness and the
resulting speed, in comparison to the then-bulky awk. This lean
shape goes well with the Unix philosophy: Do one job and do it
well! Cut was sufficiently convincing. It found its way to
other Unix variants, it became standardized, and today it is
present everywhere.
.PP
The original variant (without \f(CW-b\fP) was already described
in 1985 by the System V Interface Definition, an important
formal description of UNIX System V. In the following years, it
appeared in all relevant standards. POSIX.2 specified cut for
the first time in its modern form (with \f(CW-b\fP) in 1992.

.SH
Multi-byte support
.LP
The byte mode and thus the multi-byte support of the POSIX
character mode have been standardized since 1992. But are
they present in the available implementations? Which versions
implement POSIX correctly?
.PP
The situation is divided into three parts: There are historic
implementations, which have only \f(CW-c\fP and \f(CW-f\fP.
Then there are implementations that have \f(CW-b\fP, but
merely treat it as an alias for \f(CW-c\fP. These
implementations work correctly for single-byte encodings
(e.g. US-ASCII, Latin1), but for multi-byte encodings (e.g.
UTF-8) their \f(CW-c\fP behaves like \f(CW-b\fP (and
\f(CW-n\fP is ignored). Finally, there are implementations
that implement \f(CW-c\fP and \f(CW-b\fP in a POSIX-compliant
way.
.PP
Historic two-mode implementations are those of
System III and System V, and the BSD ones until the mid-90s.
.PP
Pseudo multi-byte implementations are provided by GNU,
modern NetBSD, and modern OpenBSD. The level of POSIX compliance
that they claim is often higher than the level of
compliance that they actually provide. Sometimes it takes a
close look to discover that \f(CW-c\fP and \f(CW-n\fP don't
behave as expected. Some of the implementations take the
easy way out by simply ignoring multi-byte
encodings; at least they declare that clearly:
.QP
Since we don't support multi-byte characters, the \f(CW-c\fP
and \f(CW-b\fP options are equivalent, and the \f(CW-n\fP
option is meaningless.
.[[ http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/usr.bin/cut/cut.c?rev=1.18&content-type=text/x-cvsweb-markup
.LP
Standard-adhering implementations, i.e. ones that treat
multi-byte characters correctly, are those of modern
FreeBSD and the Heirloom toolchest. Tim Robbins
reimplemented the character mode of FreeBSD cut,
conforming to POSIX, in the summer of 2004
.[[ https://svnweb.freebsd.org/base?view=revision&revision=131194 .
The question why the other BSD systems have not
integrated this change is an open one. Maybe the answer can be
found in the above quoted statement.
.PP
How do users find out if the cut on their own system handles
multi-byte characters correctly? First, one needs to check if
the system itself uses multi-byte characters, because otherwise
characters and bytes are equivalent and the question
is irrelevant. One can check this by looking at the locale
settings, but it is easier to print a typical multi-byte
character, for instance an umlaut or the Euro currency
symbol, and check whether one byte or several bytes are
generated as output:
.CS
$ echo ä | od -c
0000000 303 244 \\n
0000003
.CE
.LP
In this case it resulted in two bytes: octal 303 and 244. (The
newline character is added by echo.)
.PP
The program iconv converts text to specific encodings. This
is the output for Latin1 and UTF-8, for comparison:
.CS
$ echo ä | iconv -t latin1 | od -c
0000000 344 \\n
0000002
.sp .3
$ echo ä | iconv -t utf8 | od -c
0000000 303 244 \\n
0000003
.CE
.LP
The output (without the iconv conversion) on many European
systems equals one of these two.
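.PP
Both checks, the locale settings and the byte count of a typical
character, can also be scripted. This is only a sketch:
\f(CWlocale charmap\fP is the POSIX way to print the codeset name of
the current locale, the helper name \f(CWbytes_of\fP is made up, and
the octal escapes are the byte sequences from the od outputs shown
for the two encodings.

```shell
# bytes_of STRING: print the number of bytes that STRING occupies.
bytes_of() {
    printf '%s' "$1" | wc -c | tr -d ' '
}

# Codeset of the current locale, for instance UTF-8 or ISO-8859-1.
# Falls back to "unknown" if the locale utility is not available.
charmap=$(locale charmap 2>/dev/null || echo unknown)
echo "codeset: $charmap"

# The letter ä as raw bytes in the two encodings compared in the text.
a_utf8=$(printf '\303\244')
a_latin1=$(printf '\344')

echo "UTF-8 byte count:  $(bytes_of "$a_utf8")"
echo "Latin1 byte count: $(bytes_of "$a_latin1")"
```

.LP
If the reported codeset is a single-byte one, characters and bytes
coincide and the distinction between \f(CW-c\fP and \f(CW-b\fP does
not matter on that system.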
.PP
Now for the test of the cut implementation. On a UTF-8 system, a
POSIX-compliant implementation behaves as follows:
.CS
$ echo ä | cut -c 1 | od -c
0000000 303 244 \\n
0000003
.sp .3
$ echo ä | cut -b 1 | od -c
0000000 303 \\n
0000002
.sp .3
$ echo ä | cut -b 1 -n | od -c
0000000 \\n
0000001
.CE
.LP
A pseudo-POSIX implementation, in contrast, produces the output
of the middle invocation for all three: Only the first byte is
printed as output.
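.PP
The three invocations can be combined into a small self-check.
The sketch below rests on several assumptions: The locale name
\f(CWen_US.UTF-8\fP is only a guess and must exist on the system
(see \f(CWlocale -a\fP), the function names are made up, and the
labels simply follow the three categories described in this section.
If the locale is missing, cut falls back to byte semantics and the
result says nothing about the implementation.

```shell
# Dump stdin as space-separated octal byte values on a single line.
octal() {
    od -An -t o1 | tr -s ' \n' ' ' | sed 's/^ //; s/ $//'
}

# Assumed name of a UTF-8 locale; override it via the environment.
UTF8_LOCALE=${UTF8_LOCALE:-en_US.UTF-8}

# probe OPTIONS...: run cut with OPTIONS on a UTF-8 encoded "ä" plus
# newline and print the resulting bytes in octal.
probe() {
    printf '\303\244\n' | env LC_ALL="$UTF8_LOCALE" cut "$@" 2>/dev/null | octal
}

# Compare the three outputs with the ones a POSIX-compliant cut produces.
classify_cut() {
    chars=$(probe -c 1)
    bytes=$(probe -b 1)
    nosplit=$(probe -b 1 -n)
    if [ "$chars" = "303 244 012" ] && [ "$bytes" = "303 012" ] &&
       [ "$nosplit" = "012" ]; then
        echo POSIX
    elif [ -z "$bytes" ]; then
        echo historic          # -b was rejected, no output at all
    elif [ "$chars" = "$bytes" ]; then
        echo pseudo-POSIX      # -c is treated like -b
    else
        echo other
    fi
}

classify_cut
```

.LP
The script prints one of the labels \f(CWPOSIX\fP,
\f(CWpseudo-POSIX\fP, \f(CWhistoric\fP, or \f(CWother\fP. Comparing
raw octal bytes instead of printed characters keeps the check
independent of how the terminal displays the result.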

.SH
Implementations
.LP
Let's take a look at the sources of a selection of
implementations.
.PP
Comparing the amount of source code gives a good first
impression. Typically, it grows over time. This can generally
be seen here, but not in all cases. A POSIX-compliant
implementation of the character mode requires more code, thus
these implementations tend to be the larger ones.
.TS
center;
r r r l l l.
SLOC	Lines	Bytes	Belongs to	File time	Category
_
116	123	2966	System III	1980-04-11	historic
118	125	3038	4.3BSD-UWisc	1986-11-07	historic
200	256	5715	4.3BSD-Reno	1990-06-25	historic
200	270	6545	NetBSD	1993-03-21	historic
218	290	6892	OpenBSD	2008-06-27	pseudo-POSIX
224	296	6920	FreeBSD	1994-05-27	historic
232	306	7500	NetBSD	2014-02-03	pseudo-POSIX
340	405	7423	Heirloom	2012-05-20	POSIX
382	586	14175	GNU coreutils	1992-11-08	pseudo-POSIX
391	479	10961	FreeBSD	2012-11-24	POSIX
588	830	23167	GNU coreutils	2015-05-01	pseudo-POSIX
.TE
.LP
There are four rough groups: (1) The two original
implementations, which are mostly identical, with about 100
SLOC. (2) The five BSD versions, with about 200 SLOC. (3) The
two POSIX-compliant versions and the old GNU one, with a SLOC
count in the 300s. And finally, (4) the modern GNU cut with
almost 600 SLOC.
.PP
The ratio of the number of newlines in the file
(\f(CWwc -l\fP) to the number of logical code
lines (SLOC, measured with SLOCcount) ranges from a factor of
1.06 for the oldest versions to a factor of 1.5 for GNU. The
largest influences on it are empty lines, pure comment lines,
and the size of the license block at the beginning of the file.
.PP
Regarding the ratio of the file size (\f(CWwc -c\fP) to the
number of logical code lines, most implementations lie between
25 and 33 bytes per statement. With only about 22 bytes per
statement, the Heirloom implementation marks the lower end;
the GNU implementation sets the upper limit at nearly 40 bytes. In
the case of GNU, the reason is mainly their coding style, with
special indentation rules and long identifiers. Whether one finds
the Heirloom implementation
.[[ http://heirloom.cvs.sourceforge.net/viewvc/heirloom/heirloom/cut/cut.c?revision=1.6&view=markup
highly cryptic or exceptionally elegant shall be left
to the judgement of the reader. The
comparison to the GNU implementation
.[[ http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=blob;f=src/cut.c;hb=e981643
is particularly impressive.
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
392 .PP
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
393 The internal structure of the source code (in all cases it is
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
394 written in C) is largely similar. Besides the mandatory main
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
395 function, which does the command line argument processing,
28
0d7329867dd1 Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents: 27
diff changeset
396 there usually is a function to convert the field
27
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
397 selection specification to an internal data structure.
28
0d7329867dd1 Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents: 27
diff changeset
398 Furthermore, almost all implementations have separate
27
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
399 functions for each of their operation modes. The POSIX-compliant
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
400 versions treat the \f(CW-b -n\fP combination as a separate
28
0d7329867dd1 Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents: 27
diff changeset
401 mode and thus implement it in a separate function. Only the early
27
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
402 System III implementation (and its 4.3BSD-UWisc variant) does
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
403 everything, apart from error handling, in the main function.
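The conversion step mentioned above can be illustrated with a minimal sketch.
It is not taken from any of the implementations discussed here; the names
struct range and parse_list are hypothetical, and an array of ranges is only
one possible internal data structure (bitmaps or flag arrays are other options).
The sketch accepts lists such as 1,3-5,7- and rejects position 0, reversed
ranges, and empty list elements.

```c
#include <limits.h>
#include <stdlib.h>

/* Hypothetical internal representation: one inclusive range per list
 * element. An open end ("7-") is stored as hi == ULONG_MAX. */
struct range {
    unsigned long lo;
    unsigned long hi;
};

/* Convert a list like "1,3-5,7-" into a malloc'd array of ranges.
 * Returns the number of ranges, or -1 on a syntax or memory error. */
int parse_list(const char *spec, struct range **out)
{
    struct range *r = NULL;
    int n = 0;
    const char *p = spec;

    while (*p != 0) {
        struct range cur;
        struct range *tmp;
        char *end;
        int implied_lo = 0;

        /* Lower bound; a leading '-' means "from position 1". */
        if (*p == '-') {
            cur.lo = 1;
            implied_lo = 1;
        } else if (*p >= '0' && *p <= '9') {
            cur.lo = strtoul(p, &end, 10);
            if (cur.lo == 0)
                goto fail;          /* positions are 1-based */
            p = end;
        } else {
            goto fail;
        }

        /* Optional upper bound. */
        if (*p == '-') {
            p++;
            if (*p >= '0' && *p <= '9') {
                cur.hi = strtoul(p, &end, 10);
                if (cur.hi < cur.lo)
                    goto fail;      /* reversed range such as 5-3 */
                p = end;
            } else if (implied_lo) {
                goto fail;          /* a lone '-' selects nothing useful */
            } else {
                cur.hi = ULONG_MAX; /* open end */
            }
        } else {
            cur.hi = cur.lo;        /* single position */
        }

        /* Elements are separated by commas; no trailing comma allowed. */
        if (*p == ',') {
            p++;
            if (*p == 0)
                goto fail;
        } else if (*p != 0) {
            goto fail;
        }

        /* Grow the array by one element; no fixed upper limit. */
        tmp = realloc(r, (size_t)(n + 1) * sizeof *r);
        if (tmp == NULL)
            goto fail;
        r = tmp;
        r[n++] = cur;
    }
    if (n == 0)
        goto fail;                  /* empty list */
    *out = r;
    return n;

fail:
    free(r);
    return -1;
}
```

A mode function for bytes, characters, or fields could then walk this array
for each input position. Growing the array one element at a time keeps the
sketch short; a real implementation would probably grow it in larger steps.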
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
404 .PP
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
405 Implementations of cut typically have two limiting aspects:
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
406 One being the maximum number of fields that can be handled,
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
407 the other being the maximum line length. On System III, both
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
408 numbers are limited to 512. 4.3BSD-Reno and the BSDs of the
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
409 90s have fixed limits as well (\f(CW_BSD_LINE_MAX\fP or
28
0d7329867dd1 Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents: 27
diff changeset
410 \f(CW_POSIX2_LINE_MAX\fP). Modern FreeBSD, modern NetBSD, all GNU
0d7329867dd1 Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents: 27
diff changeset
411 implementations, and the Heirloom cut are able to handle
27
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
412 arbitrary numbers of fields and line lengths \(en the memory
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
413 is allocated dynamically. OpenBSD cut is a hybrid: It has a fixed
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
414 maximum number of fields, but allows arbitrary line lengths.
28
0d7329867dd1 Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents: 27
diff changeset
415 The limited number of fields does not, however, appear to be
27
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
416 any practical problem, because \f(CW_POSIX2_LINE_MAX\fP is
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
417 guaranteed to be at least 2048 and is thus probably large enough.
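The difference between a fixed line limit and dynamic allocation can be
sketched as follows. This is an illustration only, not code from any of the
implementations named above; the function name read_line and the initial
capacity of 64 bytes are assumptions made for the example. Instead of reading
into a buffer of fixed size (such as the 512 bytes of System III), the buffer
is doubled whenever it fills up, so the line length is bounded only by the
available memory.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read one line of arbitrary length from fp.
 * Returns a malloc'd string without the trailing newline,
 * or NULL at end of file or if memory runs out. */
char *read_line(FILE *fp)
{
    size_t cap = 64;    /* assumed starting size, grows on demand */
    size_t len = 0;
    char *buf = malloc(cap);
    int c;

    if (buf == NULL)
        return NULL;

    while ((c = getc(fp)) != EOF && c != '\n') {
        /* Keep room for this byte plus the terminating zero byte. */
        if (len + 1 >= cap) {
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }

    /* Nothing read and end of file reached: no more lines. */
    if (c == EOF && len == 0) {
        free(buf);
        return NULL;
    }
    buf[len] = 0;
    return buf;
}
```

The caller is responsible for freeing each returned line. On POSIX.1-2008
systems the getline function offers the same behavior ready-made.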
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
418
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
419 .SH
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
420 Descriptions
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
421 .LP
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
422 Interesting, as well, is a comparison of the short descriptions
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
423 of cut, as can be found in the headlines of the man
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
424 pages or at the beginning of the source code files.
28
0d7329867dd1 Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents: 27
diff changeset
425 The following list is roughly grouped by origin:
27
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
426 .TS
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
427 center;
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
428 l l.
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
429 CB UNIX cut out selected fields of each line of a file
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
430 System III cut out selected fields of each line of a file
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
431 System III \(dg cut and paste columns of a table (projection of a relation)
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
432 System V cut out selected fields of each line of a file
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
433 HP-UX cut out (extract) selected fields of each line of a file
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
434 .sp .3
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
435 4.3BSD-UWisc \(dg cut and paste columns of a table (projection of a relation)
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
436 4.3BSD-Reno select portions of each line of a file
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
437 NetBSD select portions of each line of a file
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
438 OpenBSD 4.6 select portions of each line of a file
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
439 FreeBSD 1.0 select portions of each line of a file
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
440 FreeBSD 10.0 cut out selected portions of each line of a file
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
441 SunOS 4.1.3 remove selected fields from each line of a file
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
442 SunOS 5.5.1 cut out selected fields of each line of a file
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
443 .sp .3
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
444 Heirloom Tools cut out selected fields of each line of a file
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
445 Heirloom Tools \(dg cut out fields of lines of files
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
446 .sp .3
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
447 GNU coreutils remove sections from each line of files
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
448 .sp .3
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
449 Minix select out columns of a file
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
450 .sp .3
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
451 Version 8 Unix rearrange columns of data
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
452 ``Unix Reader'' rearrange columns of text
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
453 .sp .3
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
454 POSIX cut out selected fields of each line of a file
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
455 .TE
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
456 .LP
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
457 (The descriptions that are marked with `\(dg' were taken from
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
458 source code files. The POSIX entry contains the description
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
459 used in the standard. The ``Unix Reader'' is a retrospective
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
460 document by Doug McIlroy, which lists the availability of
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
461 tools in the Research Unix versions
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
462 .[[ http://doc.cat-v.org/unix/unix-reader/contents.pdf .
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
463 Its description should actually match the one in Version 8
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
464 Unix. The change could be a transfer mistake or a correction.
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
465 All other descriptions originate from the various man pages.)
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
466 .PP
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
467 Over time, the POSIX description was often adopted or it
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
468 served as inspiration. One such example is FreeBSD
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
469 .[[ https://svnweb.freebsd.org/base?view=revision&revision=167101 .
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
470 .PP
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
471 It is noteworthy that the GNU coreutils in all versions
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
472 describe the performed action as a removal of parts of the
28
0d7329867dd1 Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents: 27
diff changeset
473 input, although the user clearly selects the parts that then
0d7329867dd1 Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents: 27
diff changeset
474 constitute the output. Probably the words ``cut out'' are too
0d7329867dd1 Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents: 27
diff changeset
475 misleading. HP-UX tried to be clearer.
27
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
476 .PP
28
0d7329867dd1 Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents: 27
diff changeset
477 Different terms are also used for the part being
27
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
478 selected. Some talk about fields (POSIX), some talk
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
479 about portions (BSD) and some call it columns (Research
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
480 Unix).
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
481 .PP
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
482 The seemingly least adequate description, the one of Version
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
483 8 Unix (``rearrange columns of data''), can be explained
28
0d7329867dd1 Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents: 27
diff changeset
484 insofar as the man page covers both cut and paste, and in
27
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
485 their combination, columns can be rearranged. The use of
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
486 ``data'' instead of ``text'' might be a lapse, which McIlroy
28
0d7329867dd1 Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents: 27
diff changeset
487 corrected in his Unix Reader ... but on the other hand, on
27
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
488 Unix, the two words are mostly synonymous, because all data
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
489 is text.
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
490
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
491
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
492 .SH
28
0d7329867dd1 Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents: 27
diff changeset
493 References
27
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
494 .LP
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
495 .nf
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
496 ._r
5cefcfc72d42 Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff changeset
497