annotate cut.en.ms @ 37:c338b706447b
fix spelling
author | markus schnalke <meillo@marmaro.de> |
---|---|
date | Mon, 05 Oct 2015 06:48:17 +0200 |
parents | 04a3cdadc50c |
children | ec76f8926598 |
rev | line source |
---|---|
27
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
1 .so macros |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
2 .lc_ctype en_US.utf8 |
34
04a3cdadc50c
improved hyphenation and pagination
markus schnalke <meillo@marmaro.de>
parents:
33
diff
changeset
|
3 .pl -3v |
27
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
4 |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
5 .TL |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
6 Cut out selected fields of each line of a file |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
7 .AU |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
8 markus schnalke <meillo@marmaro.de> |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
9 .. |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
10 .FS |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
11 2015-05. |
34
04a3cdadc50c
improved hyphenation and pagination
markus schnalke <meillo@marmaro.de>
parents:
33
diff
changeset
|
12 This text is part of the public domain (CC0). |
27
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
13 It is available online: |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
14 .I http://marmaro.de/docs/ |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
15 .FE |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
16 |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
17 .LP |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
18 Cut is a classic program in the Unix toolchest. |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
19 It is present in most tutorials on shell programming, because it |
28
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
20 is such a nice and useful tool with good explanatory value. |
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
21 This text takes a look beneath its surface. |
27
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
22 .SH |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
23 Usage |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
24 .LP |
28
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
25 Initially, cut had two operation modes, which were later supplemented |
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
26 by a third: The cut program may cut specified characters or bytes |
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
27 out of the input lines or it may cut out specified fields, which |
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
28 are defined by a delimiting character. |
27
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
29 .PP |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
30 The character mode is well suited to slice fixed-width input |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
31 formats into parts. One might, for instance, extract the access |
28
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
32 rights from the output of \f(CWls -l\fP, as shown here with the |
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
33 rights of a file's owner: |
27
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
34 .CS |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
35 $ ls -l foo |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
36 -rw-rw-r-- 1 meillo users 0 May 12 07:32 foo |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
37 .sp .3 |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
38 $ ls -l foo | cut -c 2-4 |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
39 rw- |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
40 .CE |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
41 .LP |
28
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
42 Or the write permission for the owner, the group, and the |
27
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
43 world: |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
44 .CS |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
45 $ ls -l foo | cut -c 3,6,9 |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
46 ww- |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
47 .CE |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
48 .LP |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
49 Cut can also be used to shorten strings: |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
50 .CS |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
51 $ long=12345678901234567890 |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
52 .sp .3 |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
53 $ echo "$long" | cut -c -10 |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
54 1234567890 |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
55 .CE |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
56 .LP |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
57 This command outputs no more than the first 10 characters of |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
58 \f(CW$long\fP. (Alternatively, one could use \f(CWprintf |
28
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
59 "%.10s\\n" "$long"\fP for this task.) |
27
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
60 .PP |
28
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
61 However, if it's not about displaying characters, but rather about |
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
62 storing them, then \f(CW-c\fP is only partly suited. In former times, |
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
63 when US-ASCII was the omnipresent character encoding, each |
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
64 character was stored as exactly one byte. Therefore, \f(CWcut |
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
65 -c\fP selected characters and bytes alike. With |
27
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
66 the rise of multi-byte encodings (like UTF-8), this assumption |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
67 became obsolete. Consequently, a byte mode (option \f(CW-b\fP) |
28
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
68 was added to cut with POSIX.2-1992. To select up to 500 bytes |
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
69 from the beginning of each line (and ignore the rest), one can use: |
27
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
70 .CS |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
71 $ cut -b -500 |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
72 .CE |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
73 .LP |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
74 The remainder can be caught with \f(CWcut -b 501-\fP. This |
30
6977e2ee5dc5
Another minor text change
markus schnalke <meillo@marmaro.de>
parents:
29
diff
changeset
|
75 use of cut is important for POSIX, because it provides a |
33
a1589fcfe9f4
spell-checking plus a clarification thanks to Francesc
markus schnalke <meillo@marmaro.de>
parents:
32
diff
changeset
|
76 transformation of text files with arbitrary line lengths to text |
28
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
77 files with limited line length |
27
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
78 .[[ http://pubs.opengroup.org/onlinepubs/9699919799/utilities/cut.html#tag_20_28_17 . |
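The split into a bounded head and a remainder can be tried on a small scale. The following is only a sketch: the sample line and the limit of 4 bytes (instead of 500) are made up for illustration, and the variable names are hypothetical.

```shell
# Made-up ASCII sample line; 4 stands in for the limit of 500 used in the text.
line='0123456789'

# The first 4 bytes of the line.
head_part=$(printf '%s\n' "$line" | cut -b -4)

# Everything from byte 5 onward, i.e. the remainder.
tail_part=$(printf '%s\n' "$line" | cut -b 5-)

# Concatenating both parts yields the original line again.
printf '%s|%s\n' "$head_part" "$tail_part"   # prints 0123|456789
```

With pure ASCII input, as here, bytes and characters coincide; with multi-byte input a byte limit may fall inside a character.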
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
79 .PP |
28
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
80 The new byte mode essentially provided the same |
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
81 functionality as the old character mode. The character mode, |
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
82 however, required a new, different implementation. In consequence, |
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
83 the problem was not the support of the byte mode, but rather the |
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
84 correct support of the new character mode. |
27
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
85 .PP |
28
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
86 Besides the character and byte modes, cut also offers a field |
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
87 mode, which is activated by \f(CW-f\fP. It selects fields from |
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
88 the input. The field-delimiter character for the input as well |
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
89 as for the output (by default the tab) may be changed using |
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
90 \f(CW-d\fP. |
27
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
91 .PP |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
92 The typical example for the use of cut's field mode is the |
31
106609b64dc4
minor corrections and improvements in the text
markus schnalke <meillo@marmaro.de>
parents:
30
diff
changeset
|
93 selection of information from the password file. Here, for |
28
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
94 instance, the usernames and their uids: |
27
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
95 .CS |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
96 $ cut -d: -f1,3 /etc/passwd |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
97 root:0 |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
98 bin:1 |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
99 daemon:2 |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
100 mail:8 |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
101 ... |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
102 .CE |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
103 .LP |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
104 (The values to the command line switches may be appended directly |
34
04a3cdadc50c
improved hyphenation and pagination
markus schnalke <meillo@marmaro.de>
parents:
33
diff
changeset
|
105 to them or separated by white\%space.) |
27
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
106 .PP |
33
a1589fcfe9f4
spell-checking plus a clarification thanks to Francesc
markus schnalke <meillo@marmaro.de>
parents:
32
diff
changeset
|
107 The field mode is suited for simple tabular data, like the |
31
106609b64dc4
minor corrections and improvements in the text
markus schnalke <meillo@marmaro.de>
parents:
30
diff
changeset
|
108 password file. Beyond that, it soon reaches its limits. The typical |
28
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
109 case of whitespace-separated fields, in particular, is covered |
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
110 poorly by it. Cut's delimiter is exactly one character, |
29
c0b522e689bc
Some more minor rework based on Kate's comments
markus schnalke <meillo@marmaro.de>
parents:
28
diff
changeset
|
111 so one cannot split at both space and tab characters. |
27
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
112 Furthermore, multiple adjacent delimiter characters lead to |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
113 empty fields. This is not the expected behavior for |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
114 the processing of whitespace-separated fields. Some |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
115 implementations, e.g. the one of FreeBSD, have extensions that |
29
c0b522e689bc
Some more minor rework based on Kate's comments
markus schnalke <meillo@marmaro.de>
parents:
28
diff
changeset
|
116 handle this case in the expected way. On other systems or |
c0b522e689bc
Some more minor rework based on Kate's comments
markus schnalke <meillo@marmaro.de>
parents:
28
diff
changeset
|
117 to stay portable, awk comes to the rescue. |
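The difference in the handling of adjacent delimiters can be seen in a small experiment. This is a sketch with a made-up input line (two spaces between the first two words); the variable names are hypothetical.

```shell
# Made-up sample line: two spaces between "alpha" and "beta".
line='alpha  beta gamma'

# cut treats every single space as a delimiter, so field 2 is the
# empty string between the two adjacent spaces.
cut_field=$(printf '%s\n' "$line" | cut -d' ' -f2)

# awk, by default, splits at runs of blanks (spaces and tabs),
# so field 2 is the second word.
awk_field=$(printf '%s\n' "$line" | awk '{ print $2 }')

printf 'cut: [%s]\nawk: [%s]\n' "$cut_field" "$awk_field"
# prints:
# cut: []
# awk: [beta]
```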
27
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
118 .PP |
28
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
119 Awk provides another feature that cut lacks: changing the order |
27
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
120 of the fields in the output. For cut, the order of the field |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
121 selection specification is irrelevant; it doesn't even matter if |
28
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
122 fields occur multiple times. Thus, the invocation |
27
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
123 \f(CWcut -c 5-8,1,4-6\fP outputs the characters number |
28
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
124 1, 4, 5, 6, 7, and 8 in exactly this order. The |
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
125 selection specification resembles mathematical set theory: Each |
27
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
126 specified field is part of the solution set. The fields in the |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
127 solution set are always in the same order as in the input. To |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
128 quote the man page of Version 8 Unix: |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
129 ``In data base parlance, it projects a relation.'' |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
130 .[[ http://man.cat-v.org/unix_8th/1/cut |
28
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
131 This means that cut applies the \fIprojection\fP database operation |
27
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
132 to the text input. Wikipedia explains it in the following way: |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
133 ``In practical terms, it can be roughly thought of as picking a |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
134 sub-set of all available columns.'' |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
135 .[[ https://en.wikipedia.org/wiki/Projection_(relational_algebra) |
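The set-like behavior of the selection list can be checked with a short experiment. The following sketch uses made-up input (the string and the passwd-style line are assumptions for illustration) and contrasts cut with awk, which does honor the requested order.

```shell
# Made-up input: ten distinct characters, so positions are easy to read off.
input='abcdefghij'

# The list 5-8,1,4-6 is unordered and overlapping; cut outputs
# characters 1, 4, 5, 6, 7, and 8 in input order.
selected=$(printf '%s\n' "$input" | cut -c 5-8,1,4-6)

# Made-up passwd-style line with four colon-separated fields.
entry='root:x:0:0'

# Asking cut for field 3 before field 1 changes nothing: input order wins.
cut_out=$(printf '%s\n' "$entry" | cut -d: -f3,1)

# awk prints the fields in whatever order the program names them.
awk_out=$(printf '%s\n' "$entry" | awk -F: '{ print $3 ":" $1 }')

printf '%s\n%s\n%s\n' "$selected" "$cut_out" "$awk_out"
# prints:
# adefgh
# root:0
# 0:root
```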
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
136 |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
137 .SH |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
138 Historical Background |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
139 .LP |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
140 Cut came to public life in 1982 with the release of UNIX System |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
141 III. Browsing through the sources of System III, one finds cut.c |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
142 with the timestamp 1980-04-11 |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
143 .[[ http://minnie.tuhs.org/cgi-bin/utree.pl?file=SysIII/usr/src/cmd . |
28
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
144 This is the oldest implementation of the program I was able to |
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
145 discover. However, the SCCS-ID in the source code contains the |
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
146 version number 1.5. According to Doug McIlroy |
27
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
147 .[[ http://minnie.tuhs.org/pipermail/tuhs/2015-May/004083.html , |
28
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
148 the earlier history likely lies in PWB/UNIX, which was the |
27
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
149 basis for System III. In the available sources of PWB 1.0 (1977) |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
150 .[[ http://minnie.tuhs.org/Archive/PDP-11/Distributions/usdl/ , |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
151 no cut is present. Of PWB 2.0, no sources or useful documentation |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
152 seem to be available. PWB 3.0 was later renamed to System III |
28
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
153 for marketing purposes only; it is otherwise identical to it. A |
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
154 branch of PWB was CB UNIX, which was only used at Bell Labs |
27
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
155 internally. The manual of CB UNIX Edition 2.1 of November 1979 |
28
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
156 contains the earliest mention of cut that my research brought |
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
157 to light, in the form of a man page |
27
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
158 .[[ ftp://sunsite.icm.edu.pl/pub/unix/UnixArchive/PDP-11/Distributions/other/CB_Unix/cbunix_man1_02.pdf . |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
159 .PP |
28
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
160 A look at BSD: There, my earliest discovery is a cut.c with |
27
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
161 the file modification date of 1986-11-07 |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
162 .[[ http://minnie.tuhs.org/cgi-bin/utree.pl?file=4.3BSD-UWisc/src/usr.bin/cut |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
163 as part of the special version 4.3BSD-UWisc |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
164 .[[ http://gunkies.org/wiki/4.3_BSD_NFS_Wisconsin_Unix , |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
165 which was released in January 1987. |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
166 This implementation is mostly identical to the one in System |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
167 III. The better known 4.3BSD-Tahoe (1988) does not contain cut. |
28
0d7329867dd1
Applied most of the corrections by Kate
markus schnalke <meillo@marmaro.de>
parents:
27
diff
changeset
|
168 The subsequent 4.3BSD-Reno (1990) does include cut. It is a freshly |
27
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
169 written one by Adam S. Moskowitz and Marciano Pitargue, which was |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
170 included in BSD in 1989 |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
171 .[[ http://minnie.tuhs.org/cgi-bin/utree.pl?file=4.3BSD-Reno/src/usr.bin/cut . |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
172 Its man page |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
173 .[[ http://minnie.tuhs.org/cgi-bin/utree.pl?file=4.3BSD-Reno/src/usr.bin/cut/cut.1 |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
174 already mentions the expected compliance with POSIX.2. |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
175 One should note that POSIX.2 was first published in |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
176 September 1992, about two years after the man page and the |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
177 program were written. Hence, the program must have been |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
178 implemented based on a draft version of the standard. A look into |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
179 the code confirms the assumption. The function to parse the field |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
180 selection includes the following comment: |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
181 .QP |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
182 This parser is less restrictive than the Draft 9 POSIX spec. |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
183 POSIX doesn't allow lists that aren't in increasing order or |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
184 overlapping lists. |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
185 .LP |
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
Draft 11.2 of POSIX (1991-09) already requires this flexibility:
.QP
The elements in list can be repeated, can overlap, and can
be specified in any order.
.LP
The same draft additionally includes all three operation modes,
whereas this early BSD cut implemented only the original two.
Draft 9 might not have included the byte mode. Without access to
Draft 9 or 10, it wasn't possible to verify this guess.
.PP
The version numbers and change dates of the older BSD
implementations are recorded in the SCCS IDs, which the
version control system of that time inserted. For instance,
in 4.3BSD-Reno: ``5.3 (Berkeley) 6/24/90''.
.PP
The cut implementation of the GNU coreutils contains the
following copyright notice:
.CS
Copyright (C) 1997-2015 Free Software Foundation, Inc.
Copyright (C) 1984 David M. Ihnat
.CE
.LP
This code does have old origins. Further comments show that
the source code was reworked by David MacKenzie first and later
by Jim Meyering, who put it into the version control system in
1992. It is unclear why the years before 1997, at least those from
1992 onwards, don't show up in the copyright notice.
.PP
Despite all those year numbers from the 80s, cut is a rather
young tool, at least in relation to early Unix. Although cut
is a decade older than Linux (the kernel), Unix had already existed
for over ten years by the time cut first appeared.
Most notably, cut wasn't part of Version 7 Unix, which
became the basis for all modern Unix systems. The more complex
tools sed and awk were part of it already. Hence, the
question arises why cut was written at all, given that two
programs already existed that were able to cover its use
cases. One reason for cut surely was its compactness and the
resulting speed, in comparison to the then-bulky awk. This lean
shape goes well with the Unix philosophy: Do one job and do it
well! Cut was sufficiently convincing. It found its way to
other Unix variants, it became standardized, and today it is
present everywhere.
.PP
The original variant (without \f(CW-b\fP) was already described
in 1985 by the System V Interface Definition, an important
formal description of UNIX System V. In the following years, it
appeared in all relevant standards. POSIX.2 specified cut for
the first time in its modern form (with \f(CW-b\fP) in 1992.

.pl -1v
.SH
Multi-byte support
.LP
The byte mode and thus the multi-byte support of the POSIX
character mode have been standardized since 1992. But are
they present in the available implementations? Which versions
implement POSIX correctly?
.PP
The implementations fall into three groups: There are historic
implementations, which have only \f(CW-c\fP and \f(CW-f\fP.
Then there are implementations that have \f(CW-b\fP, but
treat it merely as an alias for \f(CW-c\fP. These
implementations work correctly for single-byte encodings
(e.g. US-ASCII, Latin1) but for multi-byte en\%codings (e.g.
UTF-8) their \f(CW-c\fP behaves like \f(CW-b\fP (and
\f(CW-n\fP is ignored). Finally, there are implementations
that implement \f(CW-c\fP and \f(CW-b\fP in a POSIX-compliant
way.
.PP
Historic two-mode implementations are those of
System III and System V, and the BSD ones until the mid-90s.
.PP
Pseudo multi-byte implementations are provided by GNU,
modern NetBSD, and modern OpenBSD. The level of POSIX compliance
that they claim is often higher than the level of
compliance that they actually provide. Sometimes it takes a
close look to discover that \f(CW-c\fP and \f(CW-n\fP don't
behave as expected. Some of the implementations take the
easy way by simply ignoring multi-byte
encodings; at least they declare that clearly:
.QP
Since we don't support multi-byte characters, the \f(CW-c\fP
and \f(CW-b\fP options are equivalent, and the \f(CW-n\fP
option is meaningless.
.[[ http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/usr.bin/cut/cut.c?rev=1.18&content-type=text/x-cvsweb-markup
.LP
Standard-adhering implementations, i.e. ones that treat
multi-byte characters correctly, are those of modern
FreeBSD and the Heirloom Toolchest. Tim Robbins
reimplemented the character mode of FreeBSD cut,
conforming to POSIX, in the summer of 2004
.[[ https://svnweb.freebsd.org/base?view=revision&revision=131194 .
Why the other BSD systems have not
integrated this change is an open question. Maybe the answer is
a general disregard for internationalization.
.PP
How do users find out whether the cut on their own system handles
multi-byte characters correctly? First, one needs to check whether
the system itself uses multi-byte characters, because otherwise
characters and bytes are equivalent and the question
is irrelevant. One can check this by looking at the locale
settings, but it is easier to print a typical multi-byte
character, for instance an umlaut or the euro currency
symbol, and check whether one byte or several are generated as
output:
.CS
$ echo ä | od -c
0000000 303 244 \\n
0000003
.CE
.LP
In this case, it resulted in two bytes: octal 303 and 244. (The
newline character is added by echo.)
.PP
The program iconv converts text to specific encodings. This
is the output for Latin1 and UTF-8, for comparison:
.CS
$ echo ä | iconv -t latin1 | od -c
0000000 344 \\n
0000002
.sp .3
$ echo ä | iconv -t utf8 | od -c
0000000 303 244 \\n
0000003
.CE
.LP
On many European systems, the output without the iconv conversion
equals one of these two.
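.PP
The locale-based check mentioned earlier could look roughly like
the following sketch. It assumes that the locale utility is
installed and that UTF-8 is the only multi-byte codeset of
interest. The helper name is_utf8_codeset is made up for this
illustration.

```shell
# Hypothetical helper: report whether a codeset name denotes UTF-8.
# Other multi-byte codesets (e.g. EUC-JP) are deliberately not
# covered by this sketch.
is_utf8_codeset() {
    case "$1" in
    UTF-8|utf-8|UTF8|utf8) echo yes ;;
    *) echo no ;;
    esac
}

# "locale charmap" prints the codeset of the current locale.
# The "|| true" keeps the sketch going if locale is missing.
codeset=$(locale charmap 2>/dev/null || true)
echo "codeset: ${codeset:-unknown}, UTF-8: $(is_utf8_codeset "$codeset")"
```

.LP
If the reported codeset is a single-byte one, the distinction
between characters and bytes does not matter for cut.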
.PP
Now for the test of the cut implementation. On a UTF-8 system, a
POSIX-compliant implementation behaves as follows:
.CS
$ echo ä | cut -c 1 | od -c
0000000 303 244 \\n
0000003
.sp .3
$ echo ä | cut -b 1 | od -c
0000000 303 \\n
0000002
.sp .3
$ echo ä | cut -b 1 -n | od -c
0000000 \\n
0000001
.CE
.LP
A pseudo-POSIX implementation, in contrast, behaves like the
middle example for all three invocations: only the first byte is
printed as output.
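.PP
The first of these tests can be turned into a small script.
The following is only a sketch: the function names cut_c_bytes
and classify_cut are invented for this illustration, and the
result is meaningful only when the script runs in a UTF-8 locale.
In a single-byte locale, even a compliant cut correctly prints
just one byte.

```shell
# Hypothetical helper: number of bytes that "cut -c 1" emits for
# the two-byte UTF-8 character ä (octal 303 244) plus newline.
# tr removes the padding that some wc implementations add.
cut_c_bytes() {
    printf '\303\244\n' | cut -c 1 | wc -c | tr -d ' '
}

# Hypothetical helper: interpret that byte count.
classify_cut() {
    case "$1" in
    3) echo "character-aware" ;;  # both bytes of ä plus newline
    2) echo "byte-oriented" ;;    # only the first byte plus newline
    *) echo "unexpected" ;;
    esac
}

classify_cut "$(cut_c_bytes)"
```

.LP
Writing the character as octal escapes keeps the script itself
independent of the encoding of the file it is stored in.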

.SH
Implementations
.LP
Let's take a look at the sources of a selection of
implementations.
.PP
Comparing the amount of source code gives a first
impression. Typically, code grows over time. This can generally
be seen here, but not in all cases. A POSIX-compliant
implementation of the character mode requires more code; thus
these implementations tend to be the larger ones.
.TS
center;
r r r l l l.
SLOC	Lines	Bytes	Belongs to	File time	Category
_
116	123	2966	System III	1980-04-11	historic
118	125	3038	4.3BSD-UWisc	1986-11-07	historic
200	256	5715	4.3BSD-Reno	1990-06-25	historic
200	270	6545	NetBSD	1993-03-21	historic
218	290	6892	OpenBSD	2008-06-27	pseudo-POSIX
224	296	6920	FreeBSD	1994-05-27	historic
232	306	7500	NetBSD	2014-02-03	pseudo-POSIX
340	405	7423	Heirloom	2012-05-20	POSIX
382	586	14175	GNU coreutils	1992-11-08	pseudo-POSIX
391	479	10961	FreeBSD	2012-11-24	POSIX
588	830	23167	GNU coreutils	2015-05-01	pseudo-POSIX
.TE
.LP
There are four rough groups: (1) The two original
implementations, which are mostly identical, with about 100
SLOC. (2) The five BSD versions, with about 200 SLOC. (3) The
two POSIX-compliant versions and the old GNU one, with a SLOC
count in the 300s. And finally, (4) the modern GNU cut with
almost 600 SLOC.
.PP
The ratio between the number of
newlines in the file (\f(CWwc -l\fP) and the number of logical code
lines (SLOC, measured with SLOCcount) ranges from a factor of
1.06 for the oldest versions to a factor of 1.5 for GNU. The
largest influences on it are empty lines, pure comment lines,
and the size of the license block at the beginning of the file.
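.PP
The factors can be recomputed from the table above, for example
with awk. This is a minimal sketch; the function name ratios is
made up here, and the three input rows are the System III row and
the two GNU coreutils rows of the table.

```shell
# Hypothetical helper: print the lines-per-SLOC factor for each
# "SLOC lines bytes" record read from standard input.
ratios() {
    awk '{ printf "%.2f\n", $2 / $1 }'
}

# Rows copied from the table: SLOC, lines, bytes.
ratios <<'EOF'
116 123 2966
382 586 14175
588 830 23167
EOF
```

.LP
For these rows the sketch prints 1.06, 1.53, and 1.41.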
5cefcfc72d42
Added first version of the translation to English
markus schnalke <meillo@marmaro.de>
parents:
diff
changeset
|
.PP
As for the ratio of file size (\f(CWwc -c\fP) to logical code
lines, most implementations lie between
25 and 30 bytes per statement. With only 21 bytes per
statement, the Heirloom implementation marks the lower end;
the GNU implementation sets the upper limit at nearly 40 bytes. In
the case of GNU, the reason is mainly its coding style, with
special indentation rules and long identifiers. Whether one finds
the Heirloom implementation
.[[ http://heirloom.cvs.sourceforge.net/viewvc/heirloom/heirloom/cut/cut.c?revision=1.6&view=markup
highly cryptic or exceptionally elegant shall be left
to the judgement of the reader. The
comparison to the GNU implementation
.[[ http://git.savannah.gnu.org/gitweb/?p=coreutils.git;a=blob;f=src/cut.c;hb=e981643
is particularly impressive.
.PP
The internal structure of the source code (in all cases it is
written in C) is largely similar. Besides the mandatory main
function, which does the command line argument processing,
there usually is a function to convert the field
selection specification to an internal data structure.
Furthermore, almost all implementations have separate
functions for each of their operation modes. The POSIX-compliant
versions treat the \f(CW-b -n\fP combination as a separate
mode and thus implement it in a separate function. Only the early
System III implementation (and its 4.3BSD-UWisc variant) does
everything, apart from error handling, in the main function.
.PP
Implementations of cut typically have two limiting aspects:
one is the maximum number of fields that can be handled,
the other is the maximum line length. On System III, both
numbers are limited to 512. 4.3BSD-Reno and the BSDs of the
90s have fixed limits as well (\f(CW_BSD_LINE_MAX\fP or
\f(CW_POSIX2_LINE_MAX\fP). Modern FreeBSD, modern NetBSD, all GNU
implementations, and the Heirloom cut are able to handle
arbitrary numbers of fields and line lengths \(en the memory
is allocated dynamically. OpenBSD cut is a hybrid: it has a fixed
maximum number of fields, but allows arbitrary line lengths.
The limited number of fields does not, however, appear to be
a practical problem, because \f(CW_POSIX2_LINE_MAX\fP is
guaranteed to be at least 2048 and is thus probably large enough.
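.PP
The two limits can be probed from the shell.
The following is only a rough sketch: the field count of 5000,
the colon delimiter, and the use of awk to build the input are
arbitrary choices made for illustration.
An implementation that allocates its memory dynamically
is expected to print the last field; one with fixed limits
may fail or truncate instead.
.DS
```shell
# Build one input line with 5000 colon-separated fields (1:2:...:5000).
# The line is roughly 24 kB long, well above a 2048-byte line limit.
line=$(awk 'BEGIN { for (i = 1; i < 5000; i++) printf "%d:", i; print 5000 }')

# Ask cut for the last field; an implementation without fixed limits
# should print 5000 here.
last=$(echo "$line" | cut -d: -f5000)
echo "$last"
```
.DE
.LP
Raising the field count or the field width stresses the
line length limit more than the field limit, and vice versa.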

.SH
Descriptions
.LP
A comparison of the short descriptions
of cut, as found in the headlines of the man
pages or at the beginning of the source code files,
is interesting as well.
The following list is roughly grouped by origin:
.TS
center;
l l.
CB UNIX	cut out selected fields of each line of a file
System III	cut out selected fields of each line of a file
System III \(dg	cut and paste columns of a table (projection of a relation)
System V	cut out selected fields of each line of a file
HP-UX	cut out (extract) selected fields of each line of a file
.sp .3
4.3BSD-UWisc \(dg	cut and paste columns of a table (projection of a relation)
4.3BSD-Reno	select portions of each line of a file
NetBSD	select portions of each line of a file
OpenBSD 4.6	select portions of each line of a file
FreeBSD 1.0	select portions of each line of a file
FreeBSD 10.0	cut out selected portions of each line of a file
SunOS 4.1.3	remove selected fields from each line of a file
SunOS 5.5.1	cut out selected fields of each line of a file
.sp .3
Heirloom Tools	cut out selected fields of each line of a file
Heirloom Tools \(dg	cut out fields of lines of files
.sp .3
GNU coreutils	remove sections from each line of files
.sp .3
Minix	select out columns of a file
.sp .3
Version 8 Unix	rearrange columns of data
``Unix Reader''	rearrange columns of text
.sp .3
POSIX	cut out selected fields of each line of a file
.TE
.LP
(The descriptions that are marked with `\(dg' were taken from
source code files. The POSIX entry contains the description
used in the standard. The ``Unix Reader'' is a retrospective
document by Doug McIlroy, which lists the availability of
tools in the Research Unix versions
.[[ http://doc.cat-v.org/unix/unix-reader/contents.pdf .
Its description should actually match the one in Version 8
Unix. The difference could be a transcription mistake or a correction.
All other descriptions originate from the various man pages.)
.PP
Over time, the POSIX description was often adopted or
served as inspiration. One such example is FreeBSD
.[[ https://svnweb.freebsd.org/base?view=revision&revision=167101 .
.PP
It is noteworthy that the GNU coreutils in all versions
describe the performed action as a removal of parts of the
input, although the user clearly selects the parts that then
constitute the output. Probably the words ``cut out'' are too
misleading. HP-UX tried to be clearer.
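.PP
A short shell session makes the point concrete.
It is a minimal sketch with made-up sample data;
the variable names are chosen only for this illustration.
The field list given to \f(CW-f\fP names what is kept, not what is dropped.
.DS
```shell
# Three colon-separated fields per line; the sample data is made up.
input='root:x:0
daemon:x:1'

# The field list names what is kept: fields 1 and 3 form the output,
# field 2 is what gets dropped.
selected=$(echo "$input" | cut -d: -f1,3)
echo "$selected"
```
.DE
.LP
GNU cut additionally offers \f(CW--complement\fP, which does drop
the listed fields; that option is not part of POSIX.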
.PP
Different terms are also used for the part being
selected. Some talk about fields (POSIX), some talk
about portions (BSD), and some call them columns (Research
Unix).
.PP
The seemingly least adequate description, the one of Version
8 Unix (``rearrange columns of data''), can be explained by
the fact that the man page covers both cut and paste, and in
their combination, columns can be rearranged. The use of
``data'' instead of ``text'' might be a lapse, which McIlroy
corrected in his Unix Reader ... but on the other hand, on
Unix, the two words are mostly synonymous, because all data
is text.
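.PP
How cut and paste together rearrange columns can be sketched
as follows. The sample table, the file names, and the use of
\f(CWmktemp -d\fP for a scratch directory are assumptions made
for this illustration only.
.DS
```shell
# Work in a scratch directory; the file names are arbitrary.
tmp=$(mktemp -d)
{
  echo 'alice:42'
  echo 'bob:17'
} > "$tmp/table"

# Cut each column into its own file ...
cut -d: -f1 "$tmp/table" > "$tmp/names"
cut -d: -f2 "$tmp/table" > "$tmp/numbers"

# ... and paste them back together in the opposite order.
swapped=$(paste -d: "$tmp/numbers" "$tmp/names")
echo "$swapped"
rm -r "$tmp"
```
.DE
.LP
The detour through paste is needed because cut alone writes the
selected fields in the order of the input, regardless of the
order in which they appear in the list.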


.SH
References
.LP
.nf
._r
