comparison cut.en.ms @ 33:a1589fcfe9f4
spell-checking plus a clarification thanks to Francesc
author | markus schnalke <meillo@marmaro.de> |
---|---|
date | Fri, 02 Oct 2015 07:01:20 +0200 |
parents | 5f78bcd34eeb |
children | 04a3cdadc50c |
comparison
32:5f78bcd34eeb | 33:a1589fcfe9f4 |
---|---|
71 $ cut -b -500 | 71 $ cut -b -500 |
72 .CE | 72 .CE |
73 .LP | 73 .LP |
74 The remainder can be caught with \f(CWcut -b 501-\fP. This | 74 The remainder can be caught with \f(CWcut -b 501-\fP. This |
75 use of cut is important for POSIX, because it provides a | 75 use of cut is important for POSIX, because it provides a |
76 transformation of text files with arbitrary line lenghts to text | 76 transformation of text files with arbitrary line lengths to text |
77 files with limited line length | 77 files with limited line length |
78 .[[ http://pubs.opengroup.org/onlinepubs/9699919799/utilities/cut.html#tag_20_28_17 . | 78 .[[ http://pubs.opengroup.org/onlinepubs/9699919799/utilities/cut.html#tag_20_28_17 . |
79 .PP | 79 .PP |
80 The introduction of the new byte mode essentially held the same | 80 The introduction of the new byte mode essentially held the same |
81 functionality as the old character mode. The character mode, | 81 functionality as the old character mode. The character mode, |
102 .CE | 102 .CE |
103 .LP | 103 .LP |
104 (The values to the command line switches may be appended directly | 104 (The values to the command line switches may be appended directly |
105 to them or separated by whitespace.) | 105 to them or separated by whitespace.) |
106 .PP | 106 .PP |
107 The field mode is suited for simple tabulary data, like the | 107 The field mode is suited for simple tabular data, like the |
108 password file. Beyond that, it soon reaches its limits. The typical | 108 password file. Beyond that, it soon reaches its limits. The typical |
109 case of whitespace-separated fields, in particular, is covered | 109 case of whitespace-separated fields, in particular, is covered |
110 poorly by it. Cut's delimiter is exactly one character, | 110 poorly by it. Cut's delimiter is exactly one character, |
111 therefore one can not split at both space and tab characters. | 111 therefore one can not split at both space and tab characters. |
112 Furthermore, multiple adjacent delimiter characters lead to | 112 Furthermore, multiple adjacent delimiter characters lead to |
220 tools sed and awk were part of it already. Hence, the | 220 tools sed and awk were part of it already. Hence, the |
221 question comes to mind why cut was written at all, as two | 221 question comes to mind why cut was written at all, as two |
222 programs already existed that were able to cover its use | 222 programs already existed that were able to cover its use |
223 cases. One reason for cut surely was its compactness and the | 223 cases. One reason for cut surely was its compactness and the |
224 resulting speed, in comparison to the then-bulky awk. This lean | 224 resulting speed, in comparison to the then-bulky awk. This lean |
225 shape goes well with the Unix philosopy: Do one job and do it | 225 shape goes well with the Unix philosophy: Do one job and do it |
226 well! Cut was sufficiently convincing. It found its way to | 226 well! Cut was sufficiently convincing. It found its way to |
227 other Unix variants, it became standardized, and today it is | 227 other Unix variants, it became standardized, and today it is |
228 present everywhere. | 228 present everywhere. |
229 .PP | 229 .PP |
230 The original variant (without \f(CW-b\fP) was described already | 230 The original variant (without \f(CW-b\fP) was described already |
274 FreeBSD and the Heirloom toolchest. Tim Robbins | 274 FreeBSD and the Heirloom toolchest. Tim Robbins |
275 reimplemented the character mode of FreeBSD cut, | 275 reimplemented the character mode of FreeBSD cut, |
276 conforming to POSIX, in the summer of 2004 | 276 conforming to POSIX, in the summer of 2004 |
277 .[[ https://svnweb.freebsd.org/base?view=revision&revision=131194 . | 277 .[[ https://svnweb.freebsd.org/base?view=revision&revision=131194 . |
278 The question why the other BSD systems have not | 278 The question why the other BSD systems have not |
279 integrated this change is an open one. Maybe the answer can be | 279 integrated this change is an open one. Maybe the answer is |
280 found in the above quoted statement. | 280 a general ignorance of internationalization. |
281 .PP | 281 .PP |
282 How do users find out if the cut on their own system handles | 282 How do users find out if the cut on their own system handles |
283 multi-byte characters correctly? First, one needs to check if | 283 multi-byte characters correctly? First, one needs to check if |
284 the system itself uses multi-byte characters, because otherwise | 284 the system itself uses multi-byte characters, because otherwise |
285 characters and bytes are equivalent and the question | 285 characters and bytes are equivalent and the question |
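
The byte-mode and field-mode behaviour discussed in the hunks above can be tried out with a small sketch. It assumes a POSIX shell and a POSIX cut; the offset 5 stands in for the 500 used in the text, and the sample strings and variable names are invented for illustration only.

```sh
# Byte mode: split a line at a fixed byte offset.
# (5 is an illustrative stand-in for the 500 used in the text.)
line='abcdefghij'
head_part=$(printf '%s\n' "$line" | cut -b -5)   # bytes 1 to 5
tail_part=$(printf '%s\n' "$line" | cut -b 6-)   # byte 6 to end of line

# Field mode: the delimiter is exactly one character, so two adjacent
# delimiters enclose an empty field instead of being merged.
one_space=$(printf 'a b c\n' | cut -d ' ' -f 2)
two_spaces=$(printf 'a  b c\n' | cut -d ' ' -f 2)   # two spaces after "a"

printf 'head=[%s] tail=[%s]\n' "$head_part" "$tail_part"
printf 'one=[%s] two=[%s]\n' "$one_space" "$two_spaces"
```

With two spaces after the first field, the second field is empty, which illustrates why whitespace-separated input with variable padding is usually better handled by a tool such as awk.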