comparison cut.en.ms @ 29:c0b522e689bc
Some more minor rework based on Kate's comments
author | markus schnalke <meillo@marmaro.de> |
---|---|
date | Sat, 12 Sep 2015 12:18:52 +0200 |
parents | 0d7329867dd1 |
children | 6977e2ee5dc5 |
28:0d7329867dd1 | 29:c0b522e689bc |
---|---|
106 .PP | 106 .PP |
107 The field mode is suited for simple tabulary data, like the | 107 The field mode is suited for simple tabulary data, like the |
108 passwd file. Beyond that, it soon reaches its limits. The typical | 108 passwd file. Beyond that, it soon reaches its limits. The typical |
109 case of whitespace-separated fields, in particular, is covered | 109 case of whitespace-separated fields, in particular, is covered |
110 poorly by it. Cut's delimiter is exactly one character, | 110 poorly by it. Cut's delimiter is exactly one character, |
111 therefore one may not split at both space and tab characters. | 111 therefore one can not split at both space and tab characters. |
112 Furthermore, multiple adjacent delimiter characters lead to | 112 Furthermore, multiple adjacent delimiter characters lead to |
113 empty fields. This is not the expected behavior for | 113 empty fields. This is not the expected behavior for |
114 the processing of whitespace-separated fields. Some | 114 the processing of whitespace-separated fields. Some |
115 implementations, e.g. the one of FreeBSD, have extensions that | 115 implementations, e.g. the one of FreeBSD, have extensions that |
116 handle this case in the expected way. Apart from that, i.e. | 116 handle this case in the expected way. On other systems or |
117 if one likes to stay portable, awk comes to rescue. | 117 to stay portable, awk comes to rescue. |
118 .PP | 118 .PP |
119 Awk provides another functionality that cut lacks: Changing the order | 119 Awk provides another functionality that cut lacks: Changing the order |
120 of the fields in the output. For cut, the order of the field | 120 of the fields in the output. For cut, the order of the field |
121 selection specification is irrelevant; it doesn't even matter if | 121 selection specification is irrelevant; it doesn't even matter if |
122 fields occur multiple times. Thus, the invocation | 122 fields occur multiple times. Thus, the invocation |
235 | 235 |
236 .SH | 236 .SH |
237 Multi-byte support | 237 Multi-byte support |
238 .LP | 238 .LP |
239 The byte mode and thus the multi-byte support of the POSIX | 239 The byte mode and thus the multi-byte support of the POSIX |
240 character mode have benn standardized since 1992. But | 240 character mode have been standardized since 1992. But are |
241 how about their presence in the available implementations? | 241 they present in the available implementations? Which versions |
242 Which versions implement POSIX correctly? | 242 implement POSIX correctly? |
243 .PP | 243 .PP |
244 The situation is divided into three parts: There are historic | 244 The situation is divided into three parts: There are historic |
245 implementations, which have only \f(CW-c\fP and \f(CW-f\fP. | 245 implementations, which have only \f(CW-c\fP and \f(CW-f\fP. |
246 Then there are implementations that have \f(CW-b\fP, but | 246 Then there are implementations that have \f(CW-b\fP, but |
247 treat it as an alias for \f(CW-c\fP only. These | 247 treat it as an alias for \f(CW-c\fP only. These |
250 UTF-8) their \f(CW-c\fP behaves like \f(CW-b\fP (and | 250 UTF-8) their \f(CW-c\fP behaves like \f(CW-b\fP (and |
251 \f(CW-n\fP is ignored). Finally, there are implementations | 251 \f(CW-n\fP is ignored). Finally, there are implementations |
252 that implement \f(CW-c\fP and \f(CW-b\fP in a POSIX-compliant | 252 that implement \f(CW-c\fP and \f(CW-b\fP in a POSIX-compliant |
253 way. | 253 way. |
254 .PP | 254 .PP |
255 Historic two-mode implementations are the ones of | 255 Historic two-mode implementations are the ones of |
256 System III, System V, and the BSD ones until the mid-90s. | 256 System III, System V, and the BSD ones from the beginning |
257 until the mid-90s. | |
257 .PP | 258 .PP |
258 Pseudo multi-byte implementations are provided by GNU, | 259 Pseudo multi-byte implementations are provided by GNU, |
259 modern NetBSD, and modern OpenBSD. The level of POSIX compliance | 260 modern NetBSD, and modern OpenBSD. The level of POSIX compliance |
260 that is presented there is often higher than the level of | 261 that is presented there is often higher than the level of |
261 compliance that is actually provided. Sometimes it takes a | 262 compliance that is actually provided. Sometimes it takes a |