comparison cut.en.ms @ 29:c0b522e689bc

Some more minor rework based on Kate's comments
author markus schnalke <meillo@marmaro.de>
date Sat, 12 Sep 2015 12:18:52 +0200
parents 0d7329867dd1
children 6977e2ee5dc5
28:0d7329867dd1 29:c0b522e689bc
106 .PP 106 .PP
107 The field mode is suited for simple tabulary data, like the 107 The field mode is suited for simple tabulary data, like the
108 passwd file. Beyond that, it soon reaches its limits. The typical 108 passwd file. Beyond that, it soon reaches its limits. The typical
109 case of whitespace-separated fields, in particular, is covered 109 case of whitespace-separated fields, in particular, is covered
110 poorly by it. Cut's delimiter is exactly one character, 110 poorly by it. Cut's delimiter is exactly one character,
111 therefore one may not split at both space and tab characters. 111 therefore one can not split at both space and tab characters.
112 Furthermore, multiple adjacent delimiter characters lead to 112 Furthermore, multiple adjacent delimiter characters lead to
113 empty fields. This is not the expected behavior for 113 empty fields. This is not the expected behavior for
114 the processing of whitespace-separated fields. Some 114 the processing of whitespace-separated fields. Some
115 implementations, e.g. the one of FreeBSD, have extensions that 115 implementations, e.g. the one of FreeBSD, have extensions that
116 handle this case in the expected way. Apart from that, i.e. 116 handle this case in the expected way. On other systems or
117 if one likes to stay portable, awk comes to rescue. 117 to stay portable, awk comes to rescue.
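Editor's illustration (not part of the changeset): a minimal sketch of the empty-field behavior described in the paragraph above, assuming a POSIX shell with cut and awk. The sample line and the variable names are made up for the example.

```shell
# One line with two adjacent spaces between the first and second word.
line='alpha  beta gamma'

# cut: every single space is a delimiter, so field 2 is the empty
# field between the two adjacent spaces.
cut_field=$(printf '%s\n' "$line" | cut -d ' ' -f 2)

# awk: with the default field separator, runs of blanks and tabs
# together separate fields, so field 2 is the second word.
awk_field=$(printf '%s\n' "$line" | awk '{ print $2 }')

printf 'cut: [%s]\nawk: [%s]\n' "$cut_field" "$awk_field"
```

With cut the second field comes out empty, while awk yields the second word, which is usually what one expects for whitespace-separated data.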
118 .PP 118 .PP
119 Awk provides another functionality that cut lacks: Changing the order 119 Awk provides another functionality that cut lacks: Changing the order
120 of the fields in the output. For cut, the order of the field 120 of the fields in the output. For cut, the order of the field
121 selection specification is irrelevant; it doesn't even matter if 121 selection specification is irrelevant; it doesn't even matter if
122 fields occur multiple times. Thus, the invocation 122 fields occur multiple times. Thus, the invocation
235 235
236 .SH 236 .SH
237 Multi-byte support 237 Multi-byte support
238 .LP 238 .LP
239 The byte mode and thus the multi-byte support of the POSIX 239 The byte mode and thus the multi-byte support of the POSIX
240 character mode have benn standardized since 1992. But 240 character mode have been standardized since 1992. But are
241 how about their presence in the available implementations? 241 they present in the available implementations? Which versions
242 Which versions implement POSIX correctly? 242 implement POSIX correctly?
243 .PP 243 .PP
244 The situation is divided into three parts: There are historic 244 The situation is divided into three parts: There are historic
245 implementations, which have only \f(CW-c\fP and \f(CW-f\fP. 245 implementations, which have only \f(CW-c\fP and \f(CW-f\fP.
246 Then there are implementations that have \f(CW-b\fP, but 246 Then there are implementations that have \f(CW-b\fP, but
247 treat it as an alias for \f(CW-c\fP only. These 247 treat it as an alias for \f(CW-c\fP only. These
250 UTF-8) their \f(CW-c\fP behaves like \f(CW-b\fP (and 250 UTF-8) their \f(CW-c\fP behaves like \f(CW-b\fP (and
251 \f(CW-n\fP is ignored). Finally, there are implementations 251 \f(CW-n\fP is ignored). Finally, there are implementations
252 that implement \f(CW-c\fP and \f(CW-b\fP in a POSIX-compliant 252 that implement \f(CW-c\fP and \f(CW-b\fP in a POSIX-compliant
253 way. 253 way.
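Editor's illustration (not part of the changeset): a small sketch of the difference between byte and character selection, assuming a POSIX shell and a cut that has -b. The sample text is made up. Only the -b result is predictable; what -c prints depends on which of the three groups above the local implementation belongs to and on the locale, so no output is claimed for it.

```shell
# "h", "ä" and "x"; the "ä" occupies two bytes (0xC3 0xA4) in UTF-8.
text=$(printf 'h\303\244x')

# Byte mode counts bytes: 1-3 covers "h" and both bytes of "ä".
bytes=$(printf '%s\n' "$text" | cut -b 1-3)

# Character mode: a POSIX-compliant cut in a UTF-8 locale selects
# "h" and "ä" here; a cut whose -c behaves like -b selects "h" plus
# only the first byte of "ä".
chars=$(printf '%s\n' "$text" | cut -c 1-2)

# Show how many bytes each selection contains.
printf '%s' "$bytes" | wc -c
printf '%s' "$chars" | wc -c
```

If the second count is 2 instead of 3, the local -c is working on bytes.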
254 .PP 254 .PP
255 Historic two-mode implementations are the ones of 255 Historic two-mode implementations are the ones of
256 System III, System V, and the BSD ones until the mid-90s. 256 System III, System V, and the BSD ones from the beginning
257 until the mid-90s.
257 .PP 258 .PP
258 Pseudo multi-byte implementations are provided by GNU, 259 Pseudo multi-byte implementations are provided by GNU,
259 modern NetBSD, and modern OpenBSD. The level of POSIX compliance 260 modern NetBSD, and modern OpenBSD. The level of POSIX compliance
260 that is presented there is often higher than the level of 261 that is presented there is often higher than the level of
261 compliance that is actually provided. Sometimes it takes a 262 compliance that is actually provided. Sometimes it takes a