# HG changeset patch
# User markus schnalke
# Date 1442330420 -7200
# Node ID 106609b64dc4f522fe12fd3847c9725a874e2b1c
# Parent 6977e2ee5dc503cd7deea3c5e282620fbb9c9368
minor corrections and improvements in the text
diff -r 6977e2ee5dc5 -r 106609b64dc4 cut.en.ms
--- a/cut.en.ms Sat Sep 12 13:13:21 2015 +0200
+++ b/cut.en.ms Tue Sep 15 17:20:20 2015 +0200
@@ -90,7 +90,7 @@
 \f(CW-d\fP.
 .PP
 The typical example for the use of cut's field mode is the
-selection of information from the passwd file. Here, for
+selection of information from the password file. Here, for
 instance, the usernames and their uids:
 .CS
 $ cut -d: -f1,3 /etc/passwd
@@ -105,7 +105,7 @@
 to them or separated by whitespace.)
 .PP
 The field mode is suited for simple tabulary data, like the
-passwd file. Beyond that, it soon reaches its limits. The typical
+password file. Beyond that, it soon reaches its limits. The typical
 case of whitespace-separated fields, in particular, is covered
 poorly by it. Cut's delimiter is exactly one character,
 therefore one can not split at both space and tab characters.
@@ -205,7 +205,7 @@
 Copyright (C) 1984 David M. Ihnat
 .CE
 .LP
-The code does have old origins. Further comments show that
+This code does have old origins. Further comments show that
 the source code was reworked by David MacKenzie first and later
 by Jim Meyering, who put it into the version control system in
 1992. It is unclear why the years until 1997, at least from
@@ -214,13 +214,13 @@
 Despite all those year numbers from the 80s, cut is a rather
 young tool, at least in relation to the early Unix. Despite
 being a decade older than Linux (the kernel), Unix was present
-for over ten years by the time cut appeared for the first
+for over ten years already by the time cut appeared for the first
 time. Most notably, cut wasn't part of Version 7 Unix, which
 became the basis for all modern Unix systems. The more complex
 tools sed and awk were part of it already.
 Hence, the question comes to mind why cut was written at all, as two
-programs already existed that were able to cover the use cases of
-cut. One reason for cut surely was its compactness and the
+programs already existed that were able to cover its use
+cases. One reason for cut surely was its compactness and the
 resulting speed, in comparison to the then-bulky awk. This lean
 shape goes well with the Unix philosopy: Do one job and do it
 well! Cut was sufficiently convincing. It found its way to
@@ -253,8 +253,7 @@
 way.
 .PP
 Historic two-mode implementations are the ones of
-System III, System V, and the BSD ones from the beginning
-until the mid-90s.
+System III, System V, and the BSD ones until the mid-90s.
 .PP
 Pseudo multi-byte implementations are provided by GNU, modern
 NetBSD, and modern OpenBSD. The level of POSIX compliance
@@ -270,9 +269,9 @@
 option is meaningless.
 .[[ http://cvsweb.openbsd.org/cgi-bin/cvsweb/src/usr.bin/cut/cut.c?rev=1.18&content-type=text/x-cvsweb-markup
 .LP
-Standard-adhering implementations, ones that treat
-multi-byte characters correctly, are the one of the modern
-FreeBSD and the one in the Heirloom toolchest. Tim Robbins
+Standard-adhering implementations, i.e. ones that treat
+multi-byte characters correctly, are those of the modern
+FreeBSD and the Heirloom toolchest. Tim Robbins
 reimplemented the character mode of FreeBSD cut, conforming to
 POSIX, in the summer of 2004
 .[[ https://svnweb.freebsd.org/base?view=revision&revision=131194 .
@@ -280,7 +279,7 @@
 integrated this change is an open one. Maybe the answer an
 be found in the above quoted statement.
 .PP
-How does a user find out if the cut on their own system handles
+How do users find out if the cut on their own system handles
 multi-byte characters correctly? First, one needs to check
 if the system itself uses multi-byte characters, because
 otherwise characters and bytes are equivalent and the question
@@ -347,7 +346,7 @@
 .TS
 center;
 r r r l l l.
-SLOC Lines Bytes Belongs to File tyime Category
+SLOC Lines Bytes Belongs to File time Category
 _
 116 123 2966 System III 1980-04-11 historic
 118 125 3038 4.3BSD-UWisc 1986-11-07 historic
@@ -362,7 +361,7 @@
 588 830 23167 GNU coreutils 2015-05-01 pseudo-POSIX
 .TE
 .LP
-Roughly four groups can be seen: (1) The two original
+There are four rough groups: (1) The two original
 implementations, which are mostly identical, with about 100
 SLOC. (2) The five BSD versions, with about 200 SLOC. (3) The
 two POSIX-compliant versions and the old GNU one, with a SLOC