annotate stuff/wordfreq @ 57:eb0815f21f04

added some auxiliary files: e.g. statistics
author markus schnalke <meillo@marmaro.de>
date Mon, 20 Oct 2014 07:09:57 +0200
#!/bin/sh
#
# print word frequency
#
# Output: word, relative frequency, count (tab-separated, most frequent first)

# strip roff markup from the input files
deroff "$@" |
# one word per line, then fold to lower case
tr -c 'A-Za-zÄÖÜäöüß-' '\n' | tr A-ZÄÖÜ a-zäöü |
# drop empty lines, count identical words
sed '/^ *$/d' | sort | uniq -c | awk '
{ sum += $1; a[$2] = $1 }
END { for (x in a) printf("%s\t%.2f\t%4d\n", x, a[x]/sum, a[x]) }
' | sort -nr -k 3
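
The counting pipeline can be tried on plain text without roff sources. The following is only a sketch under stated assumptions: `wordfreq_ascii` is a hypothetical helper that is not part of the repository, it reads stdin instead of calling deroff, and it is restricted to ASCII letters so the result does not depend on the locale. It also adds the word as a second sort key so that ties come out in a fixed order.

```shell
#!/bin/sh
# Hypothetical helper (not in the original script): the same counting
# pipeline, reading plain text on stdin, ASCII letters only.
wordfreq_ascii() {
	# one word per line, folded to lower case
	tr -c 'A-Za-z-' '\n' | tr 'A-Z' 'a-z' |
	# drop empty lines, count identical words
	sed '/^ *$/d' | sort | uniq -c | awk '
	{ sum += $1; a[$2] = $1 }
	END { for (x in a) printf("%s\t%.2f\t%4d\n", x, a[x]/sum, a[x]) }
	' |
	# highest count first; ties ordered by word
	LC_ALL=C sort -k 3,3nr -k 1,1
}

# ten words in total: "the" occurs three times, "and" twice
printf 'the cat and the dog and the bird ate seeds\n' | wordfreq_ascii
```

Each output line holds the word, its share of all counted words as a fraction between 0 and 1, and the absolute count.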