Unique Word Counter Needed
Jan Wolitzky
wolit at mhuxd.UUCP
Wed Dec 11 13:40:17 AEST 1985
> I need a way to count unique words in a document.
> Does any one have suggestions on a simple way to do this?
Try:
deroff -w filename | dd conv=lcase 2>/dev/null | sort -u | wc -l
"deroff -w" breaks the file up into single words, one per line.
"dd" converts everything to lower case (so "word" and "Word" count as
the same thing). ("dd" is verbose, so I redirect stderr.)
"sort -u" keeps just one copy of each line.
"wc -l" counts the lines.
If you're going to run this frequently, stick it in a file, make it
executable, replace "filename" with "$*" so you can pass it file names
as arguments, and you're off.
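A minimal sketch of that wrapper, written here as a shell function so it can be tried at the prompt; the same one-line body works in an executable file. The name uniqwords is hypothetical. I've used a quoted "$@" in place of $*, which is my substitution rather than what the post specifies: it passes each file name through as a separate argument even if the name contains spaces.

```shell
# Hypothetical wrapper: count unique words across all files named as arguments.
uniqwords() {
    # "$@" expands to each argument as its own word, so every file name
    # reaches deroff intact.
    deroff -w "$@" | dd conv=lcase 2>/dev/null | sort -u | wc -l
}
```

Usage would look like "uniqwords chapter1 chapter2", which prints a single count of distinct words across both files.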
--
Jan Wolitzky, AT&T Bell Labs, Murray Hill, NJ; 201 582-2998; mhuxd!wolit
(Affiliation given for identification purposes only)