Unique Word Counter Needed

Jan Wolitzky wolit at mhuxd.UUCP
Wed Dec 11 13:40:17 AEST 1985


> I need a way to count unique words in a document.
> Does anyone have suggestions on a simple way to do this?

Try:

deroff -w filename | dd conv=lcase 2>/dev/null | sort -u | wc -l

"deroff -w" breaks the file up into single words, one per line.
"dd" converts everything to lower case (so "word" and "Word" count as
    the same thing). ("dd" is verbose, so I redirect stderr.)
"sort -u" keeps just one copy of each line.
"wc -l" counts the lines.
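
To see what each stage after "deroff" contributes, here is a small sketch
that feeds the rest of the pipeline a hand-made word list. The sample words
and the variable names are made up for illustration; they stand in for
whatever "deroff -w" would print for a real file.

```sh
#!/bin/sh
# Stand-in for the output of "deroff -w": one word per line.
words='The
cat
saw
the
Cat'

# Stage 1: fold to lower case. dd reports block counts on stderr,
# so that is thrown away.
printf '%s\n' "$words" | dd conv=lcase 2>/dev/null
# prints: the, cat, saw, the, cat (one per line)

# Stage 2: sort and keep one copy of each line.
printf '%s\n' "$words" | dd conv=lcase 2>/dev/null | sort -u
# prints: cat, saw, the (one per line)

# Stage 3: count the lines that are left.
count=$(printf '%s\n' "$words" | dd conv=lcase 2>/dev/null | sort -u | wc -l)
echo $count
# prints: 3
```

"The", "the" and "Cat", "cat" collapse only because the case folding runs
before "sort -u"; swap the order and the count would be 5.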

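If you're going to run this frequently, stick it in a file, make it
executable, and replace filename with "$@" (double quotes included) so you
can pass it file names as arguments, and you're off. Unquoted $* also works
for plain names, but it splits any file name that contains a space.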
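
A minimal sketch of such a script follows. The name "uniqwords" and the
helper "split_words" are hypothetical. It passes the arguments as "$@"
rather than $* so that each file name stays a single argument, and it
includes a "tr" fallback for systems without "deroff". The fallback is my
assumption, not part of the pipeline above, and it may split words a little
differently than "deroff -w" does.

```sh
#!/bin/sh
# uniqwords (hypothetical name): count distinct words in the named files.

# Break the input files into one word per line.
split_words() {
    if command -v deroff >/dev/null 2>&1; then
        deroff -w "$@"
    else
        # Assumed fallback: turn every run of non-letters into a newline,
        # then drop any empty line left at the start.
        cat "$@" | tr -cs 'A-Za-z' '\n' | sed '/^$/d'
    fi
}

# Lower-case the words, keep one copy of each, and count them.
uniqwords() {
    split_words "$@" | dd conv=lcase 2>/dev/null | sort -u | wc -l
}

# Run only when file names were given on the command line.
if [ $# -gt 0 ]; then
    uniqwords "$@"
fi
```

Saved as "uniqwords" and made executable with "chmod +x uniqwords", it would
be run as "uniqwords chapter1 chapter2" (file names made up) and would print
a single count covering all the files together.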
-- 
Jan Wolitzky, AT&T Bell Labs, Murray Hill, NJ; 201 582-2998; mhuxd!wolit
(Affiliation given for identification purposes only)


