Unique Word Counter Needed

Stanley Friesen friesen at psivax.UUCP
Fri Dec 13 04:47:10 AEST 1985


In article <3699 at mhuxd.UUCP> wolit at mhuxd.UUCP (Jan Wolitzky) writes:
>> I need a way to count unique words in a document.
>> Does any one have suggestions on a simple way to do this?
>
>Try:
>
>deroff -w filename | dd conv=lcase 2>/dev/null | sort -u | wc -l
>
	This looks quite inefficient. tr will do the case conversion
much more efficiently than dd, and it can also split the file into
one-word lines. So try:

tr 'A-Z\011 ' 'a-z\012\012' < filename | sort -u | wc -l

or

deroff -w filename | tr 'A-Z' 'a-z' | sort -u | wc -l

depending on whether you wish to remove nroff macros or not.
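
One caveat with the plain tr version: punctuation stays attached to
words, and runs of blanks produce empty lines that sort -u counts as one
extra "word". A minimal sketch of a variant that avoids both, assuming a
tr that supports the -c (complement) and -s (squeeze) options and a
document that is plain ASCII text. The function name count_unique_words
is only illustrative:

```shell
# Hypothetical helper: print the number of distinct words in file "$1".
# Assumes tr supports -c and -s, and that words are runs of ASCII letters.
count_unique_words() {
    # 1. Turn every run of non-letters into a single newline (one word per line).
    # 2. Fold upper case to lower case so "The" and "the" match.
    # 3. Drop duplicates, then count only the non-empty lines.
    tr -cs 'A-Za-z' '\n' < "$1" |
        tr 'A-Z' 'a-z' |
        sort -u |
        grep -c .
}
```

Call it as "count_unique_words filename". To strip nroff macros first,
feed it the output of deroff instead of the raw file.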
-- 

				Sarima (Stanley Friesen)

UUCP: {ttidca|ihnp4|sdcrdcf|quad1|nrcvax|bellcore|logico}!psivax!friesen
ARPA: ttidca!psivax!friesen at rand-unix.arpa


