What Data Structure?

Robert Burgin burgin at ecsvax.uncecs.edu
Mon Jun 24 21:03:00 AEST 1991


I am writing a program that includes a dictionary look-up
of 24,500 English words.  Each entry in the dictionary will
contain the word itself and a string representing valid
syntactic tags for that word (noun, verb, etc.).

Would a hash table be best for the implementation?  Or
a trie?  Why?

The trie would seem to be ideal for short words but bad
for longer words.  Since the most frequent words in English
are rather short, perhaps that should be a consideration
in favor of the trie.  On the other hand, tries are
notoriously memory-intensive.

I worry about collisions in the hash table and wonder
what the ideal number of buckets would be for this case.
Not to mention having to scare up a proper hash function
for strings of variable length.

Any thoughts would be appreciated.

--rb



More information about the Comp.lang.c mailing list