Problems with ndbm

Jim Frost madd at bu-cs.BU.EDU
Sun Aug 13 07:01:48 AEST 1989


In article <19044 at mimsy.UUCP> chris at mimsy.UUCP (Chris Torek) writes:
|All strings that hash to the same 32-bit value wind up in the same
|block.  It sounds as though you had strings that defeated the hash
|function.  This is not supposed to happen, but you could fiddle with
|ndbm_calchash() and see if you could get `more unique' values.

I'd be interested in finding out what the hash algorithm is.  We were
using keys something like:

0001diagram3244.a

The first 4 digits were used because records could be larger than the
4096-byte limit, so we chained them together by serializing a portion
of the key; the remainder identified the complete record.  Since
records could be more-or-less arbitrarily long, we tried to use larger
data sizes.  Our first try was a 2048-byte data portion, which kept
each key/data pair well under the limit.  Insertion failures happened
very quickly (within a few hundred insertions) even when the
serialized portion of the key wasn't being used very much.  Dropping
the size of the data portion helped a lot; it worked most of the time
when the data was less than 128 bytes long.
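
Roughly, the insertion loop looked something like the sketch below.
The names (store_record, CHUNK) are invented here for illustration;
this isn't the actual code, just the shape of it:

    #include <ndbm.h>
    #include <stdio.h>
    #include <string.h>

    #define CHUNK 2048      /* data bytes stored per ndbm entry */

    /*
     * Store one logical record under "base" by splitting it into
     * CHUNK-sized pieces, each keyed by a 4-digit serial number
     * prepended to the base key ("0000diagram3244.a", "0001...", etc.).
     */
    int store_record(DBM *db, char *base, char *data, int len)
    {
        char keybuf[256];       /* base assumed short enough to fit */
        datum key, content;
        int off, serial;

        for (off = 0, serial = 0; off < len; off += CHUNK, serial++) {
            (void) sprintf(keybuf, "%04d%s", serial, base);
            key.dptr = keybuf;
            key.dsize = strlen(keybuf);
            content.dptr = data + off;
            content.dsize = (len - off < CHUNK) ? len - off : CHUNK;
            if (dbm_store(db, key, content, DBM_REPLACE) < 0)
                return -1;      /* where the failures showed up */
        }
        return 0;
    }

Each piece stays under the size limit on its own, but all the keys for
one record differ only in the first four characters, which may be
exactly the sort of thing that defeats the hash.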

While I admit that we were abusing the library (I was on a real tight
schedule at the time or I would have done it differently at the
outset), the man page didn't even imply that this kind of thing might
fail, much less fail regularly.  Our solution was to use ndbm only to
map each key to the page number of the record's first page in a
separate data page file, which worked effectively and was certainly
more in line with the way ndbm should be used :-).
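
In case it's not obvious what I mean, here is the general shape of it,
with names again made up for illustration: the ndbm data portion is
just a page number, and the record itself lives in an ordinary file of
fixed-size pages.

    #include <ndbm.h>
    #include <stdio.h>
    #include <string.h>

    #define PAGESIZE 1024   /* size of a page in the data page file */

    /*
     * Look the key up in the ndbm index, then read the first page of
     * the record out of the data page file.  Since the ndbm data
     * portion is just a long, it stays nowhere near the size limit.
     * "buf" must be at least PAGESIZE bytes.
     */
    long fetch_first_page(DBM *db, FILE *pagefile, char *base, char *buf)
    {
        datum key, val;
        long pageno;

        key.dptr = base;
        key.dsize = strlen(base);
        val = dbm_fetch(db, key);
        if (val.dptr == NULL)
            return -1;              /* no such record */

        memcpy(&pageno, val.dptr, sizeof(pageno));
        if (fseek(pagefile, pageno * PAGESIZE, SEEK_SET) != 0)
            return -1;
        if (fread(buf, 1, PAGESIZE, pagefile) == 0)
            return -1;
        return pageno;
    }

Subsequent pages of a long record can live in the data file itself, so
the ndbm entries never grow.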

I still have a lot of questions regarding ndbm, most of which I could
answer for myself if I had any way of looking at the code.

jim frost
software tool & die
madd at std.com


