memory allocation

T. William Wells bill at proxftl.UUCP
Sun Sep 18 20:16:44 AEST 1988


In article <33422 at cca.CCA.COM> g-rh at XAIT.Xerox.COM (Richard Harter) writes:
: In article <778 at proxftl.UUCP> bill at proxftl.UUCP (T. William Wells) writes:
: >In article <33184 at cca.CCA.COM> g-rh at CCA.CCA.COM (Richard Harter) writes:
:
: >: Writing a fast "best-fit" allocator is an interesting exercise in
: >: programming.
:
: >Yes.  An exact- + worst-fit (a close relative of best-fit)
: >allocator that I wrote as part of a real time kernel turned out
: >to be one of the largest sections of the kernel; it was easily
: >the most complicated.
:
: It shouldn't be all that bad -- the best fit allocator I am using
: currently runs about 200 lines of C, excluding comments.    That includes
: both the allocate and the free routine.  However it doesn't have any
: tuning code or any special code for handling small blocks.

Well, mine did have that kind of code; it also had grunge for
handling buffer pools and a few hacks to make it fit with the
rest of the kernel.  Anyway, 200 lines of C code, translated into
assembler, would have been a lot of code compared to any of the
other parts of the kernel.

: >Two disadvantages come to mind for best-fit.  The first is the
: >memory overhead.  If you keep track of the block size and make
: >your minimum block size be two pointers, you can keep the
: >overhead per allocated block to a single bit.  Of course, there
: >is also the overhead of allocated but unused space; this is often
: >small compared to a word per block, especially if the typical
: >allocation amount is small.
:
: I don't understand this comment.  If you mean to say that you can get
: by with a single bit to mark whether a block is allocated or not, sure.
: But the block size word will (on a ptr per word machine) take the same
: space as a ptr.
:
: In any case, yes, there is more memory overhead for a best-fit allocator.
: Personally I think this is a moot point.  The losses for fragmentation
: and oversize allocation are much more significant unless you have a heavy
: load of short requests.  However small block requests are best handled by
: having dedicated blocks for the small blocks.

Let me amplify, then.  Most allocation schemes make use of a word
of data for each block.  However, with the buddy system, if you
are able to pass the size of the block to the free routine, you
can get away with one bit of information in each allocated
block.  Generally, this size is either known at compile time or
is easily computable.  And the bit can usually be buried in the
data structure without costing anything.

In comparing the two systems, assuming one has a specific
application in mind, one should count the word of overhead per
block plus the fragmentation for the usual systems, vs.  the
wasted space plus the fragmentation for the buddy system.

There are also a few minor points favoring the buddy system; one
is that it can do a much better job of reducing fragmentation
(though your mileage almost certainly will vary :-).  Another is
that, if you know that you have a buddy system allocator, you can
sometimes choose which structures things go in so that the
structure sizes are the appropriate ones for your allocator.

Another point, which I haven't given much thought to, but I thought
I'd toss out to see what people think, is that a buddy system
allocator can go very well with a paged memory system; wasted
space at the end of small blocks is irrelevant and wasted space
at the end of large blocks wastes address space but does not
waste much working set memory.

: One point which I feel strongly about is that most allocators use the
: blocks (free and allocated) to hold control information.  I think that
: this is a mistake.  I put all control information in structures in a
: separate node space.  This adds overhead, but it makes life a lot safer.
: Array indexing errors do not damage the allocator.  Free's can be checked
: for validity.

Depends on what you want.  If you want a bulletproof system, then
you should definitely segregate the control information.  If you
want a high-performance system, then you have to put the control
information in an unsafe place.  That's life.

: >The second disadvantage is fragmentation: exact-fit + worst fit
: >is supposed to create less fragmentation than best-fit.  I have
: >not done the experiment, though.
:
: I have -- I ran an extensive series of tests on allocator strategies a
: number of years ago.  What it comes down to is that you can get the results
: you want by picking your assumptions.  Exact fit (whether EF+WF) is important;
: it defeats the 2/3 rule.  Given that, different size distributions and
: different allocate/free patterns give different results.

Yeah, and that really is the bottom line: you have to pick the
best allocator based on an analysis and evaluation of the
specific task to be performed.  That makes life difficult for
those writing general purpose allocators: they have no idea what
the allocation patterns are likely to be.  That suggests that one
in that position should find allocators with reasonable
worst-case performance, and then pick the fastest or the simplest
of those schemes.

---
Bill
novavax!proxftl!bill


