malloc (was: making a request to IBM)

Dennis Ferguson dennis at gpu.utcs.utoronto.ca
Sun Apr 14 13:07:48 AEST 1991


In article <6644 at awdprime.UUCP> mbrown at testsys.austin.ibm.com (Mark Brown) writes:
>| The problem:  as you all remember,  malloc()  returns  NULL  only
>| when the process exceeds its datasize limit.  If malloc returns a
>| non-null pointer, the memory  may  turn  out  to  be  exceedingly
>| virtual:   there  won't  be any paging space behind it.  AIX runs
>| out of paging space when the process actually  uses  the  memory.
>| Various  processes  die.   In Info, see `List of Books', `General
>| Concepts  and  Procedures',  scroll  ~1/3  down,  `Paging   Space
>| Overview'.  See also psmalloc.c in /usr/lpp/bos/samples.  Etc etc
>| etc.
>| 
>| Personally, I think it's a bug.  If  there  is  no  memory  left,
>| malloc  should  return  a  NULL.  IBM says it's a feature,  catch
>| SIGDANGER if you don't like it.
>
>Yeah, I've heard complaints (and roses) on this one.
>The Rationale: Rather than panic the machine, we'd like for it to keep
>running as long as possible. Hence, we try to keep running at all costs,
>including doing things like this. So, when we do get close to the limit,
>we send a warning, than as we go over we start killing the biggest memory
>users. (Warning - this processes involved have been overly simplified).
>
>The Idea was to make the machine 'more reliable'. Our research led us
>to believe that many processes allocated more memory than actually used in
>page space (I think) and we used this knowledge. Understandably, many
>UNIX users either a) want the machine to panic, "like UNIX does"; or
>b) hate our algorithm for killing jobs. I also think we don't advertise/
>document the process involved enough to make it useful to users.
>
>So, do we go back to blowing up processes that allocate too much memory,
>even though that memory may actually be there by the time the process
>actually uses it? Do we go back to 'panic' when page space fills? There are
>reasonable arguments for doing this...

I'm old enough to have used vanilla Version 7 Unix when PDP-11s were in
vogue, and to be brutally frank the only Unix I can remember using
which panic'd when it ran out of memory was an early AIX on an RT, a system
which I hardly think qualifies as The Definitive Unix.  The behaviour of
AIX is, from the user's perspective, a whole lot like the behaviour of
vanilla System V Unix, which also kills off random processes when it runs
out of memory (or used to, at least, I haven't paid attention much
recently).  The only IBM value-added bit in this is the signal (to be
fair, I do understand that the backing store allocation policy is
different internally than System V, and is actually more conservative.
Looks pretty similar from the user's perspective, though).  BSD Unix
doesn't (a) panic, or (b) kill processes.  I suspect what the users who
are complaining want is (c) malloc() to return NULL when the machine runs
out of memory, without panicking and without random processes being killed
(it is actually easier to do it this way than to do either what System V
or what AIX does).
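
For anyone who hasn't looked at the signal, catching it goes roughly like
this.  This is only a sketch: danger_seen, on_danger and
install_danger_handler are names I made up for illustration, and the
#ifdef is there so the same source still compiles on systems which have
no SIGDANGER at all.  What a program should actually do once the flag is
set (free caches, checkpoint, exit cleanly) is up to the program.

```c
#include <signal.h>
#include <stddef.h>

/* Set by the handler, polled by the main loop.  (Hypothetical name.) */
static volatile sig_atomic_t danger_seen = 0;

#ifdef SIGDANGER
/* Do nothing here except record that the warning arrived. */
static void on_danger(int sig)
{
    (void)sig;
    danger_seen = 1;
}
#endif

/*
 * Returns 1 if a SIGDANGER handler was installed, 0 if this platform
 * does not define SIGDANGER, and -1 if sigaction() failed.
 */
int install_danger_handler(void)
{
#ifdef SIGDANGER
    struct sigaction sa;

    sa.sa_handler = on_danger;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGDANGER, &sa, NULL) == 0 ? 1 : -1;
#else
    return 0;
#endif
}
```

The main loop would call install_danger_handler() once at startup and
check danger_seen at convenient points.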

Better to explain more exactly why AIX does what it does.  It's so that
vendors who want to sell crufty old Fortran programs, which have no way
to do dynamic memory allocation, can ship binaries with huge static arrays
compiled in for people who want to solve big problems, and still have
the same binaries run on small machines to solve small problems.  To
implement this you don't allocate backing store until a page is touched,
which means malloc() can't return NULL since it can't, in general, know
if the Fortran program running at the same time is actually going to
use its pages or not.
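
To make the "until a page is touched" part concrete, here is a small
sketch of a program separating the allocation from the first touch.  The
names touch_pages and alloc_and_commit are hypothetical, and the page
size is passed in by the caller rather than looked up, which is an
assumption made to keep the example short.  Note what this does not buy
you: on a system which allocates backing store lazily, running out
during the touch loop gets the process killed, it does not produce an
error return.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Write one byte into every page of buf so the system has to find
 * backing store for each of them.  Returns the number of pages touched.
 */
size_t touch_pages(void *buf, size_t len, size_t pagesize)
{
    volatile unsigned char *p = buf;
    size_t off, touched = 0;

    assert(pagesize > 0);
    for (off = 0; off < len; off += pagesize) {
        p[off] = 0;
        touched++;
    }
    return touched;
}

/*
 * Allocate len bytes and touch every page before handing them out.
 * A NULL return only means malloc() itself refused the request.
 */
void *alloc_and_commit(size_t len, size_t pagesize)
{
    void *buf = malloc(len);

    if (buf == NULL)
        return NULL;
    touch_pages(buf, len, pagesize);
    return buf;
}
```

The crufty Fortran binary is the case where nobody ever runs the touch
loop over most of the array, which is exactly why nothing is reserved
for it up front.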

You should understand, however, that killing off processes isn't the
"real" problem.  People have used System V machines which do this
for years without complaining because, on your typical Unix box being
put to typical uses, running out of memory/page space is a rare
occurrence.  On an AIX machine, however, with its humungous kernel
and things like the compiler and loader which consume prodigious
amounts of memory when running, running out of memory can be a
daily occurrence.  People don't complain about System V because they
never find out what happens when memory runs out.  With AIX, however,
your average user ends up painfully aware of how the system behaves
when memory is used up, and so he complains.

The real bug is that AIX is a memory pig.  It would be useful to
fix this one.

Dennis Ferguson
University of Toronto


