Bug in Motorola SysV Rev II Bourne Shell

Tom Murdock tom at enmasse.UUCP
Fri May 16 03:18:14 AEST 1986


	Has anyone out there had a problem with the Motorola System V 
version 2 shell (/bin/sh) giving 

	Memory fault - core dumped 

messages when trying to run the :mkcmd script to make certain Unix utilities?
It seems likely that other large recursive shell scripts might also run into 
the same problem (I think I have seen it with lorder).  Running with a large 
environment may help reproduce this problem.  I would be interested if anyone 
else has seen this bug because I am trying to determine whether the problem 
is a generic sh bug as I suspect or just one specific to our hardware or
System V implementation.

	The symptoms seem to be that 
	1)  The shell's stack is in use when it tries to get more memory with 
the addblok() routine in blok.c.
	2)  The value used to set the brk() limit is a random value from
the new memory.
	3)  If this value is some large garbage value (sometimes it is 0 or a
small number), the sbrk() will fail, but because there is no error checking
the shell thinks it has successfully added a large block of memory.
	4)  The shell will then get a memory fault if it tries to access
memory beyond its real limit.  If it always stays within its real limit, 
it will run fine and the error goes unnoticed.
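
To make symptoms 2) through 4) concrete, here is a toy model in C.  It is 
not the code from blok.c: fake_sbrk(), REAL_LIMIT, grow_unchecked() and 
grow_checked() are all names I made up for illustration, and the "break" is 
just a number rather than real memory.  It only shows how ignoring a failed 
sbrk() leaves the caller believing in a limit it never got.

```c
#include <assert.h>

#define REAL_LIMIT 4096L        /* assumed: the most the "kernel" will grant */

static long real_brk = 1024;    /* break the "kernel" has actually granted */

/* Stand-in for sbrk(): returns the old break, or -1 if the request
 * would move the break outside 0..REAL_LIMIT. */
static long fake_sbrk(long incr)
{
    long old = real_brk;

    assert(real_brk >= 0 && real_brk <= REAL_LIMIT);
    if (incr > REAL_LIMIT - real_brk || incr < -real_brk)
        return -1;
    real_brk += incr;
    return old;
}

/* Unchecked growth: the return value is ignored, so the caller's idea
 * of the break becomes `target` whether or not the call worked. */
static long grow_unchecked(long believed, long target)
{
    (void)fake_sbrk(target - believed);
    return target;
}

/* Checked growth: the believed break only advances on success. */
static long grow_checked(long believed, long target)
{
    if (target <= believed)
        return believed;        /* nothing to add, e.g. a garbage value of 0 */
    if (fake_sbrk(target - believed) == -1)
        return believed;        /* request refused: keep the old limit */
    return target;
}
```

With a garbage target of 1000000, grow_unchecked() reports 1000000 while 
real_brk stays at 1024, which is the gap that would later turn into a 
memory fault.  grow_checked() keeps the believed limit at 1024 in the same 
situation.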

Putting debugging statements in our version 1 shell also indicates that it
occasionally uses the garbage value(s) and thinks its sbrk limit is far
higher than it is, although it doesn't seem to go past its real limit, and
so hit the error, as often as the new version.  

Another possible symptom of this problem is that
we have occasionally seen runaway shells that have acquired huge amounts
of memory.  It seems feasible that they found a garbage value that was large
but within the system's limit, and they ran for a while holding a huge
chunk of memory, causing massive swapping activity, etc.

The related code is in blok.c.  It seems that when the stack is in use,
a pointer is set up beyond the stack.  This pointer is later used by
all paths of the code, presuming that the value at its address is the address 
that the shell should own up to (i.e. should be sbrk()'ed to).  My theory
is that this value is not set up when the pointer is obtained from beyond
the stack, and the correct fix is to simply make sure it is set up or to
just skip the sbrk() in this case.  This fix seems to correct the problem,
although I understand too little of what this code is really trying to do
to be sure that I have not broken something else.  If someone has a good
idea of how this routine is supposed to work, I would like some feedback
on whether this is the correct fix.
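
For anyone who wants the idea of the fix in code form, here is a minimal 
sketch of the "make sure it is set up" variant.  It is an illustration 
under my theory above, not an excerpt from blok.c: struct hdr, its limit 
field, limit_unfixed() and limit_fixed() are hypothetical names, and 
known_end stands for whatever address the caller already knows it needs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical header placed just past the in-use stack; `limit` is
 * the address the shell should own up to. */
struct hdr {
    uintptr_t limit;
};

/* Unfixed: trust whatever already sits in the header, which may be
 * leftover garbage if nobody wrote it. */
static uintptr_t limit_unfixed(const struct hdr *h)
{
    assert(h != NULL);
    return h->limit;
}

/* Fixed: write a known-good limit into the header before reading it. */
static uintptr_t limit_fixed(struct hdr *h, uintptr_t known_end)
{
    assert(h != NULL);
    h->limit = known_end;
    return h->limit;
}
```

The other variant, skipping the sbrk() when the header came from beyond the 
stack, would avoid reading the field at all in that case.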


