Unix Error Messages at Crash Time

Griff Smith ggs at ulysses.UUCP
Thu Dec 22 13:46:28 AEST 1983


With regard to the following:

    >Is there anyone out there who knows what Unix error messages at crash time
    >mean? I am talking about the ones not explained in section 8 of volume 1.
    >Messages like "panic: mba, zero entry", "unit 0: random interrupt", or 
    >"machine check". 

I suppose a direct reply would have been more appropriate, but with a path
like "...!sri-unix!ben%brandeis at csnet-relay" a mail response wouldn't stand
a snowball's chance in Hell of getting there.

"panic: mba, zero entry" happens under 4.1BSD and 4.2BSD when you read a
mag tape that has a hard read error.  It is caused by some brain damage in
mt.c that makes it assume that mba.c knows how to "read backwards".  When
mt.c gets the "read opposite" status from the tape controller, it passes
a "read backwards" request to mba.c, along with the buffer address and
buffer size.  Since this is "read backwards", mba.c is supposed to map
the pages of the buffer into the mba address space and then set the
initial input address to be the end of the buffer.  Unfortunately, it
leaves the starting address unchanged.  Tape input therefore starts at the
beginning of the buffer and runs backwards, overwriting any innocent static
or stack variables in front of the buffer until it reaches the beginning of
the page, then falls off the end of the world.  If you are lucky, your
process then aborts with a
strange error message resulting from using the text in those variables
as binary numbers.  If you are unlucky, the kernel is deranged and panics
when it tries to use a bent table.  As far as I can tell, you get the
panic if the input buffer is smaller than the input block and you get the
mangled static area if the buffer is larger than the input block.
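
The addressing mistake described above can be sketched in user space.
This is only an illustration, not the actual mba.c code: the toy memory
array, the byte-at-a-time transfer, and every function name below are
assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MEM_SIZE 64	/* toy "address space"; the size is arbitrary */

/* Zero the toy memory so that overwritten bytes are easy to spot. */
void clear_memory(unsigned char *mem)
{
	memset(mem, 0, MEM_SIZE);
}

/* A reverse transfer must begin at the last byte of the buffer. */
size_t reverse_start_correct(size_t buf_off, size_t buf_len)
{
	return buf_off + buf_len - 1;
}

/* The mistake described above: the starting address is left unchanged. */
size_t reverse_start_buggy(size_t buf_off, size_t buf_len)
{
	(void)buf_len;
	return buf_off;
}

/*
 * Store `count` bytes of `fill` at descending addresses beginning at
 * `start`.  Stops early at address 0, which stands in for running off
 * the beginning of the page.  Returns the number of bytes stored.
 */
size_t reverse_transfer(unsigned char *mem, size_t start, size_t count,
			unsigned char fill)
{
	size_t done = 0;
	size_t addr = start;

	assert(start < MEM_SIZE);
	while (done < count) {
		mem[addr] = fill;
		done++;
		if (addr == 0)
			break;
		addr--;
	}
	return done;
}

/* Count bytes equal to `fill` that lie outside the buffer. */
size_t clobbered_outside(const unsigned char *mem, size_t buf_off,
			 size_t buf_len, unsigned char fill)
{
	size_t i, n = 0;

	for (i = 0; i < MEM_SIZE; i++)
		if ((i < buf_off || i >= buf_off + buf_len) && mem[i] == fill)
			n++;
	return n;
}
```

With an 8-byte buffer at offset 16 and an 8-byte block, the correct
starting address touches nothing outside the buffer, while the unchanged
starting address overwrites the 7 bytes in front of it (offsets 9
through 15).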

"unit 0: random interrupt" should be "unit 0: non-data transfer error
interrupt, error status = xxxxxx".  I changed my mt.c to print something
like that, and found that the error status code is usually 32 (base 8).
My DEC tape controller manual says this means "TM fault B", otherwise
known as "I am broken, please fix me".  The error code in the LED display
inside the TM front panel gives further help to the DEC CE that you call
in when this happens.
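
A more informative message of the kind suggested above might be formatted
roughly as follows.  This is a hedged sketch, not the change actually made
to mt.c: the function name, its parameters, and the constant are
hypothetical, and a kernel driver would use the kernel's own printf rather
than snprintf.

```c
#include <assert.h>
#include <stdio.h>

/* Status value reported above: 32 in base 8, which is 26 in decimal. */
#define STATUS_SEEN 032

/*
 * Format the message into `out`, printing the status in octal so that it
 * can be compared directly with the values in the controller manual.
 * Returns what snprintf returns: the length of the full message.
 */
int format_tape_error(char *out, size_t outlen, int unit, unsigned status)
{
	assert(out != NULL && outlen > 0);
	return snprintf(out, outlen,
	    "unit %d: non-data transfer error interrupt, error status = %o",
	    unit, status);
}
```

Calling format_tape_error(buf, sizeof buf, 0, STATUS_SEEN) yields a
message ending in "error status = 32".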

I intend to fix these problems soon, unless someone posts reasonable
solutions and saves me the trouble.  Whether the fixes can escape the
proprietary black hole of AT&T Bell Laboratories is another matter.
-- 

Griff Smith	AT&T Bell Laboratories, Murray Hill
Phone:		(201) 582-7736
Internet:	ggs at ulysses.uucp
UUCP:		ulysses!ggs


