IBM PS2M80 vs. SCO XENIX

GEO%LOYVAX.BITNET at cunyvm.cuny.edu
Sat Mar 25 01:54:12 AEST 1989


This may be of interest to those caught between hardware and
software vendors.  The context is: HW -- IBM PS/2 mod 80, 70mb
drive, ESDI controller, 4mb memory; SW -- SCO Xenix 2.2.6.

On start-up one morning, the memory check halted at 1662 with
error messages for configuration change and unset clock/calendar.
The reference disk showed 4096k available, 1662 usable, but 0 ESDI
controllers available.  Autoconfigured and reset clock/calendar.

On restart, the Xenix hard disk boot proceeded up to the point at
which the hard disk configuration is established.  At this point
the machine hung with the hard disk activity light on and the
message:

panic: memory failure - parity error

Xenix would boot fine from a floppy, but attempts to mount the
hard disk gave the same lock-up, error message, and, as a little
bonus, would also trash the boot floppy.

After a few days of weeping and teeth-gnashing, IBM replaced just
about everything in the machine, including memory and system
board.  The disk was given a low-level format.  DOS booted and ran
fine.  Overnight looping through all diagnostics for over 14
hours produced no errors.  The IBM CE support desk in Atlanta said
they had heard of "6 or 8" such situations.  I was assured it was
a software problem.

SCO assured me that my error message is "one of the more
straight-forward ones."  The tech said that if my machine were AT-class,
he'd be certain it was a memory failure; since the machine's a
PS/2-80, he was less certain -- said this machine can give the message
for other, unknown reasons.  In any event, it was definitely a
hardware problem, he said.

So there I was, caught between IBM & SCO with no leverage at all!
To shorten the story, a day or so later, a more knowledgeable SCO
tech (Tracy by name) said the problem was probably due to DMA
contention.  She suggested 3 possibilities: (1) a low-level
format of the hard disk that had somehow gone bad, (2) a DMA
arbitration conflict between the hard disk and something else, and
(3) an improper DMA arbitration level set by the autoconfigure routine.

Of the three, the third was the most likely one.  Tracy didn't
know what the proper DMA arb level was, so I cycled through them
all (0-7).  The autoconfigure had established level 6.  Turns out
my configuration works ONLY on level 5.  Once I reset DMA arb to 5,
reloaded from backup, re-created device drivers, etc., etc., why,
it was as if nothing had ever happened! :-)

The thing is, neither SCO nor IBM knew to rule out DMA arb
problems until quite late in the game.  The moral: keep backups,
certainly.  It also convinced me that a service contract -- at
some $260/year -- is preferable to paying for a new system board
at $190/hour for labor and $2160 for parts.  Didn't have to do
that this time, but it was just too close a call.
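
The cost comparison above can be worked through as a rough sketch.
The dollar figures are the ones quoted in the post; the number of
labor hours is an assumption, since the post does not say how long a
system board replacement takes, and the function names are made up
for illustration.

```python
# Figures quoted in the post (US dollars, 1989).
CONTRACT_PER_YEAR = 260
LABOR_PER_HOUR = 190
PARTS_SYSTEM_BOARD = 2160


def repair_cost(labor_hours):
    """Out-of-contract cost of one system board replacement."""
    return PARTS_SYSTEM_BOARD + LABOR_PER_HOUR * labor_hours


def contract_years_covered(labor_hours):
    """Years of service contract the same money would buy."""
    return repair_cost(labor_hours) / CONTRACT_PER_YEAR


# labor_hours is an assumption; the post gives only the hourly rate.
for hours in (1, 2, 4):
    cost = repair_cost(hours)
    years = contract_years_covered(hours)
    print(f"{hours} h labor: ${cost} = {years:.1f} years of contract")
```

Even with a single hour of labor assumed, one uncovered board
replacement costs about as much as nine years of the contract.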


George Wright, GEO at LOYVAX.BITNET, Loyola College, Baltimore MD



More information about the Comp.unix.wizards mailing list