Has anyone running 4.2BSD had similar problems?

guy%ucla-locus at sri-unix.UUCP
Tue Nov 22 15:20:20 AEST 1983


From:  Richard Guy <guy at ucla-locus>

This is a 4.1 tale, but I suspect it relates well to your 4.2 problem,
which I attribute to the swap code being overly sensitive to disk errors:

Here at UCLA we're running some dozen 750's with a variant of 4.1bsd.
Each system has a Fujitsu disk (160Mb or 450Mb), plus the inevitable RK07
disk.  (They were inevitable when we got the systems a year ago.)  We avoid
using the RK07's whenever possible, since the controller doesn't buffer data
very well while waiting to grab the unibus.  To attempt to deal with the
problem, we recabled all the systems so that the RK07 is at the
physical/logical front of the bus--which means the bus is 20' longer now, sigh.
(This helps because devices at the 'front' of the unibus have a slight edge
over devices 'farther' away when it comes to bus arbitration.)

We finally ran out of swap space on the Fuji's, so we added two more swap
partitions on the RK07's.  To save time/effort, we enabled both partitions
for each system, all in the same day.  Within a week, each of our systems
was crashing at least once a day with 'panic: hard i/o error in swap'.  Turns
out the RK07 just can't seem to deliver the goods when it has multiple swap
partitions on the same spindle.  We backed off to using only one RK07 partition,
and our problems have been gone for 5 months now.  A better solution would have
been to beef up the swap code so that it retries the transfer at least once
before panicking.
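
For what it's worth, the retry idea can be sketched in a few lines of C.  This
is only an illustration, not the 4.1bsd swap code: read_fn, read_with_retry,
and the simulated flaky_dev are all made-up names, and a real driver would go
through the kernel's buffer and strategy routines rather than a function
pointer like this.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical one-shot block read: returns 0 on success, nonzero on a
 * hard I/O error.  This is NOT the real 4.1bsd driver interface. */
typedef int (*read_fn)(void *dev, long blkno, void *buf, size_t len);

/* Try the read up to max_tries times.  Returns 0 on the first success,
 * otherwise the error code from the last attempt. */
int read_with_retry(read_fn rd, void *dev, long blkno,
                    void *buf, size_t len, int max_tries)
{
    int err = -1;

    assert(rd != NULL && max_tries > 0);
    for (int i = 0; i < max_tries; i++) {
        err = rd(dev, blkno, buf, len);
        if (err == 0)
            return 0;           /* got the data, stop retrying */
    }
    return err;                 /* caller decides whether to panic */
}

/* Simulated device for illustration: fails failures_left times, then
 * succeeds.  calls counts how many reads were attempted. */
struct flaky_dev {
    int failures_left;
    int calls;
};

int flaky_read(void *dev, long blkno, void *buf, size_t len)
{
    struct flaky_dev *d = dev;

    (void)blkno; (void)buf; (void)len;   /* unused in the simulation */
    d->calls++;
    if (d->failures_left > 0) {
        d->failures_left--;
        return 5;               /* arbitrary nonzero "I/O error" code */
    }
    return 0;
}
```

With max_tries set to 2, a device that fails once and then delivers the data
gets through on the second attempt; one that keeps failing still returns an
error, and only then would the caller fall back to the panic.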


A question for those running a lot of RK07's:  How have they worked out for you?
Our experience has been a minor disaster.  The basic problem is described above;
the others have to do with pack unreliability--'DC' packs fall apart after three
months, so we replaced most of them with 'EF' (error-free) ones;  they fall apart
too, but it takes six months.  ('Fall apart' means new bad sectors start appearing
once a week or more--real bad news if you're using it as a boot device!)  On the
positive side, DEC has been reasonably responsive about replacing the packs
(all under maintenance, of course).  In summary, if we don't use the things, they
don't break (very often).  As soon as they get any significant usage...they die.

richard



More information about the Comp.unix.wizards mailing list