Weird disk problems - Xylogics 753/Fuji M2382

davy at riacs.edu
Fri Sep 1 02:09:34 AEST 1989


Hi.  We have a Sun 3/180 with two Xylogics 451 controllers and a Xylogics
753 controller.  One of the 451's has a Fuji M2351 Eagle and two CDC
9720-500 disks on it.  The other 451 has two CDC 9720-500s.  The 753 has
two Fuji M2382s on it.

Everything worked fine when we first set this up.  Then, we had a spare 451
controller which we weren't sure was okay.  We swapped it into the system,
verified it worked, and then swapped it back out, restoring the original
system.  Unfortunately, during this exercise, we broke the pin off one of
the VME/Multibus adapters, and in the process of fixing all this, might
have flipped one of the dip switches on the adapter or the controller.

Now we are seeing these messages on the two Fuji M2382 drives on the 753
controller:

xd1c: read retry (operation timeout) -- blk #64, abs blk #64
xd1c: read retry (operation timeout) -- blk #70512, abs blk #70512
xd0c: read restore (drive not on cylinder) -- blk #1400528, abs blk #1400528
xd0c: read restore (drive not on cylinder) -- blk #1540976, abs blk #1540976
xd1c: read restore (drive not on cylinder) -- blk #1400496, abs blk #1400496
xd1c: write restore (drive not on cylinder) -- blk #1400800, abs blk #1400800
xd1c: read retry (operation timeout) -- blk #70464, abs blk #70464

The messages seem to be more or less equally distributed between the two
drives (seems to depend on how much the drive is being used at the time),
and the block numbers vary a lot.  In general things seem to be okay when
only one drive is being used, but problems occur when both are in use.
(This was determined by seeing the messages when both drives were fsck'd
in the same pass, and not seeing the messages when each drive is fsck'd in
a different pass.)  Overall we get a lot of these messages, but we may go
for half an hour without getting any and then suddenly get five or six.
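
For anyone who wants to check the per-drive distribution on their own
system, here is a rough sketch of how messages in the format shown above
could be tallied by drive and by error reason.  It assumes the console
messages have been captured as plain text, one per line; the names MSG_RE,
tally and SAMPLE are made up for illustration and are not part of any Sun
tool.

```python
import re
from collections import Counter

# Matches console messages in the format quoted above, e.g.
#   xd1c: read retry (operation timeout) -- blk #64, abs blk #64
# The pattern is an assumption based only on those sample lines.
MSG_RE = re.compile(
    r"^(?P<dev>xd\d+)(?P<part>[a-h]):\s+"
    r"(?P<op>read|write)\s+(?P<action>\w+)\s+"
    r"\((?P<reason>[^)]*)\)\s+--\s+"
    r"blk #(?P<blk>\d+), abs blk #(?P<abs_blk>\d+)$"
)


def tally(lines):
    """Count messages per (drive, reason); lines that do not match are skipped."""
    counts = Counter()
    for line in lines:
        m = MSG_RE.match(line.strip())
        if m:
            counts[(m.group("dev"), m.group("reason"))] += 1
    return counts


# The seven messages quoted in this post.
SAMPLE = """\
xd1c: read retry (operation timeout) -- blk #64, abs blk #64
xd1c: read retry (operation timeout) -- blk #70512, abs blk #70512
xd0c: read restore (drive not on cylinder) -- blk #1400528, abs blk #1400528
xd0c: read restore (drive not on cylinder) -- blk #1540976, abs blk #1540976
xd1c: read restore (drive not on cylinder) -- blk #1400496, abs blk #1400496
xd1c: write restore (drive not on cylinder) -- blk #1400800, abs blk #1400800
xd1c: read retry (operation timeout) -- blk #70464, abs blk #70464
"""

if __name__ == "__main__":
    for (dev, reason), n in sorted(tally(SAMPLE.splitlines()).items()):
        print(dev, reason, n)
```

Run over a longer capture, the same counts could be compared between the
periods when one drive is busy and when both are.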

We have checked through all the drive manuals, controller manuals, Sun
manuals, etc.  I even got things looked up in the Sun field engineer's
manual by a friend.  As near as we can tell, everything is set up
properly, all the dip switches are in the right places, all the
controllers are in the same slots, etc.  We tried swapping 451s again.
Tried swapping 753s.  Tried changing drive cables.  Tried unplugging all
the drives except the Eagle (root drive) and the M2382s.  Tried pulling
out one 451 controller.  None of this seemed to make any difference.

So, the questions:

	1. Has anyone else seen this behavior?

	2. More importantly, does anyone know how to fix it?

Thanks in advance

Dave Curry
davy at riacs.edu
{rutgers,ames}!riacs!davy


