2.3.1 text corruption

Karl Denninger karl at ddsw1.MCS.COM
Thu Jun 8 03:50:21 AEST 1989


In article <1124 at jpusa1.UUCP> stu at jpusa1.chi.il.us (Stu Heiss,6312,6334,) writes:
>In article <133 at unifax.UUCP> sl at unifax.UUCP (Stuart Lynne) writes:
>-In article <26353 at lll-winken.LLNL.GOV> carlson at lll-winken.LLNL.GOV (Joe Carlson) writes:
>-}2.3.1.  Basically it appears that the in-core version of certain heavily
>-}used programs appears to get corrupted every once in a while. I believe that
>-}I have eliminated hardware trouble as the cause of this.
>-
>-I have also seen this problem on another system with a flakey swap area.
>-You might want to check that you don't have any bad blocks in your swap
>-area.
>
>I have also observed this but never considered the possibility of a disk
>problem.  I do recall some discussion about bad track remapping not
>working for the swap area.  Is this related or does anyone from sco have
>any further info?

I have checked into this, and it's not the problem.  If it were, I would
expect to see a disk error message preceding the problems -- that has
never occurred here.

We saw the problem too, but worse.  Not only would I get weird crashes from
some programs, but also TRAP IN SYSTEM MODE panics!  Moving a couple of
boards around seems to have fixed it.  If you have halfway flaky hardware,
watch out -- you'll get all kinds of weird problems, none of which your POST
or diags will catch!

I believe that the tape controller was interfering with the disk controller
-- since moving the tape controller to a slot away from the drive
controllers we haven't seen the problem recur....

Check your hardware -- carefully.  I'll keep the net posted if the gremlins
come back to 'ddsw1'..... So far we're two days and counting without a
problem under heavy load.  

All this started here when I added a second controller and third fixed disk,
and put the controller too close to the tape controller board (an Archive
controller... guess it's noisy or something).

The problem that appeared to be SCO not remapping bad sectors in the swap
area turned out to be a SECOND bad sector in the swap area!  We mapped that
one out too, and now all is ok in that regard -- no more fixed disk errors.

Btw: The second controller support works beautifully, and the system appears
     to multithread I/O requests with two boards in there (i.e., both disk
     access lights are on at the same time!!)  Nice job SCO!

--
Karl Denninger (karl at ddsw1.MCS.COM, <well-connected>!ddsw1!karl)
Public Access Data Line: [+1 312 566-8911], Voice: [+1 312 566-8910]
Macro Computer Solutions, Inc.		"Quality Solutions at a Fair Price"


