mchk 2 --- tbuf error on 750 running 4.2 BSD

Gene Spafford spaf at gatech.CSNET
Fri Jul 26 01:55:14 AEST 1985


In article <83 at zeta.UUCP> jeb at zeta.UUCP (John Berry) writes:
>
>We are running VAX 11/750's with UNIX 4.2 BSD. We have just had DEC
>install REV 7 of the L0003 board, which we hoped would clear up the
>mchk 2 --- tbuf error problems. Well it has not. Can anyone out there
>in network land give me any insight to what is happening. DEC cannot
>find any problems when they run diagnostics. 

This is an old and frustrating problem.  I've had it show up on at least
4 750's I've worked with.  The problem is, indeed, with the L0003
board.  Let me tell you how it has been explained to me (if anyone has
a more detailed explanation, please let us know).

DEC obtains chips for the L0003 board from a couple of different
sources.  I'm not sure if they subcontract the board out to another
firm or not, but they end up with two different versions of the board
which are identical in stated specs and (almost) identical in
appearance.  As far as acceptance goes, both versions of the board
behave identically under VMS and all the regular field service
diagnostics.

HOWEVER, under Unix, due to the way certain things are done and timed,
one version of the board will repeatedly generate tbuf parity faults
that cannot be recovered from.  The fix is to replace the board with a
copy of the other version.  Once we did that, the 750's in our lab,
which had been crashing an average of 10 times a day, encountered only
one tbuf fault in 6 months.  Getting a good board may require many swaps
and trials, because I have heard someone claim that you can't identify
one of the bad boards except by unsoldering chips and looking at the lot
numbers on the underside.

I don't know the specific chips or how to identify which version of the
board you have.  Supposedly, this problem is well known in the Ultrix
support group and some field service offices (along with the RA81
read/write board glitch and the Rev4/RL02 problem, and others) as one
of the strange problems that only shows up when using Unix.  Have your
field service people contact the Ultrix support group.  It is possible
that the Ultrix group may even know of a supply of working L0003 boards
for exactly this situation.

Best of luck!
-- 
Gene "4 months and counting" Spafford
The Clouds Project, School of ICS, Georgia Tech, Atlanta GA 30332
CSNet:	Spaf @ GATech		ARPA:	Spaf%GATech.CSNet @ CSNet-Relay.ARPA
uucp:	...!{akgua,allegra,hplabs,ihnp4,linus,seismo,ulysses}!gatech!spaf


