apparent bug in 4.1BSD DH/DM driver

utzoo!decvax!cca!wales at UCLA-Security
Fri Apr 23 11:52:40 AEST 1982


From: wales at UCLA-Security (Rich Wales)
There seems to be an error in the "dhprobe" routine of dev/dh.c in 4.1BSD
on the VAX.  (For any non-Berkeley UNIX people out there, "dhprobe" is part
of Berkeley's autoconfigure-at-boot-time code; its function is to force an
interrupt on a DH/DM to see if it exists and where its interrupt vector is.)

The driver uses the following sequence to force a receiver interrupt:

		dhaddr->un.dhcsr = DH_RIE|DH_MM|DH_RI;
		DELAY(25);
		dhaddr->un.dhcsr = 0;

Maintenance Mode (the DH_MM bit) is set so that Receiver Interrupt (DH_RI) can
be set.  (DH_MM takes effect slightly before DH_RI, so both bits can be
written in one instruction.)  Setting DH_RI and DH_RIE (Receiver Interrupt
Enable) together in this manner causes a receiver interrupt.

The problem occurs when the register is set to zero.  DH_MM gets reset before
DH_RI, and so by the time the DH/DM gets around to resetting DH_RI, it can't,
because it isn't in Maintenance Mode any more.  As a result, DH_RI remains set.

What's worse, since a DH_RI value set while in Maintenance Mode has its own
one-shot flip-flop, DH_RI will remain set forever!  This means that a receiver
interrupt will occur only when Receiver Interrupt Enable is explicitly toggled
by the driver!

My proposed fix for this problem is to insert a separate line into "dhprobe"
to turn off DH_RI before clearing DH_MM:

		dhaddr->un.dhcsr = DH_RIE|DH_MM|DH_RI;
		DELAY(25);
	#ifdef BUGFIX
		/* clear DH_RI while DH_MM is still set, so the clear takes effect */
		dhaddr->un.dhcsr &= ~DH_RI;
	#endif /* BUGFIX */
		dhaddr->un.dhcsr = 0;
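
The write-ordering behavior described above can be mimicked with a small toy
model in C.  This is only a sketch of the behavior as I have described it, not
the actual hardware logic: the function names (csr_write, probe_original,
probe_fixed) are made up for illustration, and the bit values are placeholders
that should be taken from the driver's own register header.

```c
/* Toy model of the DH/DM CSR write ordering described above.
 * Bit values are illustrative placeholders, not copied from the driver. */
#define DH_RIE 0x0040u	/* receiver interrupt enable */
#define DH_RI  0x0080u	/* receiver interrupt */
#define DH_MM  0x0200u	/* maintenance mode */

/* Apply one write to the modeled register and return its new contents. */
static unsigned csr_write(unsigned csr, unsigned value)
{
	unsigned next = csr;

	/* DH_MM and DH_RIE take their new values first. */
	next = (next & ~(DH_MM | DH_RIE)) | (value & (DH_MM | DH_RIE));

	/* DH_RI follows the written value only while DH_MM is set. */
	if (next & DH_MM)
		next = (next & ~DH_RI) | (value & DH_RI);

	return next;
}

/* Original probe sequence: force the interrupt, then write zero. */
unsigned probe_original(void)
{
	unsigned csr = 0;

	csr = csr_write(csr, DH_RIE | DH_MM | DH_RI);
	csr = csr_write(csr, 0);
	return csr;
}

/* Proposed sequence: clear DH_RI first, while DH_MM is still set. */
unsigned probe_fixed(void)
{
	unsigned csr = 0;

	csr = csr_write(csr, DH_RIE | DH_MM | DH_RI);
	csr = csr_write(csr, csr & ~DH_RI);
	csr = csr_write(csr, 0);
	return csr;
}
```

In this model, probe_original() returns with DH_RI still set, while
probe_fixed() returns zero.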

Since 4.1BSD sets the silo alarm level to 16 and polls the DH/DM off the clock
at a 60-Hz rate, this bug was not immediately obvious.  (However, every once in
a while we would get a "silo overflow" message on the console, and I assume the
reason was that the DH/DM refused to interrupt.)  When I tried to modify the
driver to read characters solely on the basis of interrupts -- by setting the
silo alarm level to zero and #ifdef'ing out the "dhtimer" call in the clock
routine -- the problem became painfully apparent.

By the way, if you are using the ABLE DH/DM (as we are here at UCLA), it would
seem better to use a silo alarm level of zero and run off receiver interrupts
rather than the clock.  This is because the ABLE DH/DM reportedly has a 20-ms
delay between the time that the silo alarm level is exceeded and the time it
generates a receiver interrupt.  (It will still interrupt immediately if 16
input characters have accumulated in the silo.)  This "delay" feature would
allow a burst of input characters to be processed by a single interrupt.
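
To put rough numbers on the 20-ms delay, the number of characters one line can
deliver during a given window is easy to estimate.  The sketch below assumes 10
bit times per character (one start bit, eight data bits, one stop bit); that
framing is my assumption for the example, and the function name
chars_in_window is made up.

```c
/* Estimate how many characters arrive on one line during a delay window.
 * Assumes 10 bit times per character (start + 8 data + stop). */
double chars_in_window(long baud, double window_ms)
{
	double chars_per_sec = (double)baud / 10.0;

	return chars_per_sec * window_ms / 1000.0;
}
```

Under that assumption, a single 9600-baud line delivers about 19 characters in
20 ms, so it would reach the 16-character immediate-interrupt threshold before
the delay expires; a 1200-baud line delivers only two or three characters in
the same window.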

-- Rich Wales




More information about the Comp.unix.wizards mailing list