fix to TCP hangs and slow transfers

Charles Hedrick hedrick at topaz.RUTGERS.EDU
Mon Mar 3 15:55:54 AEST 1986


First, I should warn you that the problem I am about to describe was
observed on a Pyramid 90X.  However, a quick look at other sources
suggests that the problem is probably also present in our Sun 2.0
source and in 4.3.  So I conclude that this problem is generic to 4bsd
implementations.  The symptoms may or may not appear on other systems,
depending upon the details of how they use the variable rcv_adv.

The symptom is that connections attempting to send data from a DEC-20
or Symbolics 3600 to Unix hang.  Connections from any kind of system
may also become extremely slow (about 1000 bits/sec on an Ethernet).

I now believe that the problem is due to incorrect initialization of
rcv_adv.  This variable indicates the receive window advertised to the
other end.  However it is not a window size.  It is a sequence number,
namely the largest sequence number that the other end has ever been
authorized to send.  It is a sort of "high water mark": silly-window
prevention can cause the advertised window to shrink, and in such
cases rcv_adv does not decrease.  Except when this window shrinking
has happened, the actual advertised window size is rcv_adv - rcv_nxt.
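
To make the relationship concrete, here is a minimal sketch of how a
window size falls out of the two sequence numbers.  The function name
advertised_window and the use of a fixed-width 32-bit type for tcp_seq
are assumptions made for illustration; they are not taken from the
kernel source.

```c
#include <stdint.h>

/* Assumption: sequence numbers are 32-bit unsigned values. */
typedef uint32_t tcp_seq;

/*
 * Hypothetical helper: the window currently on offer is the distance
 * from the next expected sequence number up to the high water mark.
 * Unsigned subtraction wraps modulo 2^32, so the result is still
 * correct when rcv_adv has wrapped past zero and rcv_nxt has not.
 */
static uint32_t advertised_window(tcp_seq rcv_adv, tcp_seq rcv_nxt)
{
    return rcv_adv - rcv_nxt;
}
```

For example, with rcv_nxt = 0xFFFFF000 and rcv_adv = 0x00000800 the
subtraction wraps around and yields a window of 6144 bytes.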

Now for the bug.  rcv_adv is set in only one place, in tcp_output:

	if (SEQ_GT(tp->rcv_nxt+win, tp->rcv_adv))
		tp->rcv_adv = tp->rcv_nxt + win;

This works fine, except for the first time.  rcv_adv is initialized to
zero.  Unfortunately, sequence numbers are compared using modulo
arithmetic, so some sequence numbers (those with the high-order bit
set) compare as less than zero.  If a connection has such "negative"
sequence numbers, then this test always fails, and rcv_adv is never
updated.  rcv_adv is used in only one place, in tcp_output, to
calculate when to issue window updates.
For connections that have bad values of rcv_adv, the effect can be
missing window updates.  If the TCP implementation on the other end is
correct, it will eventually issue a probe, and the connection will be
restarted.  However, such connections may be mysteriously slow.  If
the TCP implementation at the other end does not issue zero-window
probes (TOPS-20), or issues them incorrectly (apparently Symbolics;
there is some evidence that their probe has a data length of zero),
then the connection will simply hang.  Different Unix versions may use
slightly different tests for when to do window updates, so the
probability of hanging will depend upon the implementation.
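
The failure of the test can be reproduced outside the kernel.  The
following is only a sketch: SEQ_GT is modelled on the usual 4bsd
comparison macro but written with fixed-width types, and the struct
tcpcb_sketch and the two helper functions are hypothetical names
introduced for this illustration.

```c
#include <stdint.h>

/* Assumption: sequence numbers are 32-bit unsigned values. */
typedef uint32_t tcp_seq;

/* Modulo comparison: a is "greater" if it is ahead of b by < 2^31. */
#define SEQ_GT(a, b) ((int32_t)((tcp_seq)(a) - (tcp_seq)(b)) > 0)

/* Hypothetical cut-down control block with just the fields we need. */
struct tcpcb_sketch {
    tcp_seq rcv_nxt;    /* next sequence number expected */
    tcp_seq rcv_adv;    /* highest sequence number advertised */
};

/* The update from tcp_output, as quoted in the message. */
static void update_rcv_adv(struct tcpcb_sketch *tp, uint32_t win)
{
    if (SEQ_GT(tp->rcv_nxt + win, tp->rcv_adv))
        tp->rcv_adv = tp->rcv_nxt + win;
}

/*
 * Start with rcv_adv at zero, as the unfixed code does, run the
 * update once, and report what rcv_adv ends up as.
 */
static tcp_seq rcv_adv_after_first_output(tcp_seq rcv_nxt, uint32_t win)
{
    struct tcpcb_sketch tp;

    tp.rcv_nxt = rcv_nxt;
    tp.rcv_adv = 0;
    update_rcv_adv(&tp, win);
    return tp.rcv_adv;
}
```

With rcv_nxt = 0x10000000 and a 4096-byte window the helper returns
0x10001000, as intended.  With rcv_nxt = 0x90000000, which is
"negative" relative to zero, the comparison fails and rcv_adv stays at
zero.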

The fix that I recommend is to change the definition of tcp_rcvseqinit
so that it initializes rcv_adv as well as rcv_nxt.

#define	tcp_rcvseqinit(tp) \
	(tp)->rcv_nxt = (tp)->irs + 1; (tp)->rcv_adv += (tp)->rcv_nxt

The obvious code would be (tp)->rcv_adv = (tp)->rcv_nxt.  However,
rcv_adv is sometimes given a non-zero value before the sequence
numbers are initialized.  Using += keeps that value as an offset from
rcv_nxt, so it seems safer to use the code above.
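
As a rough check of the fix, the sketch below applies the same
initialization and then runs the tcp_output test once.  The names
tcpcb_sketch, tcp_rcvseqinit_sketch and rcv_adv_after_fix are
hypothetical, and the 32-bit types are an assumption.  The macro is
wrapped in do { } while (0) here only so that the two assignments stay
together if it is ever used under an unbraced if; that wrapper is a
variation for this example, not part of the fix as posted.

```c
#include <stdint.h>

/* Assumption: sequence numbers are 32-bit unsigned values. */
typedef uint32_t tcp_seq;

/* Modulo comparison: a is "greater" if it is ahead of b by < 2^31. */
#define SEQ_GT(a, b) ((int32_t)((tcp_seq)(a) - (tcp_seq)(b)) > 0)

/* Hypothetical cut-down control block with just the fields we need. */
struct tcpcb_sketch {
    tcp_seq irs;        /* initial receive sequence number */
    tcp_seq rcv_nxt;    /* next sequence number expected */
    tcp_seq rcv_adv;    /* highest sequence number advertised */
};

/* Same two assignments as the posted fix, kept as one statement. */
#define tcp_rcvseqinit_sketch(tp) \
    do { \
        (tp)->rcv_nxt = (tp)->irs + 1; \
        (tp)->rcv_adv += (tp)->rcv_nxt; \
    } while (0)

/*
 * Initialize from irs with rcv_adv holding some earlier value
 * ("preset", possibly zero), apply the tcp_output update once, and
 * report the resulting rcv_adv.
 */
static tcp_seq rcv_adv_after_fix(tcp_seq irs, uint32_t preset, uint32_t win)
{
    struct tcpcb_sketch tp;

    tp.irs = irs;
    tp.rcv_nxt = 0;
    tp.rcv_adv = preset;
    tcp_rcvseqinit_sketch(&tp);

    if (SEQ_GT(tp.rcv_nxt + win, tp.rcv_adv))
        tp.rcv_adv = tp.rcv_nxt + win;
    return tp.rcv_adv;
}
```

With irs = 0x8FFFFFFF, so that rcv_nxt becomes the "negative" value
0x90000000, a 4096-byte window now moves rcv_adv to 0x90001000 whether
rcv_adv started at zero or at a smaller non-zero value such as 2048.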


