SUN-3s talking to SUN-2s with 3COM boards

Thu May 8 13:40:07 AEST 1986

NFS sends blocks which are either 4Kbytes or 8Kbytes (depending upon
the function).  At a lower level, these are turned into packets
(1.5Kbytes if you are using normal Ethernet parameters).  All of the
packets generated from a given block are queued for output at
the same time.  The result is a burst of between 3 and 6 packets with
almost no time between them.  The NFS code bypasses much of the normal
TCP/IP code, for efficiency.  The 3Com boards have only two buffers,
and they are on the board.  In order to deal with large bursts, Unix
must copy one buffer into mbufs while the other one is being filled
from the network, and it must finish this process before the next
packet shows up.  A standalone 68010 with nothing to do but empty 3Com
buffers, having zero interrupt latency, might just be able to do this.
But a 68010 running Unix certainly cannot.  The result is that at
least one of the packets in the burst is dropped.  Because of the
design of NFS, acknowledgements and retransmissions occur on the basis
of the 4K or 8K blocks, not the individual packets.  So if any one
packet is dropped, the whole burst is lost and must be retransmitted.
Thus you must receive every packet in a burst correctly.
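
The burst sizes quoted above follow from simple arithmetic.  Here is a
rough sketch in Python.  The 1500-byte figure stands in for the
"1.5Kbytes" packet size and ignores protocol headers, and the function
names are invented for this illustration; none of this is taken from
the NFS code itself.

```python
import math

# Assumption: each packet carries about 1500 bytes of block data
# (the "1.5Kbytes" figure above; protocol headers are ignored).
PACKET_BYTES = 1500

def packets_per_block(block_bytes, packet_bytes=PACKET_BYTES):
    """Number of back-to-back packets produced by one NFS block."""
    return math.ceil(block_bytes / packet_bytes)

def packets_resent_after_loss(block_bytes, packet_bytes=PACKET_BYTES):
    """Packets that must be sent again if any single packet of the
    burst is dropped: the whole block is retransmitted."""
    return packets_per_block(block_bytes, packet_bytes)

for block in (4096, 8192):
    print(block, "byte block:",
          packets_per_block(block), "packets per burst,",
          packets_resent_after_loss(block), "resent after one drop")
```

So one dropped packet costs 3 packets of retransmission for a 4K block
and 6 for an 8K block, and the retransmitted burst is just as long as
the one that failed.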

The solution is not exactly to slow down the Ethernet controller.
Rather, under version 3.0 there is a parameter you can specify in the
mount that gives a maximum block size.  You simply limit NFS to 2K
blocks.  Then its bursts are never longer than 2 packets. This
increases CPU overhead slightly, because certain processing must be
done once per block, and you are now sending at least twice as many
blocks.  It also decreases throughput slightly.  It's not clear that
either cost is really a big deal.  Limiting the block size could be
considered "slowing down the
controller", but it is probably better described as "detuning NFS".
Note that this is done for each mount.  So only mounts between
Sun-2s with 3Com boards and Sun-3s need to have this parameter.
Everything else on both machines will run as usual.
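
To get a feel for the cost of the smaller block size, here is a rough
sketch in Python that counts blocks and packets for a fixed transfer.
The 1500-byte packet size (headers ignored), the 64K transfer size and
the function name are assumptions made for the illustration, not
measurements.

```python
import math

# Assumption: each packet carries about 1500 bytes of block data.
PACKET_BYTES = 1500

def transfer_cost(total_bytes, block_bytes, packet_bytes=PACKET_BYTES):
    """Return (blocks, packets per burst, total packets) needed to
    move total_bytes using NFS blocks of block_bytes each."""
    blocks = math.ceil(total_bytes / block_bytes)
    burst = math.ceil(block_bytes / packet_bytes)
    return blocks, burst, blocks * burst

# Compare the usual block sizes with the 2K limit for a 64K transfer.
for block in (8192, 4096, 2048):
    blocks, burst, packets = transfer_cost(64 * 1024, block)
    print(block, "byte blocks:", blocks, "blocks,",
          burst, "packets per burst,", packets, "packets total")
```

Under these assumptions a 64K transfer takes 16 blocks and 48 packets
at 4K, and 32 blocks and 64 packets at 2K.  The per-block work doubles
and the packet count goes up by a third, but no burst is longer than
the two buffers on the 3Com board.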

The problem does not afflict normal TCP use because the TCP code in
the kernel isn't nearly as fast.  It generates packets one at a time,
rather than in bursts.