4.2BSD non-blocking sockets and selects
Brian Thomson
thomson at utcsrgv.UUCP
Wed Dec 21 09:59:28 AEST 1983
Index: sys/uipc_socket.c h/socketvar.h 4.2BSD
Description:
If you do a select() for writing on a non-blocking
SOCK_STREAM socket, and there is some send queue buffer space
available, it will tell you the socket can be written.
But sosend() insists that all writes to non-blocking sockets
be atomic, and will return EWOULDBLOCK if there is not enough
buffer space for the entire write to go in one shot.
This behaviour is OK for non-stream sockets, but streams
should allow partial writes. A couple of distributed utilities
agree with me ...
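For comparison, here is roughly what a user-level writer that is
prepared for partial writes looks like. This is only a sketch against a
modern POSIX system (O_NONBLOCK set via fcntl(), where 4.2BSD code would
use the FIONBIO ioctl); it is not the rlogind or telnetd source, and the
names set_nonblocking(), write_some(), wait_writable() and send_all()
are made up for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: put fd into non-blocking mode. Returns 0 or -1. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Hypothetical helper: one write attempt that tolerates a partial write.
 * Returns the number of bytes accepted (possibly fewer than len),
 * 0 if the socket would block, or -1 on any other error.
 */
ssize_t write_some(int fd, const char *buf, size_t len)
{
    ssize_t n = write(fd, buf, len);
    if (n < 0 && (errno == EWOULDBLOCK || errno == EAGAIN))
        return 0;
    return n;
}

/* Sleep in select() until fd is reported writable. Returns 0 or -1. */
int wait_writable(int fd)
{
    fd_set wfds;
    FD_ZERO(&wfds);
    FD_SET(fd, &wfds);
    /* EINTR is treated as an error here to keep the sketch short. */
    return select(fd + 1, NULL, &wfds, NULL, NULL) < 0 ? -1 : 0;
}

/* Push the whole buffer out, taking whatever each write will accept. */
int send_all(int fd, const char *buf, size_t len)
{
    size_t off = 0;
    while (off < len) {
        ssize_t n = write_some(fd, buf + off, len - off);
        if (n < 0)
            return -1;
        if (n == 0) {               /* no room: wait for select() */
            if (wait_writable(fd) < 0)
                return -1;
            continue;
        }
        off += (size_t)n;           /* partial progress is kept */
    }
    return 0;
}
```

A loop of this shape only sleeps properly if a "writable" answer from
select() means the next write will accept at least one byte. Under the
behaviour described above, wait_writable() would return at once while
write_some() kept returning 0, so send_all() would spin.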
Repeat-by:
Both rlogind(8C) and telnetd(8C) are prepared for partial
socket writes. Try this:
% rlogin localhost
< message of the day >
% cat /usr/dict/words
<blah>
<blah>
<blah>
~^Z (i.e. suspend the rlogin locally)
Stopped
% jobs
[1] Stopped rlogin localhost
%
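An iostat at this point will show (unless you happen to exactly fill
the send queue) that your system is being eaten alive by rlogind.
It is spinning: select() says the socket is writable, the write
fails with EWOULDBLOCK, and it goes straight back to select().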
Fix:
Allow partial writes to non-blocking sockets unless the
underlying protocol is atomic. This is consistent with the
behaviour of non-blocking ttys, which are a good model
for stream-oriented sockets.
In file /sys/h/socketvar.h, change:
#define sosendallatonce(so) \
(((so)->so_state & SS_NBIO) || ((so)->so_proto->pr_flags & PR_ATOMIC))
to
#define sosendallatonce(so) \
((so)->so_proto->pr_flags & PR_ATOMIC)
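To see what the macro change does in isolation, here is a small
user-space model of the two predicates. The struct definitions are cut
down stand-ins for the kernel ones, the flag values are illustrative
only, and the _old/_new suffixes are added here just so both versions
can sit side by side.

```c
#include <stdio.h>

/* Illustrative flag values only; the real ones live in the kernel headers. */
#define SS_NBIO   0x100
#define PR_ATOMIC 0x01

/* Stand-ins carrying only the fields the macro looks at. */
struct protosw { int pr_flags; };
struct socket  { int so_state; struct protosw *so_proto; };

/* Before the change: any non-blocking socket must send all at once. */
#define sosendallatonce_old(so) \
    (((so)->so_state & SS_NBIO) || ((so)->so_proto->pr_flags & PR_ATOMIC))

/* After the change: only atomic protocols must send all at once. */
#define sosendallatonce_new(so) \
    ((so)->so_proto->pr_flags & PR_ATOMIC)

/* Print both answers for one socket. */
void show(const char *what, struct socket *so)
{
    printf("%-22s old=%d new=%d\n", what,
           sosendallatonce_old(so) != 0, sosendallatonce_new(so) != 0);
}
```

The only case whose answer changes is a non-blocking socket on a
non-atomic (stream) protocol; blocking stream sockets and all sockets
on atomic protocols are treated as before.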
In file /sys/sys/uipc_socket.c, routine sosend(), diff -c shows:
***************
*** 281,286
register int space;
int len, error = 0, s, dontroute;
struct sockbuf sendtempbuf;
if (sosendallatonce(so) && uio->uio_resid > so->so_snd.sb_hiwat)
return (EMSGSIZE);
--- 287,293 -----
register int space;
int len, error = 0, s, dontroute;
struct sockbuf sendtempbuf;
+ int sentsome = 0;
if (sosendallatonce(so) && uio->uio_resid > so->so_snd.sb_hiwat)
return (EMSGSIZE);
***************
*** 324,329
goto release;
}
mp = &top;
}
if (uio->uio_resid == 0) {
splx(s);
--- 331,337 -----
goto release;
}
mp = &top;
+ sentsome = 1;
}
if (uio->uio_resid == 0) {
splx(s);
***************
*** 336,342
if (space <= 0 ||
sosendallatonce(so) && space < uio->uio_resid) {
if (so->so_state & SS_NBIO)
! snderr(EWOULDBLOCK);
sbunlock(&so->so_snd);
sbwait(&so->so_snd);
splx(s);
--- 344,353 -----
if (space <= 0 ||
sosendallatonce(so) && space < uio->uio_resid) {
if (so->so_state & SS_NBIO)
! if(sentsome)
! { splx(s); goto release; }
! else
! snderr(EWOULDBLOCK);
sbunlock(&so->so_snd);
sbwait(&so->so_snd);
splx(s);
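The decision the patched test makes on each pass through the send loop
can be modelled outside the kernel as follows. This is a sketch of the
logic only, assuming the macro change is also in place so that "atomic"
means PR_ATOMIC alone; next_action() and the enum names are invented
for the illustration and do not exist in the kernel.

```c
#include <stdio.h>

/* Hypothetical outcomes of one pass through the space check. */
enum send_action {
    SEND_NOW,          /* room available: copy data and hand it down   */
    WAIT_FOR_SPACE,    /* blocking socket: sbwait() and try again      */
    FAIL_EWOULDBLOCK,  /* non-blocking, nothing sent yet               */
    RETURN_PARTIAL     /* non-blocking, some data already sent         */
};

/*
 * space    - free bytes in the send queue
 * resid    - bytes the caller still wants to write
 * nbio     - nonzero if the socket is non-blocking (SS_NBIO)
 * atomic   - nonzero if the protocol is atomic (PR_ATOMIC)
 * sentsome - nonzero if an earlier pass already sent data
 */
enum send_action
next_action(int space, int resid, int nbio, int atomic, int sentsome)
{
    if (space <= 0 || (atomic && space < resid)) {
        if (nbio)
            return sentsome ? RETURN_PARTIAL : FAIL_EWOULDBLOCK;
        return WAIT_FOR_SPACE;
    }
    return SEND_NOW;
}
```

For example, a non-blocking stream write of 500 bytes with 100 bytes of
space gets SEND_NOW on the first pass; on the next pass the queue is
full and sentsome is set, so the call returns the partial count with no
error instead of EWOULDBLOCK.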
Reservation:
You should probably HOLD OFF installing this change until
it gets batted about the net a bit. The original behaviour appears
to have been quite deliberate, and although I do think it's wrong,
I'd like to give someone in the know a chance to explain the
unobvious reason that it was right in the first place!
--
Brian Thomson, CSRG Univ. of Toronto
{linus,ihnp4,uw-beaver,floyd,utzoo}!utcsrgv!thomson