4.2BSD non-blocking sockets and selects

Brian Thomson thomson at utcsrgv.UUCP
Wed Dec 21 09:59:28 AEST 1983


Index: sys/uipc_socket.c h/socketvar.h 4.2BSD

Description:
	If you do a select() for writing on a non-blocking 
    SOCK_STREAM socket, and there is some send queue buffer space
    available, it will tell you the socket can be written.
    But sosend() insists that all writes to non-blocking sockets
    be atomic, and will return EWOULDBLOCK if there is not enough
    buffer space for the entire write to go in one shot.

	This behaviour is OK for non-stream sockets, but streams
    should allow partial writes.  A couple of distributed utilities
    agree with me ...
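
	The mismatch can be modelled in a few lines of user-level C.
    This is only a sketch: select_says_writable() and old_nbio_send()
    are hypothetical names, not kernel routines, and each reduces the
    behaviour described above to a single test on the free send queue
    space.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical model: select() for writing succeeds as soon as any
 * send queue space is free.
 */
int select_says_writable(int space)
{
	return space > 0;
}

/*
 * Hypothetical model of the unpatched non-blocking send: the whole
 * request must fit in the free space, otherwise nothing is queued.
 * Returns the bytes accepted, or -1 with errno set to EWOULDBLOCK.
 */
int old_nbio_send(int space, int resid)
{
	assert(resid > 0);
	if (space <= 0 || space < resid) {
		errno = EWOULDBLOCK;
		return -1;
	}
	return resid;
}
```

    With 100 bytes free and a 1024-byte write pending, the first
    routine says "go ahead" and the second refuses, which is the
    disagreement reported here.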

Repeat-by:
	Both rlogind(8C) and telnetd(8C) are prepared for partial
    socket writes.  Try this:
	% rlogin localhost
	< message of the day >
	% cat /usr/dict/words
	<blah>
	<blah>
	<blah>
	~^Z		(i.e. suspend the rlogin locally)
	Stopped
	% jobs
	[1] Stopped		rlogin localhost
	% 
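    An iostat at this point will show (unless you happen to exactly fill
    the send queue) that your system is being eaten alive by rlogind:
    select() keeps reporting the socket writable, each write fails with
    EWOULDBLOCK, and rlogind goes straight back to select().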
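
	For reference, a caller that is "prepared for partial socket
    writes" might look roughly like the sketch below.  It is written
    against the modern POSIX interface, not copied from rlogind or
    telnetd, and the helper names (set_nonblocking, wait_writable,
    write_some) are made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/socket.h>	/* for callers that create the socket */
#include <sys/types.h>
#include <unistd.h>

/* Put a descriptor into non-blocking mode; returns 0 or -1. */
int set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL, 0);

	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}

/* Block in select() until fd is reported writable; returns 1 or -1. */
int wait_writable(int fd)
{
	fd_set wfds;

	FD_ZERO(&wfds);
	FD_SET(fd, &wfds);
	return select(fd + 1, NULL, &wfds, NULL, NULL);
}

/*
 * Try to write len bytes and report how many were accepted.
 * Returns 0 if nothing could be written right now, -1 on a real error.
 * A short count is a partial write: the caller keeps the unsent tail
 * and retries it after the next wait_writable().
 */
long write_some(int fd, const char *buf, long len)
{
	ssize_t n;

	assert(len > 0);
	n = write(fd, buf, (size_t)len);
	if (n >= 0)
		return (long)n;
	if (errno == EWOULDBLOCK || errno == EAGAIN || errno == EINTR)
		return 0;
	return -1;
}
```

    A caller alternates wait_writable() and write_some(), advancing its
    buffer by the count returned.  If the kernel answers every such
    write with EWOULDBLOCK while select() still says "writable", this
    loop never sleeps and never makes progress.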

Fix:
	Allow partial writes to non-blocking sockets unless the
    underlying protocol is atomic.  This is consistent with the
    behaviour of non-blocking ttys, which are a good model
    for stream-oriented sockets.
    In file /sys/h/socketvar.h, change:

	#define	sosendallatonce(so) \
            (((so)->so_state & SS_NBIO) || ((so)->so_proto->pr_flags & PR_ATOMIC))
  
    to
	#define	sosendallatonce(so) \
            ((so)->so_proto->pr_flags & PR_ATOMIC)


    In file /sys/sys/uipc_socket.c, routine sosend(), diff -c shows:

	***************
	*** 281,286
		register int space;
		int len, error = 0, s, dontroute;
		struct sockbuf sendtempbuf;
	  
		if (sosendallatonce(so) && uio->uio_resid > so->so_snd.sb_hiwat)
			return (EMSGSIZE);

	--- 287,293 -----
		register int space;
		int len, error = 0, s, dontroute;
		struct sockbuf sendtempbuf;
	+ 	int sentsome = 0;
	  
		if (sosendallatonce(so) && uio->uio_resid > so->so_snd.sb_hiwat)
			return (EMSGSIZE);
	***************
	*** 324,329
				goto release;
			}
			mp = &top;
		}
		if (uio->uio_resid == 0) {
			splx(s);

	--- 331,337 -----
				goto release;
			}
			mp = &top;
	+ 		sentsome = 1;
		}
		if (uio->uio_resid == 0) {
			splx(s);
	***************
	*** 336,342
			if (space <= 0 ||
			    sosendallatonce(so) && space < uio->uio_resid) {
				if (so->so_state & SS_NBIO)
	! 				snderr(EWOULDBLOCK);
				sbunlock(&so->so_snd);
				sbwait(&so->so_snd);
				splx(s);

	--- 344,353 -----
			if (space <= 0 ||
			    sosendallatonce(so) && space < uio->uio_resid) {
				if (so->so_state & SS_NBIO)
	! 				if(sentsome)
	! 					{ splx(s); goto release; }
	! 				else
	! 					snderr(EWOULDBLOCK);
				sbunlock(&so->so_snd);
				sbwait(&so->so_snd);
				splx(s);
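
	The effect of the change can be sketched as a user-level model.
    This is not the kernel code: patched_nbio_send() is a hypothetical
    name, "atomic" stands in for the PR_ATOMIC test in
    sosendallatonce(), the send queue is assumed not to drain during
    the call, and the EMSGSIZE check is left out.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical model of the patched non-blocking send path.
 * space  - free send queue space on entry
 * resid  - bytes the caller wants to write
 * atomic - nonzero if the protocol requires all-or-nothing sends
 * Returns the bytes queued, or -1 with errno set to EWOULDBLOCK.
 */
int patched_nbio_send(int space, int resid, int atomic)
{
	int sentsome = 0;
	int queued = 0;

	assert(resid > 0);
	while (resid > 0) {
		int chunk;

		if (space <= 0 || (atomic && space < resid)) {
			if (sentsome)
				break;		/* partial write: report progress */
			errno = EWOULDBLOCK;
			return -1;		/* nothing queued at all */
		}
		/* Queue as much as currently fits. */
		chunk = resid < space ? resid : space;
		space -= chunk;
		resid -= chunk;
		queued += chunk;
		sentsome = 1;
	}
	return queued;
}
```

    In this model a stream socket with 100 bytes free accepts 100 bytes
    of a 1024-byte write and returns, while an atomic protocol in the
    same position still gets EWOULDBLOCK.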



Reservation:
	You should probably HOLD OFF installing this change until
    it gets batted about the net a bit.  The original behaviour appears
    to have been quite deliberate, and although I do think it's wrong,
    I'd like to give someone in the know a chance to explain the
    unobvious reason that it was right in the first place!
-- 
			Brian Thomson,	    CSRG Univ. of Toronto
			{linus,ihnp4,uw-beaver,floyd,utzoo}!utcsrgv!thomson


