unix domain loop or mbuf loss

Alex White arwhite at watmath.UUCP
Tue Feb 21 17:54:53 AEST 1984


Subject: Loss of mbufs - or - system hang
Index:	sys/uipc_usrreq.c 4.2BSD

Description:
	Tearing down queued connections in the Unix domain upon a soclose
	is done incorrectly.  As distributed, the kernel loops forever and
	the machine hangs; the fix distributed in:
		From: madden at sdccsu3.UUCP (Jim Madden)
		Newsgroups: net.bugs.4bsd
		Subject: 4.2 IPC machine hang
		Article-I.D.: sdccsu3.1238
		Posted: Mon Nov  7 03:09:21 1983
		Organization: U.C. San Diego, Computer Center
	will fix the hang, but will cause you to lose 3 mbufs for every
	queued connection which hasn't yet been accepted when you do
	the soclose.
Repeat-By:
	First do a netstat -m and note the mbuf counts.  Run the following
	in the background:
	#include <stdio.h>
	#include <sys/types.h>
	#include <sys/socket.h>
	#include <sys/un.h>

	main()
	{
		struct sockaddr_un address;
		int s;

		s = socket(AF_UNIX, SOCK_STREAM, 0);
		address.sun_family = AF_UNIX;
		strcpy(address.sun_path, "xxx");
		if(bind(s, &address, sizeof(address.sun_family) + strlen(address.sun_path)) < 0) {
			perror("bind");
			exit(1);
		}
		listen(s, 5);
		pause();
	}
	Then run as many instances of the following as you want - 8 seems
	to be the max.
	#include <stdio.h>
	#include <sys/types.h>
	#include <sys/socket.h>
	#include <sys/un.h>

	main()
	{
		struct sockaddr_un address;
		int s, i;

		s = socket(AF_UNIX, SOCK_STREAM, 0);
		address.sun_family = AF_UNIX;
		strcpy(address.sun_path, "xxx");
		if((i = connect(s, &address, sizeof(address.sun_family) + 
			strlen(address.sun_path))) < 0) {
			perror("connect");
			exit(1);
		}
	}
	Kill the first process.  Do a netstat -m and compare with the first
	output; you will find that there are 8 extra mbufs allocated to
	socket structures, 8 to protocol control blocks, and 8 to socket
	addresses.
	(Note - if you didn't put in the fix from madden at sdccsu3
	you will loop in the kernel)
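	The two programs above can also be folded into one routine for a
	present-day POSIX system.  This is only a sketch and not part of
	the original report: the name queue_unaccepted, the MAX_CLIENTS
	limit and the use of non-blocking client sockets (so that a full
	listen queue cannot stall the caller) are all assumptions made
	for illustration.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

#define MAX_CLIENTS 16	/* illustrative limit, not from the report */

/*
 * Hypothetical helper: listen on a Unix-domain stream socket at `path`,
 * queue up to `n` connections that are never accepted, then close the
 * listener while they are still queued.  Returns the number of
 * connections that were queued, or -1 if the listener could not be set up.
 */
int queue_unaccepted(const char *path, int n)
{
	struct sockaddr_un address;
	int clients[MAX_CLIENTS];
	int listener, queued = 0, i;

	if (n > MAX_CLIENTS)
		n = MAX_CLIENTS;

	memset(&address, 0, sizeof(address));
	address.sun_family = AF_UNIX;
	strncpy(address.sun_path, path, sizeof(address.sun_path) - 1);
	unlink(path);	/* remove a stale socket file, if any */

	listener = socket(AF_UNIX, SOCK_STREAM, 0);
	if (listener < 0)
		return -1;
	if (bind(listener, (struct sockaddr *)&address, sizeof(address)) < 0 ||
	    listen(listener, 5) < 0) {
		close(listener);
		unlink(path);
		return -1;
	}

	for (i = 0; i < n; i++) {
		clients[i] = socket(AF_UNIX, SOCK_STREAM, 0);
		if (clients[i] < 0)
			continue;
		/* Non-blocking, so a full queue makes connect fail, not wait. */
		fcntl(clients[i], F_SETFL, O_NONBLOCK);
		if (connect(clients[i], (struct sockaddr *)&address,
		    sizeof(address)) == 0)
			queued++;
	}

	close(listener);	/* the close with connections still queued */

	for (i = 0; i < n; i++)
		if (clients[i] >= 0)
			close(clients[i]);
	unlink(path);
	return queued;
}
```

	Calling queue_unaccepted("xxx", 8) from a main() and comparing
	netstat -m before and after is the same experiment as above;
	how many of the connects actually queue depends on the system's
	handling of the listen backlog.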
Fix:
	madden's fix was totally wrong - the other protocols all free up
	the socket in their cleanup routines - some to a disastrous extent,
	such as udp_usrreq, which for PRU_ABORT does a sofree right
	after invoking in_pcbdetach (which also does one), and does so
	before doing a soisdisconnected on the socket it just freed!
	(I suspect you should delete the sofree call).
	However, for the problem described above, first take out the
	fix from madden at sdccsu3.
	Then in unp_drop in uipc_usrreq.c, change
		unp_disconnect(unp);
	to
		unp_detach(unp);
		sofree(unp->unp_socket);
	**DISCLAIMER: This works and fixes the above bug.  I haven't the
	foggiest idea whether it will break various other things; for
	example, the flow of control is different in the above routines
	for datagram service, which has a different set of queues, and I
	really don't know if it will blow it for them.  If somebody has a
	better fix please send it to me.
