stopped jobs don't always disappear (after logout)

Wayne Berke berke at csd2.UUCP
Thu Nov 13 08:25:00 AEST 1986


Most UNIX kernels that support job control have a provision
for cleaning up stopped jobs when the user insists on logging
out without taking care of them.  4.2BSD does this within exit()
by checking whether any of the exiting process's children are stopped.
If so, it sends each such child a SIGHUP followed by a SIGCONT.
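
The sequence described above can be imitated from user space.  The
sketch below is not the 4.2BSD kernel code; it is a user-level analogue
that assumes a POSIX waitpid(), and the helper name hup_if_stopped is
made up for illustration.  It only shows the order of the two signals
and why the SIGCONT is needed: a SIGHUP sent to a stopped process stays
pending until the process is continued.

```c
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not kernel code: if the child `pid` has
 * stopped, send it SIGHUP followed by SIGCONT.
 * Returns 1 if the signals were sent, 0 otherwise.
 */
int hup_if_stopped(pid_t pid)
{
	int status;

	/* A pid <= 0 would address a process group, not one child. */
	assert(pid > 0);

	/* WUNTRACED reports stopped children; WNOHANG avoids blocking. */
	if (waitpid(pid, &status, WUNTRACED | WNOHANG) != pid)
		return 0;
	if (!WIFSTOPPED(status))
		return 0;	/* the child terminated rather than stopped */

	kill(pid, SIGHUP);	/* stays pending while the child is stopped */
	kill(pid, SIGCONT);	/* resumes the child so SIGHUP is delivered */
	return 1;
}
```

A child with the default SIGHUP disposition dies as soon as it is
continued.  A child that ignores SIGHUP simply resumes, which is the
case discussed next.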

A problem occurs when a program does a fork/exec and the parent
ignores SIGHUP during its wait().  If such a job is suspended,
peculiar things happen when the user logs out.  When the login
shell exits, the kernel notices the stopped child and sends it a
SIGHUP, which is ignored.  The subsequent SIGCONT simply changes the
process state from stopped to blocked in wait().  The child's child is
never sent any signal.  Both processes are left hanging around: the
parent is blocked waiting for a child that is still suspended.  The
simple program below will exhibit this behavior:

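	#include <signal.h>
	#include <sys/wait.h>
	#include <unistd.h>

	int main(void)
	{
		int status;

		if (fork() == 0) {
			/* child: cat reads the terminal until EOF */
			execl("/bin/cat", "cat", (char *)0);
			_exit(1);	/* reached only if the exec fails */
		} else {
			/* parent: ignore SIGHUP while waiting for the child */
			signal(SIGHUP, SIG_IGN);
			wait(&status);
		}
		return 0;
	}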

Unfortunately, there is a commonly used utility which does exactly
this, namely f77.  The f77 driver for our system (maybe this has been
changed in the 4.3 distribution?) has the parent ignore SIGHUP, SIGQUIT,
SIGINT, and SIGTERM each time it does a wait().  Thus, when a careless
user suspends an f77 job with CTRL-Z and logs out after ignoring csh's
"You have stopped jobs" message, the f77 and whatever child it has
spun off remain and take up space in the proc table until the system
crashes or someone explicitly sends them a SIGKILL.  I guess the answer
is not to ignore SIGHUP.  By contrast, cc only ignores SIGTERM and SIGINT
when it waits, so this doesn't occur.
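
A minimal sketch of that answer follows, assuming a POSIX system.  It
is not the source of cc or f77; the helper name run_and_wait is invented
here.  The parent ignores only SIGINT and SIGTERM while it waits, as cc
is described as doing, and leaves SIGHUP at whatever disposition it
already had, so the SIGHUP sent at logout can still kill the parent.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run `path` with `argv` and wait for it.
 * Only SIGINT and SIGTERM are ignored during the wait; SIGHUP is
 * deliberately left alone.  Returns the child's exit status, or -1
 * if the fork failed or the child did not exit normally.
 */
int run_and_wait(const char *path, char *const argv[])
{
	void (*old_int)(int);
	void (*old_term)(int);
	int status = 0;
	pid_t pid, w;

	assert(path != 0 && argv != 0);

	pid = fork();
	if (pid < 0)
		return -1;
	if (pid == 0) {
		/* child: still has the original signal dispositions */
		execv(path, argv);
		_exit(127);	/* reached only if the exec fails */
	}

	old_int = signal(SIGINT, SIG_IGN);
	old_term = signal(SIGTERM, SIG_IGN);

	/* Retry if the wait is interrupted by a caught signal. */
	do {
		w = waitpid(pid, &status, 0);
	} while (w < 0 && errno == EINTR);

	/* Restore the dispositions that were in effect before the wait. */
	signal(SIGINT, old_int);
	signal(SIGTERM, old_term);

	if (w != pid || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status);
}
```

If this parent is suspended and the user logs out, the SIGHUP is
delivered once the SIGCONT arrives and the parent dies instead of going
back to its wait.  The stopped grandchild still receives no signal of
its own, so this covers only the parent's half of the problem.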


