improved 4.2BSD signal(2) library routine

Robert Elz kre at mulga.SUN
Mon Dec 26 13:56:28 AEST 1983


A warning to anyone who uses either of the routines posted
to the net to simulate the old signal handling style on 4.2.

Those routines assume that whenever a signal is received with
the pc pointing at a CHMK (trap) instruction, a system call
must have been interrupted.  It is possible (though not likely)
that the process was just about to execute the system call when
the signal occurred, and the system call could be one of those
which is not supposed to be interrupted (one for which EINTR
could never normally occur).  Thus, some system calls might end
up returning EINTR when they should not.

This is not likely, and is certainly no worse than many other
race conditions.  I guarantee that any program attempting to
catch and handle signals in pre-4.2 BSD Unix will have other
problems, more likely to occur, and with worse effects.
(Excluding programs using the 4.1 jobs library, with which it was
almost possible to survive in simple cases, with a lot of care.)

As an example of the type of problem that the old signal mechanism
causes, which can't be avoided ...

We have a rather sluggish, local Australian 68K system.
On that, it's easy to log yourself out by pressing the 'DEL'
(interrupt) key twice in succession.  The reason, of course, is
that the first SIGINT has been delivered to your shell (Bourne
shell, but that is irrelevant) but the shell hasn't been swtch'd
to yet (or perhaps it has, but hasn't had time to reset the
handling of SIGINT).  The second interrupt signal finds that the
handler for SIGINT is SIG_DFL, and the shell is killed.  Bye bye!
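
The window described above can be sketched in C.  This is only an
illustration, not the code of any shell: it assumes a POSIX system
with sigaction(2), and uses the SA_RESETHAND flag to imitate the old
reset-on-delivery behaviour.  The names old_style_signal, is_default
and on_signal are made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t caught = 0;
static volatile sig_atomic_t saw_default_in_handler = 0;

/* Hypothetical helper: install `handler` so that the disposition
 * falls back to SIG_DFL as soon as the signal is delivered, which
 * imitates the old signal(2) semantics. */
static int old_style_signal(int signo, void (*handler)(int))
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESETHAND | SA_NODEFER;
    return sigaction(signo, &sa, NULL);
}

/* Returns 1 if the current disposition of signo is SIG_DFL,
 * 0 if it is anything else, -1 on error. */
static int is_default(int signo)
{
    struct sigaction cur;

    if (sigaction(signo, NULL, &cur) != 0)
        return -1;
    return cur.sa_handler == SIG_DFL;
}

static void on_signal(int signo)
{
    /* RACE WINDOW: from delivery until the re-install below, the
     * disposition is SIG_DFL.  A second signal arriving here gets
     * the default action, which for SIGINT kills the process. */
    if (is_default(signo) == 1)
        saw_default_in_handler = 1;

    old_style_signal(signo, on_signal);   /* window closes here */
    caught++;
}
```

Installing on_signal with old_style_signal() and sending the signal
once leaves saw_default_in_handler set, showing that the handler was
entered with the default action already in place.  On a slow machine
that gap is wide enough for a second keypress to land in it.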

While some of you may be able to accommodate such effects,
and explain them to your users without being thrown bodily out
of the room, I cannot.  Since this is a problem in the
definition of the signal routines, ONLY incompatibility
with existing programs can fix it.  (I do admit that a
way could have been found to allow the old handling in
parallel with the new, for ease of transition, but that
tends to mean lack of transition, & I'm glad it wasn't done.)
And as with the filesystems, once you are going to make
something incompatible, it's best to make it VERY incompatible,
and attempt to get it right, once and for all.

Note, there are still some problems with the signal handling.
E.g., a system call that does a slow write won't be restarted
if some data has been transferred, and there's no way to determine
how much data was written before the system call was interrupted.
(This problem occurs in the old signal handling routines too.)
However, this is now basically a problem of implementation; it
should be possible to fix it sometime without changing the
definitions again.
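
For comparison, here is a sketch of how a caller can cope once the
implementation does report partial transfers.  It assumes POSIX
write(2) behaviour, where a write interrupted after moving some data
returns the short count rather than failing with EINTR; that is an
assumption about later systems, not a description of 4.2BSD.  The
name write_all is made up for the example.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: keep writing until all `len` bytes have been
 * transferred.  Returns 0 on success, -1 on error.  In either case
 * *written holds the number of bytes actually transferred. */
static int write_all(int fd, const void *buf, size_t len, size_t *written)
{
    const char *p = buf;
    size_t done = 0;

    while (done < len) {
        ssize_t n = write(fd, p + done, len - done);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any data moved */
            *written = done;
            return -1;          /* real error; report progress so far */
        }
        done += (size_t)n;      /* short write: carry on from here */
    }
    *written = done;
    return 0;
}
```

The caller always learns how far the transfer got, which is exactly
the information that is missing in the case described above.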

Thank heaven that there is someone out there with the
bravery to correct the problem, and try to get it done right.

Robert Elz,	Comp Sci, Univ of Melbourne.		decvax!mulga!kre

A merry Christmas and a happy new year to you all.


