Can a process stop with a locked inode?

Chris Torek torek at elf.ee.lbl.gov
Fri Jun 28 22:06:06 AEST 1991


In article <19416 at rpp386.cactus.org> jfh at rpp386.cactus.org
(John F Haugh II) writes:
>... The reason that a level like "PZERO" exists is to distinguish between
>things that can happen real fast (short term sleeps) versus real slow
>(longer term sleeps).  PINOD is as low as it is to insure that the
>sleeping process 1) isn't interrupted (including SIGSTOP or SIGTSTP)
>and 2) gets the CPU back. ...

This is largely true, but changing.

SunOS borrowed an idea from System V; the current 4BSD kernel has a
rather different approach to the same problem.

In Version 6 and many of its derivatives, the `sleep priority' is
intimately tied to the action the kernel takes when a signal is delivered
to a sleeping process.  One of two things happens:

	a. priority < PZERO: the signal is OR'ed into the `pending signals'
	   mask.  The process continues sleeping.
	b. priority > PZERO: the signal is OR'ed in, but the sleep is
	   aborted and the process is resumed via a longjmp to u.u_qsave.
	   Normally this goes back to code in syscall() that returns EINTR.

(I cannot recall offhand the boundary condition for PZERO itself.)
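The two cases can be modelled in user space.  The sketch below is only
an illustration, not kernel source: model_sleep(), model_syscall(),
pending_sigs, and the value chosen for PZERO are all assumptions made
for the example, and u_qsave is an ordinary jmp_buf standing in for
u.u_qsave.

```c
#include <errno.h>
#include <setjmp.h>

#define PZERO 25                /* illustrative value only */

static jmp_buf u_qsave;         /* stands in for u.u_qsave */
static unsigned pending_sigs;   /* stands in for the pending-signals mask */

/* Model of a sleep during which signal `sig' arrives. */
static void model_sleep(int pri, int sig)
{
    pending_sigs |= 1u << sig;  /* both cases: OR the signal in */
    if (pri > PZERO)
        longjmp(u_qsave, 1);    /* case b: abort the sleep */
    /* case a: keep sleeping; the model just returns as if woken later */
}

/* Model of syscall(): 0 on normal completion, EINTR if the sleep aborted. */
static int model_syscall(int pri, int sig)
{
    if (setjmp(u_qsave))
        return EINTR;           /* reached only via the longjmp above */
    model_sleep(pri, sig);
    return 0;
}
```

In this model, model_syscall(PZERO - 5, 2) returns 0 and
model_syscall(PZERO + 5, 3) returns EINTR; in both cases the signal bit
is left set in pending_sigs.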

Thus, you *can* sleep at priority > PZERO with something locked, provided
you arrange in advance to catch longjmp()s out of sleep().
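A rough user-space sketch of `arranging in advance' follows.  It is a
model under stated assumptions, not code from any kernel: locked_wait(),
model_sleep(), model_syscall(), and inode_locked are hypothetical names,
PZERO has an illustrative value, and copying a jmp_buf with memcpy() is
assumed to work (it does on common platforms, but the C standard does
not promise it).

```c
#include <errno.h>
#include <setjmp.h>
#include <string.h>

#define PZERO 25                /* illustrative value only */

static jmp_buf u_qsave;         /* stands in for u.u_qsave */
static int inode_locked;        /* stands in for a lock flag on an inode */

/* Model of sleep(): a signal at priority > PZERO aborts via longjmp. */
static void model_sleep(int pri, int signalled)
{
    if (signalled && pri > PZERO)
        longjmp(u_qsave, 1);
}

/* Sleep with the lock held, catching the longjmp so we can unlock. */
static void locked_wait(int pri, int signalled)
{
    jmp_buf outer;

    memcpy(outer, u_qsave, sizeof(outer));      /* save caller's catch point */
    inode_locked = 1;
    if (setjmp(u_qsave)) {
        inode_locked = 0;                       /* unlock on the abort path */
        memcpy(u_qsave, outer, sizeof(outer));  /* restore catch point */
        longjmp(u_qsave, 1);                    /* continue unwinding */
    }
    model_sleep(pri, signalled);
    memcpy(u_qsave, outer, sizeof(outer));      /* normal path: restore */
    inode_locked = 0;                           /* and unlock */
}

/* Model of syscall(): 0 on normal completion, EINTR if aborted. */
static int model_syscall(int pri, int signalled)
{
    if (setjmp(u_qsave))
        return EINTR;
    locked_wait(pri, signalled);
    return 0;
}
```

Either way out of locked_wait(), inode_locked ends up 0.  Without the
inner setjmp(), the abort path would skip the unlock entirely.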

Newer kernels often have something called PCATCH: if you set PCATCH
on a call to sleep(), a signal never causes a longjmp() out; instead,
sleep() returns EINTR.  A `successful' sleep returns 0.  In the last SunOS I recall
(3.5? 3.2? something like that), longjmp() out of sleep still occurred
in some cases (that is, the mechanism was the union of all past approaches).

The current 4BSD kernel has taken a more radical step.  u.u_qsave is
gone.  *All* sleep calls are uninterruptible unless you set PCATCH.
Thus, if you do not set PCATCH, sleep returns zero; if you do, you must
check for error returns.  All functions unwind the call stack in the
usual way, and it is now impossible to longjmp() past an unlock.  There
are no (zero, none) calls to setjmp() or longjmp() in the kernel.
(Actually, sleep() has been replaced with tsleep(), which also takes a
timeout.)  Perhaps surprisingly, this generally speeds up system calls,
as the setjmp() in syscall() was relatively expensive and usually
unnecessary.
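The return-value style can be modelled the same way.  This is a
simplified sketch under assumptions: model_tsleep() is a hypothetical
stand-in that does not take the real tsleep() arguments (there is no
wait channel and no timeout), and locked_wait(), inode_locked, and the
value given for PCATCH are illustrative only.

```c
#include <errno.h>

#define PCATCH 0x100            /* illustrative flag value */

static int inode_locked;        /* stands in for a lock flag on an inode */

/*
 * Model of the newer sleep: returns 0 on a normal wakeup, or EINTR if
 * a signal arrived and the caller asked for PCATCH.  Without PCATCH
 * the signal does not interrupt the sleep at all.
 */
static int model_tsleep(int pri, int signalled)
{
    if (signalled && (pri & PCATCH))
        return EINTR;
    return 0;
}

/* Sleep with the lock held; the unlock sits on the only way out. */
static int locked_wait(int pri, int signalled)
{
    int error;

    inode_locked = 1;
    error = model_tsleep(pri, signalled);
    inode_locked = 0;           /* always reached: nothing jumps past it */
    return error;               /* caller checks this if it set PCATCH */
}
```

Here locked_wait(10, 1) returns 0 and locked_wait(10 | PCATCH, 1)
returns EINTR, and in both cases the lock is released by ordinary
control flow.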
-- 
In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 415 486 5427)
Berkeley, CA		Domain:	torek at ee.lbl.gov


