When does ( alarm(1) == alarm(INFINITY) ) ?

Radford Neal radford at calgary.UUCP
Fri Mar 8 06:34:43 AEST 1985


> We came across this little kernel bug the other day when trying
> to figure out why certain programs were hanging on a simple
> "sleep(1);" statement.  Although this happens on System V, I've
> been told this problem is common to most non-BSD UNIXes.
> 
>    sleep(arg)	/* A simplified view of the sleep subroutine */
>    {
> 	...
> 	alarm(arg);	/* sets  p_clktim=arg  in proc table */
> 	...		/* a critical time */
> 	pause();	/* waits for SIGALRM: --pp->p_clktim == 0 */
>    }
> 
>    If an alarm(1) is executed *just* prior to a one second time
>    tick, and if the time tick occurs before the pause(), then the
>    pp->p_clktim value hits zero in clock() before the pause() is
>    done, and the alarm signal will be missed by the pause.  This
>    results in an INFINITE sleep.  If the process is suspended for
>    more than one second prior to the pause, then alarms longer
>    than one second could hang, too.

I wrote the following fudge routines to handle a similar problem
under 4.1 BSD. As written, the code works only on the VAX, though
adapting it to other machines would probably be possible. The 4.2 BSD
signal facilities allow a cleaner (though less efficient) solution.


/* PAUSE_FUDGE - Fudge routines to fix up pause race problem */

/* This module allows someone to wait until some condition has been
   made true by an interrupt routine without wasting CPU time in a
   polling loop.

   The Unix 'pause' system call suspends a process until an
   interrupt (i.e. signal) is received. It is not directly usable
   in a wait loop, however: an interrupt that makes the condition
   true may arrive after the condition is checked but before pause
   is called, and pause will then wait for a further interrupt
   that may never come. So the following wait loop may hang:

           while (!condition) pause();

   The following wait loop is to be used instead:

           for (;;)
           { jk_set_up_pause();
             if (condition) break;
             jk_maybe_do_pause();
           }

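   The way this works is that jk_set_up_pause creates a routine
   which will perform a pause system call, and jk_maybe_do_pause
   executes this routine. The interrupt routine should be written
   to call jk_disable_pause, which changes the routine created by
   jk_set_up_pause to do nothing instead of a pause. This is done
   by overwriting a single VAX machine instruction (the two-byte
   chmk) with two nop's. Since jk_set_up_pause is called before the
   condition is checked, an interrupt arriving between the check
   and jk_maybe_do_pause disables the pause, so the process does
   not hang.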

*/

static char pause_routine[5];		/* Bytes 0-1: entry mask (zero);
					   bytes 2-4: pause call or nop's, ret */

/* Set up a routine to do a "pause" system call. */

jk_set_up_pause()
{ register char *p;
  p = &pause_routine[2];	/* skip the two-byte entry mask */
  *p++ = 0274; *p++ = 035;	/* chmk $pause */
  *p++ = 04;			/* ret */
}

/* Execute routine to do pause system call, unless it has been nop'ed out. */

jk_maybe_do_pause()
{ (*(void (*)())pause_routine)();
}

/* Disable pause call by replacing system call with nop's */

jk_disable_pause()
{ register char *p;
  p = &pause_routine[2];
  *p++ = 01; *p++ = 01;		/* nop; nop */
}


