Checkpointing and the Rollback of Processes (SUMMARY)

Mike Laman laman at ivory.SanDiego.NCR.COM
Wed Aug 31 06:13:49 AEST 1988


In article <484 at uvicctr.UUCP> rside at uvicctr.UUCP (Robert Side) writes:
>First of all. I would like to *thank* all the people that responded
>to my problem. I tried to reply to everyone but I guess I have
>not mastered the mailing program on our system yet.
	:
	:
	:
>I originally wrote on checkpointing and the rollback of processes
>
>> I have a problem I hope somebody can help me with.
>> 
>> Long Summary:
>> I would like to be able to *checkpoint* a running process
>> so that the process, which is under user control, can be rolled back to a
>> given checkpoint and restarted.
>> 
	:
	[ Deleted the rest of his "original" message ]
	[ Deleted a couple messages Robert included ]
	:

>uunet!jetson.UPMA.MD.US!john (John Owens)  writes
>
	:
	[ Deleted one suggestion from John's message to Robert ]
	:
>Otherwise, you could try to write the whole data segment out to disk to
>checkpoint and do a setjmp().  Then to rollback, you could read the
>data segment back in and longjmp().  I don't know if it would work, but
>it sounds good.
>
	:
	[ Deleted a couple messages Robert included ]
	:
I just wanted to add my two cents' worth on the subject of writing out
an arbitrary area of data in one process and reading it back in in
another process later.  It is possible.  After all, that's
how "rogue" saves a game.  I just wanted to warn you of a nonobvious
problem you can encounter.  If the area you are saving contains stdio
library data and you use the stdio library to write out the data,
you will have a problem.  When you write out the ``_iob[]'' table, the saved
copy will show that a slot (among others, of course) is in use: namely, the one
you're using to write out the data.  Eventually you'll finish writing out the
data and (as a good programmer :-)) ``fclose()'' the file.  That
frees the stdio ``_iob[]'' slot and closes the file descriptor, but be
careful when you read the data in later (probably in some other process).
The ``_iob[]'' slot was "open" at the time the data was saved, so the
restored table says it still is.  After
you have restored all the data, you need to ``fclose()'' that once-open
stream (which is really closed) that was used to write out the data, so
you can free up the slot.  Otherwise, each time you restore from a saved
image, you'll eat up another stdio ``_iob[]'' slot.  On many systems
you'll get through about 17 save/restore cycles (20 slots - 3 for stdin,
stdout, stderr).
Then your ``fopen()''s will fail because your ``_iob[]'' table is full.

And don't worry, I'm not even going to mention the lack of portability
to systems with noncontiguous data space.  Hmmm.  I guess I did.

Mike Laman

P.S.  When you think about this you really start to worry about the guts
of various libraries with their static data that is initialized only once.
You'd better hope it is initialized properly.  Example: terminfo curses -
don't play a restored game of rogue on a different terminal!  The internal
static data is for the original terminal type!  Generally speaking, you're
getting into a headache with this approach.



More information about the Comp.unix.wizards mailing list