Monitoring processes and machines (program itself and central)

Brian J. Hafner hafner at mysost.cs.wisc.edu
Sun Mar 31 11:06:52 AEST 1991


In article <3545 at inews.intel.com> khougland at sedona.intel.com writes:
>
>I'm interested in being able to keep tabs on our whole domain.  That way, when
>people log off for the day, it's usable CPU time!  The unfortunate problem is
>that sometimes the programs crash and burn by themselves and sometimes ye old
>operator does a kill -9 on them.

You may be interested in "condor" from the Univ. of Wisconsin.
A portion of the condor_intro man page:

     Condor is a facility for executing UNIX jobs on a pool of
     cooperating workstations.  Jobs are queued and executed
     remotely on workstations at times when those workstations
     would otherwise be idle.  A transparent checkpointing
     mechanism is provided, and jobs migrate from workstation to
     workstation without user intervention.  When the jobs
     complete, users are notified by mail.

Condor may be obtained via anon-ftp from shorty.cs.wisc.edu
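
Separately from Condor, for the "crashed by itself" versus "operator did a
kill -9" part of the question, here is a minimal sketch of how a supervising
program can tell the two apart on a POSIX system. It is not taken from Condor
and says nothing about how Condor does it; the function name run_and_classify
is hypothetical, and it only watches a child it started itself on the local
machine.

```python
import signal
import subprocess
import sys


def run_and_classify(argv):
    """Run argv to completion and describe how it ended (hypothetical helper)."""
    proc = subprocess.run(argv)
    rc = proc.returncode
    if rc == 0:
        return "exited normally"
    if rc < 0:
        # On POSIX, subprocess reports death by signal N as return code -N,
        # so a kill -9 shows up as -9 (SIGKILL).
        signum = -rc
        return "killed by signal %d (%s)" % (signum, signal.Signals(signum).name)
    # A positive code means the program itself chose to exit with an error.
    return "failed with exit status %d" % rc


if __name__ == "__main__":
    # Usage: python classify.py some_program arg1 arg2
    print(run_and_classify(sys.argv[1:]))
```

A program that dumps core on its own would typically show up as a signal such
as SIGSEGV or SIGBUS rather than SIGKILL, so the signal name is a rough hint
at whether the job died by itself or was killed by hand.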

Brian J. Hafner
Computer Sciences Department
University of Wisconsin - Madison
hafner at cs.wisc.edu


