Lots of weird hangs reported - could be a 3.51m bug?

Peter H. Schmidt pschmidt at athena.mit.edu
Thu Feb 7 09:04:45 AEST 1991


Recently a spate of postings has appeared detailing several people's problems
with processes dying (e.g. smgr), slowing down (wmgr), and machines hanging.
Since my machine has displayed all these symptoms, I am eager to get to the
bottom of this.  To be brief, I think we've hit on a bug in 3.51m.

winter: 3b1, 2M RAM, ICUS 2 disk, disks in external cabinet, WD2010, 3.51m,
floppy tape, periodic 9600 baud uucp on tty000, getty on ph1.

My own problem always has the same basic structure: some background program
gets weird, like smgr stopping the processing of cron jobs; then the gettys on
tty000 and ph1 become unable to answer the phone; a few minutes later, the
system clock freezes, and at this point all I can do is mouse on the windows -
typing is never echoed, and the SHFT-SUSP hotkey takes ~2 minutes to change
windows.  At this point, I have to hit the reset button, whereupon winter comes
right up and works dandy.  A look at the uucp logs and the cron logs then lets
me reconstruct (in part) how the failure happened.

Unfortunately, this behavior is maddeningly inconsistent.  It is not related
to disk I/O, or power supply voltages.  The programs fail in random order, and
it can happen in as little as 12 hours, or after over a week.  Sometimes, just
for variety, I get a panic, but never the same one.  Before 3.51m, I would get
what now seems like the same behavior, but at intervals of months.  However, I
can't say for sure that the increased frequency started with 3.51m.  It has
only gotten really bad in the past 4 months.

I tried de-dust-bunnying, tweaking the PS voltages upwards, and shutting down
my uucp polling.  Nothing has helped. The diagnostics pass flawlessly (don't
they always?).  I haven't changed the system software since I installed 3.51m.
MeterMaid always shows ample clists and serial buffers, and a decent amount of
free pages.

I wouldn't have wasted this net.bandwidth if it didn't appear from others'
postings that I may not be alone.  Anybody else out there have problems like
this, or ideas on how to fix it?

Regards -- Peter, the Often Rebooting
--
Peter H. Schmidt	| ...mit-eddie!winter!pschmidt
3 Colonial Village, #10	| winter!pschmidt at mit-eddie.mit.edu
Arlington, MA  02174	| -- Speaking for myself.
