Reliability v. Fire Risk (original v9n21)

Pete Mellor pm at cs.city.ac.uk
Sun Feb 18 13:32:53 AEST 1990


Scott Stone asks (v9n21):

> One of the companies I work with is considering turning their Suns off
> every evening, and on again in the morning.  They wish to do this in order
> to minimize the possibility of a fire...What opinion do you have about this?

The gain in safety from switching off is probably outweighed by the
inconvenience caused. This is a guess: I don't have any statistics on
fires in computing machinery, and I don't know who has. SUN would probably
be happy to tell you how safe their machines are provided you don't fool
around with 3rd party memory upgrades with which the fan can't cope (or,
worse, install your own fan, as was suggested recently in SUN-SPOTS).
Other than that, try asking a few insurance companies how they assess the
risk for computer installations.

> What percentage of people with networked Suns leave them on...?

In my limited experience, 100%. Our centre (4 machines) certainly does,
and as far as I know, so does every other department in the university
(and that's quite a few SUNs), and so do at least one large firm and two
departments in other universities with whom we work. I did see one guy
turn off the monitor alone to save the phosphor, but he'd forgotten how to
run screenblank.

The only time we switch off our network is when we have been warned by
the electricians of a scheduled loss of power during maintenance work. I
did switch off my own machine at night for a time when the fan was making
a funny noise.  I had visions of the fan packing up completely and the
machine overheating while nobody was there to spot it.

> Would turning it off every night, and on in the morning reduce the
> reliability/MTBF of the machine significantly?  

Probably yes. I don't have any data on this myself, but an ex-colleague in
the quality, reliability and statistics department of a large computer
manufacturer has been investigating the effect of various 'explanatory
variables' on the reliability of printed circuit boards. He did tell me
that a 'duty cycle' involving regular power-off showed a significant
positive correlation with PCB failure rate. (I don't know if these results
have been published.)

My understanding has always been that SUNs are designed to be permanently
powered on.

> Have you known anyone that has had a fire due to a computer, particularly
> a Sun?

See v9n20! The only serious fire in a computer installation with which I
was connected was caused by an operator on the night shift dropping a
lighted cigarette into a waste-bin.

> What pro's/con's do you see?

Each of our machines has its own hard disc, and each disc is remotely
mounted on every other machine via NFS. Bringing more than one machine
back up at once invites the dreaded 'nfs: server not responding'
deadlock, since each machine waits for discs served by the others. Also,
if any machine is off, everyone else on the network is
deprived of that machine's filestore. Add to this the fact that here we
work what is politely described as flexi-time (i.e. you've no idea when
any particular user will be sleeping off a hangover until lunchtime, or
working until 3 in the morning to make up for it), and that our central
mail server would probably not like it if it found a machine off-line when
trying to distribute overnight e-mail, and you will see that we have no
choice but to leave everything switched on.
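
The cross-mounting problem can be pictured as a dependency graph. What follows is only a rough sketch, not a description of how the mounts are actually configured: the host names are made up, and it assumes each machine blocks at boot until every server it mounts from is already up. Under that assumption, a network in which every machine mounts every other has no safe boot order, whereas a single file server does.

```python
from graphlib import TopologicalSorter, CycleError  # Python 3.9+

def boot_order(mounts):
    """Return a boot order in which every machine's NFS servers come up
    before it does, or None if the mounts form a cycle (no such order)."""
    try:
        # mounts maps each host to the set of hosts it mounts discs from
        return list(TopologicalSorter(mounts).static_order())
    except CycleError:
        return None

# Hypothetical host names, for illustration only.
hosts = ["sun1", "sun2", "sun3", "sun4"]

# Every machine mounts every other machine's disc.
full_mesh = {h: {o for o in hosts if o != h} for h in hosts}

# Alternative layout: sun1 is the only file server.
star = {"sun1": set(), "sun2": {"sun1"}, "sun3": {"sun1"}, "sun4": {"sun1"}}

print(boot_order(full_mesh))  # no valid order exists
print(boot_order(star))       # sun1 comes first
```

In practice, mount options can soften this behaviour, so the sketch shows only why a full mesh of blocking mounts is awkward to restart.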

Regularly powering down a network can *only* work if everyone works from 9
to 5 and e-mail is suitably stored until power-on time.

On the other hand, I wonder how much of the earth's resources is spent in
driving machines which spend around 75% of their time waiting for another
machine to talk to them? What is the Green Party's policy on this?  It was
with this thought in mind that I used to switch off my old ICL PERQ every
night, but that was a stand-alone machine. (It also required 2 new hard
discs in 12 months!)
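
To put a rough number on the idle-time question, the arithmetic can be sketched as below. The 75% idle fraction is the guess given above. The 200 W power draw is a placeholder assumption, not a measured figure for any Sun or PERQ, and should be replaced with a real reading before drawing conclusions.

```python
HOURS_PER_YEAR = 24 * 365

def idle_energy_kwh(power_watts, idle_fraction=0.75, hours=HOURS_PER_YEAR):
    """Energy (kWh) drawn while the machine is powered on but idle."""
    return power_watts * idle_fraction * hours / 1000.0

# ASSUMPTION: 200 W per machine is a placeholder, not a measurement.
ASSUMED_WATTS = 200
MACHINES = 4  # the size of the centre mentioned earlier

per_machine = idle_energy_kwh(ASSUMED_WATTS)
print(f"idle energy per machine per year: {per_machine:.0f} kWh")
print(f"idle energy for {MACHINES} machines: {MACHINES * per_machine:.0f} kWh")
```

This treats idle and busy power draw as equal, which is a simplification.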

I hope that this is a fair assessment, and that I don't get flamed
(metaphorically) by a lot of people who have been flamed (literally) by
SUNs!  In CSR, we're more interested in software reliability than in
boring things like the probability of the centre going up in smoke one
night. If anyone out there has any relevant data (statistical or
anecdotal), I'd be very interested to talk to you.
