Net news problems

wood at cascade
Thu Apr 17 13:14:37 AEST 1986


From: Ernest Wood <wood at cascade>

FYI:  Perhaps Brian and the experts out there are aware of this but I thought
I'd post it anyway.

The problem with news from Glacier right now is that the batch method
on Glacier proceeds host by host, with each host getting all of
its batch files before the next host in line gets any.  There is a bug
somewhere in rcp (the method used to transfer the files) such that, in some
cases, it doesn't time out but at the same time won't complete the transfer
to a particular host.  If it did time out, the shell script performing the
transfers would proceed to the next host and ignore the remaining files
destined for the apparently down machine.  Since it neither times out nor
completes, the script remains hung until the next hourly invocation
of the shell script.  At that time the hung script (but NOT the rcp) is killed
and a new one is started.  This one transfers everything it can until it too
hangs on the funny host.  And so it goes.
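
For what it's worth, here is a rough sketch (in Python, not the actual
Glacier script, which I am not reproducing here) of the behaviour one would
want: a per-transfer time limit enforced by the caller, so that a hung copy
is killed and the loop moves on to the next host.  The names send_batches,
build_command and timeout_s are made up for illustration.

```python
import subprocess


def send_batches(batches_by_host, build_command, timeout_s):
    """Send each host's batch files in turn, giving up on a host that hangs.

    batches_by_host: dict mapping host name -> list of batch file paths.
    build_command:   hypothetical callable (path, host) -> argv list for the
                     copy program; the real transfer command is an assumption.
    timeout_s:       seconds to wait for one transfer before abandoning it.
    """
    delivered = {}
    skipped = {}
    # Hosts are visited alphabetically, as the post describes.
    for host in sorted(batches_by_host):
        files = batches_by_host[host]
        delivered[host] = []
        for i, path in enumerate(files):
            try:
                # subprocess.run kills the child itself when the timeout
                # expires, so no stuck copy process is left behind.
                result = subprocess.run(build_command(path, host),
                                        timeout=timeout_s)
                ok = result.returncode == 0
            except subprocess.TimeoutExpired:
                ok = False
            if not ok:
                # Treat the host as down: skip the rest of its files and
                # carry on with the next host in the list.
                skipped[host] = files[i:]
                break
            delivered[host].append(path)
    return delivered, skipped
```

With rcp, build_command might return something like
["rcp", path, host + ":" + remote_dir], where remote_dir is again a
placeholder.  The point is only that the time limit lives in the caller
rather than relying on rcp to notice the problem itself.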

Unfortunately, navajo appears to be such a funny host and is also in
about the middle of the list.  The script sorts the list alphabetically;
otherwise it would be possible to remove part of the problem by putting
navajo last.  Of course, the other side of the problem is that Glacier slowly
accumulates a number of supposedly dead rcp's to navajo.
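
As a sketch of the "put it last" idea (again Python and again hypothetical,
since the real script just sorts alphabetically): keep the alphabetical
order, but push any host known to hang to the end so it can only delay
itself.  The function name transfer_order and the host names other than
cascade and navajo are invented for the example.

```python
def transfer_order(hosts, suspect_hosts):
    """Return hosts alphabetically, with suspect hosts moved to the end."""
    suspects = set(suspect_hosts)
    # The sort key is (is_suspect, name): False sorts before True, so
    # healthy hosts come first and each group stays alphabetical.
    return sorted(hosts, key=lambda h: (h in suspects, h))


# "alpha" and "zeta" are made-up host names.
print(transfer_order(["navajo", "zeta", "cascade", "alpha"], ["navajo"]))
```

This does nothing about the leftover rcp processes; it only keeps one bad
host from starving the hosts that sort after it.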


For some time I've noticed that this was happening to navajo, but I always
assumed it was coincidence.  Apparently it isn't.  WHY this is happening
is another question altogether.

	I just find um, I don't explain um,
		-ernie



More information about the Comp.unix.wizards mailing list