problems with "rsh tar"|dd

Mark Bartelt sysmark at physics.utoronto.ca
Fri Mar 29 06:28:55 AEST 1991


For the past several months, I've been backing up some of the disks
on our 4D/280 with a shell script that (except for some irrelevant
command-line-argument goo) is essentially ...

mt -t /dev/exabyte rew
rsh bigsys "cd /d; tar cBaf - $disks" | dd ibs=1k obs=40k of=/dev/nrexabyte
mt -t /dev/exabyte rew

The script is run on a 4D/25, which has an Exabyte drive attached to it.
On the 4D/280 ("bigsys"), the subdirectories of /d are the mount points
for most of the disks.

I've never had a problem until today.  I noticed that three of the disks,
which I had previously been backing up onto separate tapes, had an
aggregate capacity of less than one full Exabyte tape.  So, I decided
to back up all three onto a single tape.

After a couple of hours, everything hung.  No error message, nothing.
So, I tried a second time; and a third.  It hung each time.  At this
point, I thought "AHA! I've been bitten by the now-infamous
you-can't-write-more-than-2gb-to-a-socket bug that's been discussed here
before."  But, no: when everything just hangs and I type ^C at the process
on the PI, dd of course reports how much it's written.  And if you multiply
40k by the number of records, it's nowhere near 2 GB; less than 1 GB,
in fact.  The actual amount seems fairly random, but is always
between 700 MB and 900 MB.
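
To make that arithmetic concrete, here is a minimal sketch.  The record
count is made up for illustration (it is not a figure dd actually
reported); only the 40k output block size comes from the script above.

```shell
# Hypothetical example: suppose dd printed "20000+0 records out" on ^C.
records=20000                      # assumed value, not from a real run
obs_bytes=$((40 * 1024))           # obs=40k on the dd command line
total=$((records * obs_bytes))     # bytes actually written to the tape
echo "$total bytes, about $((total / 1024 / 1024)) MB"
```

With that hypothetical count the total comes to 819200000 bytes, roughly
781 MB, which is well short of the 2 GB mark.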

The other oddity is that after killing the process on the PI, the "tar"
process on "bigsys" is still running.  I'd expect that it ought to get
the socket-world equivalent of SIGPIPE or something, shouldn't it?  And
weirder yet, even a "kill -9" doesn't get rid of it.

Does anyone have a clue as to what the problem might be?

Mark Bartelt                                               416/978-5619
Canadian Institute for                            mark at cita.toronto.edu
Theoretical Astrophysics                          mark at cita.utoronto.ca



More information about the Comp.sys.sgi mailing list