Wanted: Faster Dump/Backup Procedures for bsd4.2

Chris Torek chris at umcp-cs.UUCP
Sun Jun 9 12:28:11 AEST 1985


net.news.sa is not the right place for this, so I've stuck a Followup-To
header in...

> Taking a full dump is a pain in the neck.  Our VAX/750 has 2 Eagle drives
> and one 80 Mbyte removable drive.  It takes literally a whole day to take
> a full dump of the entire system. [...]  I guess I could shut down the
> system, but I would rather do it otherwise considering that I have to
> shut down the system for a few hours.

Running dump on active file systems is not a good idea.  Dump scans the
entire file system once first to see what to do, then assumes that nothing
changes as it goes along.  Locally, we compromise by doing full backups
with the machine in single user mode and incrementals with the system
active.  At least we'll never have to go back more than two weeks....
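
A toy model shows why the two-pass approach goes wrong on a live file
system.  This is only a sketch in Python, not dump's actual code: the
"file system" is a dictionary, and plan() and copy_planned() are names
invented for the illustration.  Pass 1 records what exists; pass 2
trusts that record even though the files have moved on.

```python
def plan(fs):
    """Pass 1: record which files exist and how big they are."""
    return {name: len(data) for name, data in fs.items()}


def copy_planned(fs, planned):
    """Pass 2: copy exactly what the plan says, trusting the recorded sizes."""
    archive, problems = {}, []
    for name, size in planned.items():
        data = fs.get(name)
        if data is None:
            # File was removed after the scan.
            problems.append((name, "vanished"))
        elif len(data) != size:
            # File changed after the scan; the copy no longer matches the plan.
            problems.append((name, "size changed"))
            archive[name] = data[:size]
        else:
            archive[name] = data
    # Files created after the scan never make it into the archive at all.
    missed = sorted(set(fs) - set(planned))
    return archive, problems, missed
```

With a quiescent file system (single user mode) the plan and the copy
always agree, which is the point of taking the machine down for fulls.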

> Has anyone out there with a much faster 'dump' programs, or a faster
> way to do 'dump'?  I can see the existing 'dump' can be improved
> significantly since the Cipher Tape is not streaming and most of the
> time, the VAX is idle waiting for I/O.

Don Speck (at Caltech) and I had this one out fairly recently.  He's
got some changes to /etc/dump that make it use N processes; this does a
good job of keeping the disk and tape drives active.  I solved the same
problem a different way, by sticking a hack in the kernel that does
pseudo-asynchronous I/O on character devices.  (I call my hack the
``mass driver''.  I think I'll bring something on it to Portland.)
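
What both approaches are after is overlap: while one block is being
written to tape, the next is already being read from disk.  Here is a
minimal modern sketch of that idea, in Python with one reader thread
and a bounded queue.  It is neither Don Speck's multi-process dump
nor the mass driver; pipelined_copy, block_size, and depth are names
made up for the example, and the default sizes are arbitrary.

```python
import io
import queue
import threading


def pipelined_copy(src, dst, block_size=10240, depth=20):
    """Copy src to dst, keeping up to `depth` blocks in flight."""
    # Bounded queue: the reader stalls only when every buffer is full,
    # the writer stalls only when every buffer is empty.
    blocks = queue.Queue(maxsize=depth)
    errors = []

    def reader():
        try:
            while True:
                block = src.read(block_size)
                if not block:
                    break
                blocks.put(block)
        except Exception as exc:
            errors.append(exc)
        finally:
            blocks.put(None)  # sentinel: no more data

    # Daemon thread so a failed write cannot leave the process hanging.
    t = threading.Thread(target=reader, daemon=True)
    t.start()

    total = 0
    while True:
        block = blocks.get()
        if block is None:
            break
        dst.write(block)
        total += len(block)
    t.join()
    if errors:
        raise errors[0]
    return total


if __name__ == "__main__":
    data = bytes(range(256)) * 1000
    out = io.BytesIO()
    print(pipelined_copy(io.BytesIO(data), out), out.getvalue() == data)
```

The depth parameter plays the role of the buffering level: a deeper
queue rides out longer stalls on either side before the other side has
to wait.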

With my changes, we have cut the time to do backups literally in half
on the big 780s, and down to a third on the 750s.  (The 750s have
streamers---TU80s---which can run 4 times as fast in streaming mode;
that's what gives them their edge.)  We used to take umcp-cs down on
Wednesdays at 7:30 and have it back up by 1:30 or so; now it's back up
around 10:30 (when all goes well; occasionally one of the two tape
drives craps out).

I've also used the mass driver to make a version of ``dd'' that runs
much faster; we use this for making distribution tapes.  (Makes quite
a difference to have 20-level buffering when the load is around 7....)

For those who missed it the first time, I still have a copy of the
original mass driver distribution kit.  It's available via anonymous
FTP from host MARYLAND.ARPA (grab the file mass_driver).

By the way, I discovered (quite by accident) that after increasing
MAXBSIZE, stdio sometimes breaks on /dev/null.  There are two
fixed-size buffers (_sibuf and _sobuf) which are used for stdin and
stdout, and the stat system call puts MAXBSIZE in the st_blksize
field, so the reported block size can be bigger than those buffers.
Recompiling the C library fixes that.  (I did consider the effect of
the MAXBSIZE increase beforehand, but decided that it wouldn't hurt:
no one was going to go make 16k/2k file systems on their disks, so I
figured st_blksize would always be <= 8K.)  [I think Berkeley should do
what Sun did: remove _sobuf and _sibuf from stdio and just use malloc.]
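
The failure mode is easy to show in miniature.  The sketch below is a
Python model, not the 4.2BSD stdio source.  FIXED_BUF_SIZE stands in
for whatever size the static buffers were compiled with (8K is an
assumption here), and the function names are invented for the example.
A static buffer is safe only while st_blksize stays within it; a
buffer sized at run time, the malloc approach, has no such limit.

```python
import os

FIXED_BUF_SIZE = 8192  # assumed compiled-in size of the static buffers


def fits_static_buffer(blksize, static_size=FIXED_BUF_SIZE):
    """True if a transfer of blksize bytes fits in the static buffer."""
    return blksize <= static_size


def allocate_buffer(blksize, fallback=FIXED_BUF_SIZE):
    """Size the buffer from the reported block size, malloc-style."""
    size = blksize if blksize > 0 else fallback
    return bytearray(size)


def preferred_blksize(path):
    """Return st_blksize for path, or 0 where the platform lacks it."""
    return getattr(os.stat(path), "st_blksize", 0)
```

For instance, fits_static_buffer(16384) is false with the assumed 8K
static buffer, while allocate_buffer(16384) simply returns a 16K
buffer.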

Also by the way, Don Speck's code is available from him (I believe).
It takes a lot less work to install.  I don't know for sure how it
compares to my mass driver for performance.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 4251)
UUCP:	seismo!umcp-cs!chris
CSNet:	chris at umcp-cs		ARPA:	chris at maryland


