_slow_ rdump

Ian! D. Allen [CGL] idallen at watcgl.waterloo.edu
Mon Oct 15 13:43:37 AEST 1990


Here's stuff on dump/rdump I sent to comp.unix.ultrix last summer.
We run Ultrix 3.1 and 3.1C.

From idallen Thu Jun  7 21:42:39 1990
To: comp.unix.ultrix
Subject: Why isn't dump maximally efficient with TK70 tapes?

DECsystem 5400, Ultrix 3.1C, RA90 disk, one user (me).

Watch the elapsed real times here.

Here's a plain root dump to tape (TK70):

    # time dump 0 /
      DUMP: Date of this level 0 dump: Thu Jun  7 21:17:43 1990
      DUMP: Date of last level 0 dump: the epoch
      DUMP: Dumping /dev/rra0a (/) to /dev/rmt0h
      DUMP: Mapping (Pass I) [regular files]
      DUMP: Mapping (Pass II) [directories]
      DUMP: Estimates based on 1200 feet of tape at a density of 10240 BPI...
      DUMP: This dump will occupy 1103 (10240 byte) blocks on 0.13 tape(s).
      DUMP: Dumping (Pass III) [directories]
      DUMP: Dumping (Pass IV) [regular files]
      DUMP: 57.43% done, finished in 0:03
      DUMP: 1103 tape blocks were dumped on 1 tape(s)
      DUMP: Tape rewinding
      DUMP: Dump is done
    0% real=9:29 usr=0.3 sys=1.9 rd=0 wr=4 mem=56 pg=3 rec=17 sw=0 sig=0 cs=2776

Here's the identical root dump piped to dd to tape:

    recorder# mt rew
    recorder# time sh -c "dump 0f - / | dd bs=32k rbuf=2 wbuf=2 of=/dev/rmt0h"
      DUMP: Date of this level 0 dump: Thu Jun  7 21:28:18 1990
      DUMP: Date of last level 0 dump: the epoch
      DUMP: Dumping /dev/rra0a (/) to standard output
      DUMP: Mapping (Pass I) [regular files]
      DUMP: Mapping (Pass II) [directories]
      DUMP: Estimated 11295744 bytes output to Standard Output
      DUMP: Dumping (Pass III) [directories]
      DUMP: Dumping (Pass IV) [regular files]
      DUMP: 11295744 bytes were dumped to Standard Output
      DUMP: Dump is done
    0+2780 records in
    0+2780 records out
    4% real=3:34 usr=0.7 sys=8.7 rd=1 wr=8 mem=37 pg=2 rec=17 sw=0 sig=0 cs=10111

That's almost three times faster!  Why can't dump be as good as dd?
Dumps are of major importance; I would have thought that dump would be
the most clever user of the tape drive.  I can't believe this.  Am I
missing something?  I must be missing something.
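
For reference, the ratio behind "almost three times" can be worked out from the two real= figures above.  The following is a small Python sketch, not part of the original post; the helper names are made up for illustration, and the byte count is the one dump reported for the piped run.

```python
def to_seconds(elapsed):
    """Convert an 'm:ss' or 'h:mm:ss' elapsed time string to seconds."""
    seconds = 0
    for part in elapsed.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def speedup(slow, fast):
    """How many times faster the 'fast' run was, by elapsed real time."""
    return to_seconds(slow) / to_seconds(fast)


def throughput(nbytes, elapsed):
    """Average bytes per second over the whole run (rewind time included)."""
    return nbytes / to_seconds(elapsed)


# Figures copied from the two transcripts above.
plain_dump = "9:29"       # dump 0 /
piped_dump = "3:34"       # dump 0f - / | dd ...
piped_bytes = 11295744    # "bytes were dumped to Standard Output"

ratio = speedup(plain_dump, piped_dump)
rate = throughput(piped_bytes, piped_dump)
```

With these figures the ratio comes out at about 2.66, and the piped run averages roughly 52,800 bytes per second.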

From idallen Fri Jun  8 02:46:42 1990
Subject: Fun with dump

Ultrix dump of root to nowhere:

    bandicoot# time dump 0f - / >/dev/null
    [dump stuff deleted]
    16% real=0:22 usr=0.9 sys=2.7 rd=8 wr=4 mem=332 pg=0 rec=0 sw=0 sig=0 cs=959

Ultrix rdump of root to nowhere:

    bandicoot# time /bin/rdump -0f bandicoot:/dev/null /
    [dump stuff deleted]
    39% real=0:55 usr=2.8 sys=19.1 rd=2 wr=6 mem=282 pg=3 rec=60 sw=0 sig=0 cs=4533

Ultrix rdump of root to a real tape:

    bandicoot# time rdump -0f recorder:/dev/nrmt0h /
    [dump stuff deleted]
    [I hit break after 6 minutes when dump estimated the dump
     would take another 20 minutes]

Ultrix dump of root to rsh/dd to a tape:

    bandicoot# time dump 0f - / | rsh rec dd bs=32k rbuf=2 wbuf=2 of=/dev/rmt0h
    [dump stuff deleted]
    7% real=1:31 usr=1.0 sys=6.0 rd=2 wr=4 mem=351 pg=0 rec=3 sw=0 sig=0 cs=4900
    15% real=2:48 usr=2.5 sys=24.1 rd=15 wr=7 mem=206 pg=0 rec=3 sw=0 sig=0 cs=10300

What I learned:

    Don't use rdump.  It's an order of magnitude slower than a pipe to dd.
    In fact, even dump is slower than dump to stdout piped into dd with
    wbuf=2, because of bugs in the Ultrix nbuf code.  At least Ultrix dd
    handles multiple tapes and multi-buffer writes; isn't that convenient?

From idallen Fri Jun  8 04:19:02 1990
Subject: More fun with dump on Ultrix

You'd think that the dump command would have the smarts in it to
write tapes efficiently.  Wrong.  I wrote a simple program that reads
stdin, builds a 32K buffer, and writes it out using Ultrix
double-buffer I/O.  I used it on 198525952 bytes of /usr file system
on our DS5400, sent to a TK70 295Mb tape cartridge:

    # time sh -c "dump 0f - /usr | ./a.out >/dev/rmt0h"
    [dump info deleted]
    5% real=43:01 usr=11.3 sys=131.8 rd=4 wr=8 mem=63 pg=2 rec=18 sw=0 sig=0 cs=138864

43 minutes elapsed time.  Compare that with what the default gets you:

    # dump 0 /usr
    [dump info deleted]
    DUMP: Estimates based on 1200 feet of tape at a density of 10240 BPI...
    DUMP: This dump will occupy 19400 (10240 byte) blocks on 2.29 tape(s).

Whoops.  This dump won't even fit on the tape using the defaults.
Even if I kludged the tape size to make it seem to fit, it would still
take 3 *hours* to dump.  Ultrix dump also uses double-buffer I/O, but
it specifies 8 buffers instead of just 2.  The software release notes
for Ultrix 3.1C suggest 2 is better than more than 2, and this sure
bears that out.
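
The buffering program itself is not reproduced in this post.  As a rough illustration of the idea only (read stdin, fill a 32K buffer, write it out in one piece), here is a minimal Python sketch.  It is an assumption-laden stand-in: it does plain blocking writes and does not attempt the Ultrix double-buffer (nbuf) I/O that the original used, and the function names are invented for this example.

```python
BLOCK_SIZE = 32 * 1024  # 32K, the block size used in the post


def write_all(dst, data):
    """Write all of data to dst, retrying if a write comes back short."""
    view = memoryview(data)
    while len(view) > 0:
        written = dst.write(view)
        view = view[written:]


def reblock(src, dst, block_size=BLOCK_SIZE):
    """Copy src to dst in block_size chunks; return the number of blocks.

    A read from a pipe can return fewer bytes than requested, so input is
    accumulated until a full block is ready.  The last block may be short;
    it is written as-is, without padding (an assumption of this sketch).
    """
    buf = bytearray()
    blocks = 0
    while True:
        chunk = src.read(block_size - len(buf))
        if not chunk:  # end of input
            break
        buf += chunk
        if len(buf) == block_size:
            write_all(dst, buf)
            blocks += 1
            buf = bytearray()
    if buf:  # flush the final partial block
        write_all(dst, buf)
        blocks += 1
    return blocks
```

Used as a filter, it could be driven with unbuffered binary streams, for example reblock(os.fdopen(0, "rb", buffering=0), os.fdopen(1, "wb", buffering=0)), so that each full block reaches the output device as a single write.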

From idallen Fri Jun  8 17:08:17 1990
To: comp.unix.ultrix
Subject: More undocumented performance issues with dump

Dump to stdout (a tape):

    # time dump 0bf 32k - / >/dev/nrmt0h
      DUMP: 11318272 bytes were dumped to Standard Output
      DUMP: Dump is done
    0% real=15:34 usr=0.3 sys=1.6 rd=0 wr=4 mem=92 pg=2 rec=18 sw=0 sig=0 cs=2058

Dump to the same tape directly:

    # time dump 0bf 32k /dev/nrmt0h /
      DUMP: 345 tape blocks were dumped on 1 tape(s)
      DUMP: Dump is done
    0% real=7:16 usr=0.4 sys=1.6 rd=0 wr=4 mem=94 pg=2 rec=18 sw=0 sig=0 cs=2019

Ultrix dump assumes that any output to stdout is to a pipe; it doesn't do
the same stat() [fstat()] tests to determine device type that it does
when you give the file name on the command line.
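
To show the kind of test in question, here is a minimal sketch of classifying an already-open descriptor with fstat().  It is written in Python rather than dump's C, and it only illustrates the check; what Ultrix dump actually does with the result is not shown in this post, and classify_fd is a hypothetical name.

```python
import os
import stat


def classify_fd(fd):
    """Report what kind of object an open file descriptor refers to."""
    mode = os.fstat(fd).st_mode
    if stat.S_ISCHR(mode):
        return "character device"  # e.g. a raw tape device
    if stat.S_ISBLK(mode):
        return "block device"
    if stat.S_ISFIFO(mode):
        return "pipe"
    if stat.S_ISREG(mode):
        return "regular file"
    return "other"
```

A program that calls classify_fd(1) at startup can tell a redirect such as >/dev/nrmt0h apart from a real pipe, instead of assuming that stdout is always a pipe.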

From idallen Fri Jun  8 17:16:46 1990
To: jpe at egr.duke.edu
Subject: Re: Why isn't dump maximally efficient with TK70 tapes?

> Problem #2 -- according to my man pages for "dd" the rbuf and wbuf options
> cannot be used at the same time.  Besides, the default wbuf value is 8
> for devices that support it.

Indeed, you, the source, and the man page are correct.  The example in
section 1.1.13 of the Ultrix 3.1C release notes is wrong, and I copied
it.  Silly me -- I thought the release notes knew something the man
page did not.  The first option wins and overrides any following [rw]buf
values.  What is not wrong is the statement in 1.1.13 "to get the
most performance gain, use a value of 2 with rbuf and wbuf options".
The default 8 buffers cause *worse* performance than specifying 2.
 
> Problem #3 -- The block size you specified to "dd" was wrong.  Dump writes
> in 10k blocks, not 32k.  Also you need to specify the obs (instead of bs)
> and specify a cbs equal to the obs.  This will buffer the input to the
> output block size, then write it to tape.  Restore will read a tape
> created this way, I doubt if it can read yours.

No, I wanted to write 32k blocks; it's faster and more efficient. I write
dump tapes far more than I read them; I wanted to speed up the writing.
Restore reads such tapes just fine if unblocked first:

     # dd if=/dev/rmt0h bs=32k rbuf=2 | restore -if -

You're right about the failure to buffer up to the output buffer
size, but I don't want to pay the price of using dd to make the
conversion -- it's way too slow.  See my latest note in comp.unix.ultrix
about a little program that buffers up to 32k and writes.
 
> Problem #4 -- You should note instead the system times and percentage of
> CPU used.  On my VAXserver 3600 these times jumped dramatically in order
> to give me a few seconds real-time savings.  Also when I used a no-rewind
> device "dump" was actually faster than the "dd pipe."  On a CPU-loaded
> system you might not have such a big win..

I observed a factor of three in real-time performance; more than a few
seconds and of importance to us.  The tape rewind added 12 seconds to
any times I posted.

Looking at the source to dd, I see that if one doesn't use "bs=", it copies
the data painfully from the input buffer to the output buffer one byte at
a time.  No wonder that eats cpu.  I wrote a simple program to buffer
input and write it out in 32k chunks; this works much better than dd, but
it won't handle multiple volumes.  See my comp.unix.ultrix posting.
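
The difference between the two copying styles can be illustrated with a short sketch.  This is Python, not the dd source, and the function names are invented; it only contrasts a per-byte loop with a single bulk move of the same data.

```python
def copy_bytewise(src, dst):
    """Copy src into the front of dst one byte per loop iteration."""
    for i in range(len(src)):
        dst[i] = src[i]


def copy_bulk(src, dst):
    """Copy src into the front of dst with one slice assignment."""
    dst[:len(src)] = src


data = bytes(range(256)) * 128   # 32K of sample input
out_a = bytearray(len(data))     # output buffers, same size as the input
out_b = bytearray(len(data))
copy_bytewise(data, out_a)
copy_bulk(data, out_b)
```

Both calls leave identical buffers, but the first pays loop overhead for every byte.  Python's interpreter exaggerates the gap compared with C, so treat this as an illustration of the shape of the problem and not as a measurement.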
 
> Problem #5 -- What happens if one of your partitions becomes larger than
> a TK70?

Ultrix dd handles multi-volumes.

I think the problem is just that dump uses too many buffers and in Ultrix
3.1C that makes things worse rather than better.  Or perhaps dump's
buffers aren't aligned on page boundaries, and dd's are.  (See Ultrix
Version 3.1C Release Notes section 1.1.12.)
-- 
-IAN! (Ian! D. Allen) idallen at watcgl.uwaterloo.ca idallen at watcgl.waterloo.edu
 [129.97.128.64]  Computer Graphics Lab/University of Waterloo/Ontario/Canada


