tar or cpio, which is better?

Jay Ts jay at metran.UUCP
Sat Nov 17 17:56:22 AEST 1990


In article <1990Nov15.192615.1238 at hemel.bull.co.uk>, mpreen at hemel.bull.co.uk (Malcolm Preen) writes:
> sparks at power.viewlogic.com (Alan Sparks) writes:
> 
> >In article <1990Nov12.095657.22489 at erbe.se> prc at erbe.se (Robert Claeson) writes:
> >>Using cpio instead worked just fine. Also, for backup purposes,
> >>cpio is probably the best. It comes *standard* with the ability to
> >>detect end-of-tape and create multi-volume archives. It has better
> >>support for incremental backups and selective restores. And it supports
> >>longer paths than tar's limit of 100 characters.

I also prefer cpio, but none of these (IMO) are true for Xenix 2.2 and 2.3
systems.  On Xenix, cpio's pathnames are limited to 128 chars, and if there
are "too many [I love that explicit number :-) ] unique linked files, the
program runs out of memory to keep track of them and thereafter linking
information is lost."  Maybe even with error messages :-)

> 
> >Which version of cpio are you running?  The cpio man page on the
> >Sun here says, "cpio does not support multiple volume tapes."
> >(unquote).
> 
> On our system, Bull DPX/2-300 running a mix of system V and BSD the man page
> for cpio says :
> [discussion of -O and -I options supported on their system]

Oh dear, all this brings up a few more topics concerning cpio (and tar).

I have been trying to write a "simple" shell script to give to my clients
(mostly, well, *all* small company office workers) so they can easily back
up their systems without knowing how to use the shell.  They just select
"Filesystem Backup" from their menu and follow the instructions.

The problem is that I want the script system to run *unmodified* on any UNIX
system, be it Xenix System V 286, 386; ISC 2.0, 2.2; ESIX or whatever.  So far,
tar and cpio seem to work differently on each system!  I really prefer to
use cpio due to the portability and versatility issues, but on Xenix, for
example, the porting company just decided to support tar better than cpio.
I guess there is a preference for tar at SCO; for both Xenix 286 2.2 and
Xenix 386 2.3, cpio gets 2 manual pages and does not support *any* of the
-I, -O or -M options.  In addition, there is no mention of whether or not
it can detect EOT and support multiple volumes.  On these same systems, tar
gets 5 manual pages, along with lots of options and support for multiple
volumes (i.e., floppy disks; you specify the size of the "tape" on the
command line, so EOT detection is not necessary).

On ISC and ESIX (both very close to AT&T Sys V/386), cpio and tar get nearly
equal treatment.  They are both documented as supporting multiple volumes.
cpio is much more versatile, but due to lacking documentation, it is not
clear exactly what cpio's behaviour will be at EOT unless certain options
are used.  For example, the -I or -O option must be used with the -M option,
but this is not clearly stated.  (... I hope I am remembering this correctly
...)  For -M, the manual says "[with -O and -I] ... you can use this option to
define the message that is printed when you [sic.] reach the end of the
medium."  Well, I reached the end of *my* medium while trying to use the
-M option as in

	cpio -ocavB -M "insert disk %d: " >/dev/tape
	
!!!  The manual page does not say that -M will *not* work *without* -I or -O.
In developing my "portable" script, I was trying momentarily to not use -I
or -O to be compatible with SCO's cpio.  Fortunately, I discovered this
while using floppies; the resulting behaviour was that cpio reached the
end of the disk, printed no message, and quietly flipped back to track 0
to continue with the "backup", (overwriting what was previously written, I
guess) so it was immediately obvious to me that something was wrong.  (3 Mb
of files do not fit on a 1.2Mb floppy, do they?)
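
The arithmetic behind that last remark can be checked before any media are
written.  What follows is only a minimal sketch, assuming a POSIX shell (older
Bourne shells would need expr instead of $(( ))); the function name
volumes_needed and its arguments are hypothetical and not part of cpio or tar.
It ignores archive headers and block padding, so the result is a lower bound.

```shell
# volumes_needed TOTAL_KB VOLUME_KB
# Print how many volumes of VOLUME_KB kilobytes are needed to hold
# TOTAL_KB kilobytes of data (integer ceiling division).
# Archive headers and padding are NOT counted: treat the answer as a
# lower bound, not a promise.
volumes_needed() {
	total_kb=$1
	volume_kb=$2
	if [ "$volume_kb" -le 0 ]; then
		echo "volumes_needed: volume size must be positive" >&2
		return 1
	fi
	echo $(( ($total_kb + $volume_kb - 1) / $volume_kb ))
}

# Example: 3 Mb (3072 Kb) of files on 1.2 Mb (1200 Kb) floppies:
#	volumes_needed 3072 1200
```

If the answer is greater than 1 and the cpio in use has no documented
multi-volume support, the script can refuse to start instead of silently
wrapping around the medium.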

This brings up another issue.  Just because your documentation says cpio/tar
will support multiple volumes, do you KNOW they really do?  It's really
up to your vendor's port and whoever's device driver(s) you are using.
I commonly work on systems with, say, 40 Mb of disk usage and 60 Mb tapes.
If my client's disk usage grows, I don't really know if there will be a
message informing them that the backup will not fit on one tape.  (It works
on floppies, but what about the 3rd party QIC cartridges?)  For the
moment (and maybe forever) I do a listing of the tape, and compare it to
the list I fed cpio, as in

	cd /
	# list every file, stripping the "./" at the start of each line
	find . -print | sed 's|^\./||' >cpio.in
	cpio -ocavB <cpio.in >/dev/tape
	# list the tape just written and compare with what cpio was fed
	cpio -ictB </dev/tape >cpio.out
	diff cpio.in cpio.out >diff.out
	if [ -s diff.out ]
	then
		echo "Backup listing does not match.  Call Jay Ts for help!"
	else
		echo "Backup listing matches."
	fi
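
A plain diff also complains when the two lists merely differ in order or in
the leading "./".  A possible variation, sketched here under the assumption of
a POSIX shell with sed, sort and comm available, reports only the names that
were requested but never made it into the archive listing.  The function name
missing_from_archive and the temporary file names are hypothetical.

```shell
# missing_from_archive WANTED LISTED
# WANTED: file of path names fed to "cpio -o", one per line.
# LISTED: file of path names printed by "cpio -it".
# Prints every name in WANTED that is absent from LISTED and returns
# non-zero if there is at least one such name.
missing_from_archive() {
	wanted=$1
	listed=$2
	tmp=${TMPDIR:-/tmp}/mfa.$$
	# normalize both lists: drop a leading "./", sort, remove duplicates
	sed 's|^\./||' "$wanted" | sort -u >"$tmp.w"
	sed 's|^\./||' "$listed" | sort -u >"$tmp.l"
	# comm -23 keeps the lines that appear only in the first file
	missing=`comm -23 "$tmp.w" "$tmp.l"`
	rm -f "$tmp.w" "$tmp.l"
	if [ -n "$missing" ]; then
		echo "$missing"
		return 1
	fi
	return 0
}

# Example, using the files from the listing above:
#	missing_from_archive cpio.in cpio.out || echo "Backup is incomplete!"
```

Like the listing above, this only compares names; it says nothing about
whether the contents of the files on the tape are readable.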

There are also incompatibilities between ISC and SCO concerning what output
is considered "error output" for the verbose listing, if I remember correctly
from experience.  I wanted to include an example here to illustrate, but don't
remember the problem exactly, and reading the documentation just now got me
very confused.  It is not explicitly defined.  The funny (?) part of this is
that all of these (Xenix vs. ISC/ESIX) are sold (more or less) as "System V".
I'm not even touching upon Berkeley/SunOS or whatever.

And what about the "portable" script I was writing?  Well, I gave up on that
idea!  I am now writing custom scripts (from scratch, each time) for each
of my clients.  It is actually a lot easier to support 10 fully-customized
scripts than one "portable" one which is mostly "if then" statements, shell
variable defines and other cryptic stuff that is only there to support nine
other systems.

These are clearly issues for the standards bodies.  Until then, my best advice
is to decide for yourself which of cpio and tar works best for you and your
system, and try to be consistent!  If another S.A. asks for something on tape,
ask him/her which format to use, and clearly mark on any tape you write what
the format is, maybe also providing a suggestion such as:

	"to restore: cd [dirname]; cpio -icdvmkB </dev/tape"

(NOTE: If you're making a tape for me, please, if possible, use cpio with the
-c option!)

Also, I just want to mention that I feel that there are serious deficiencies
in both cpio and tar, and there is work to be done to bring them even up to
the quality of common MS-DOS backup utilities!  I hate MS-DOS just as much
as the rest of you, but when talking with my associates who use MS-DOS, I
have to yield to them on this one.  The major problems (as I see them):

	1. Lack of verification - I can do a listing of a cpio tape,
	   but that only checks the headers, not the files' contents.
	   Both cpio and tar seem to have been implemented with the
	   assumption that magnetic media are perfect!  This should
	   be listed in the BUGS section.  Really!  Does AT&T know
	   that we are using these programs for backups? :-)

	2. Speed - The UNIX-variants I've used are incredibly slow
	   writing to floppies with cpio.  I know UNIX is inherently
	   slower due to double-copying of buffers between kernel and
	   user space, but that just does not explain the difference!
	   (If cpio is so slow because it is actually verifying writes,
	   it's sure not documented.  It also would make me wonder why
	   I get so many bad archives on Verbatim Teflon-coated, pre-
	   formatted floppies!)
	
	3. User interface - need I say more???  You know as well as I
	   do (I hope) what happens if you are on the 21st of a 23-floppy
	   cpio, you get prompted "Insert disk 22: ", and you hit Enter
	   *twice* by mistake.  Errrrr!!!  (Also, be forewarned that on
	   Sys V/386, doing a backup through the sysadm menus will store
	   the files on tape with ABSOLUTE PATHNAMES.  Need I say more?)
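
For problem 1, one workaround is to restore the archive into a scratch
directory and compare every file with the original, byte for byte.  The
following is a minimal sketch only, assuming a POSIX shell and cmp, enough
free disk space to hold a second copy, and a list of relative path names such
as the cpio.in file used earlier.  The function name verify_restored is
hypothetical.  Files that changed after the backup was written will be
reported as differing.

```shell
# verify_restored SRC_DIR RESTORED_DIR LIST
# For each regular file named in LIST (paths relative to SRC_DIR),
# compare the original with the restored copy byte for byte.
# Prints the name of every file that is missing or differs and
# returns non-zero if any were found.
verify_restored() {
	src=$1
	restored=$2
	list=$3
	bad=0
	while IFS= read -r name; do
		# only regular files are compared; skip directories, devices
		[ -f "$src/$name" ] || continue
		if [ ! -f "$restored/$name" ]; then
			echo "MISSING: $name"
			bad=1
		elif ! cmp -s "$src/$name" "$restored/$name"; then
			echo "DIFFERS: $name"
			bad=1
		fi
	done <"$list"
	return $bad
}

# Example, after restoring the tape into /scratch:
#	verify_restored / /scratch /cpio.in || echo "Backup failed verification"
```

This reads the whole archive back, so it roughly doubles the time a backup
takes, which does nothing for problem 2.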

Well, this posting turned out to be quite long!
I hope I've come close to putting an end to this issue, rather than causing
more confusion (or starting another thread!).  If not, I apologize in
advance...

				Jay Ts, Director
				Metran Technology
				uunet!pdn!tscs!metran!jay



More information about the Comp.unix.admin mailing list