SVR2/SunOS 3.5 cpio(1) a major disaster + some fixes
Ian Donaldson
rcodi at yabbie.rmit.oz
Tue Aug 2 19:33:45 AEST 1988
Description:
cpio sometimes silently refuses to link destination files back
together. This happens with the -p and -i options, and the
-o option can generate an incorrect archive.
At the end of this report is a list of other problems with cpio,
some major, some minor.
Versions:
The cpio supplied with SunOS 3.5 exhibits the fault. The cpio
supplied with the SVR2 Vax distribution also exhibits the fault.
Repeat by:
On BSD fast-filesystems, this is easy; go to a filesystem that has more
than 32768 inodes (practically any large one), and do this:
rm -fr a b c
cat </dev/null > a
ln a b
mkdir c
ls a b | cpio -pdmv c
ls -lg c
This -may- fail if "a" has an inode number > 32767.
What happens is that the directory "c" ends up with two separate
files "a" and "b" that have the same contents but are not linked
together. cpio also doesn't report that it linked them
(because it didn't).
On a System-V filesystem, you need to find a filesystem that has
more than 32767 inodes currently allocated (not as common), and
do the same thing.
This fails more on BSD because of the way the inodes are allocated;
higher i-numbers are commonplace. On the System-V filesystem,
inodes are allocated at the lower end of the scale.
Fix:
Not trivial. Best fix is to "rm /usr/bin/cpio" :-)
A partial fix that will work with high probability is to change
the type of "m_ino" from "short" to "ino_t" in struct "ml", routine
postml(). This allows inodes up to number 65535 to be handled 100%.
Above that you will get unpredictable failures: two multiply-linked
inodes whose inode numbers share the same lower 16 bits end up
being linked when they shouldn't be.
(I strongly suspect that SVR3 cpio has the above partial fix)
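The arithmetic behind the failure and the partial fix can be sketched
in a few lines of C. This is only an illustration: it assumes the saved
"m_ino" is compared against an unsigned 16-bit inode field from the
header, and the function names are hypothetical, not taken from the
cpio source.

```c
#include <stdint.h>

/*
 * Illustration only; names are hypothetical.
 * Assumption: the link table keeps the inode number in a signed
 * 16-bit "m_ino" while the header carries it as unsigned 16 bits.
 */

/* Original behaviour: does a saved inode number still match itself? */
static int short_field_matches(uint32_t ino)
{
    /* Narrowing to a signed type is implementation-defined; on the
     * usual two's complement machines it wraps, so 40000 becomes
     * -25536. */
    int16_t  m_ino = (int16_t)ino;
    uint16_t h_ino = (uint16_t)ino;

    /* Both operands promote to int, so a negative m_ino can never
     * equal the positive h_ino. */
    return m_ino == h_ino;
}

/* After the partial fix only the low 16 bits take part, so two
 * different inodes can look like the same file. */
static int low16_matches(uint32_t a, uint32_t b)
{
    return (uint16_t)a == (uint16_t)b;
}
```

Under these assumptions short_field_matches(40000) is false, which is
the "never linked" case, and low16_matches(5, 65541) is true, which is
the "linked when they shouldn't be" case.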
A proper fix is more difficult because the cpio(5) format only
has a 16-bit field for the inode-number, whereas BSD systems
(at least) have 32-bit inode numbers.
My workaround is to use the cpio(5) inode-field only when a file has
multiple links (ie: st_nlink > 1), and generate a unique number
bearing no relationship to the original inode number. If the
file has a single link or is a directory, the cpio(5) inode
number field would be zero.
Since cpio only uses the inode-number field when nlinks > 1, this
should not pose a problem, and gives a maximum of 65534 multiply-linked
files per archive, a limit that probably won't be hit quickly.
(if there is a cpio that uses the inode-number field all the time,
then this would be broken, but I cannot see why it should do so)
(I strongly suspect that SVR3 cpio doesn't have this fix, since
the inode numbers in the archive seem to match the originals
on singly-linked files)
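A minimal sketch of that workaround follows. It is not the code I am
running; the names (link_tab, archive_ino) are made up for the example,
ids are simply handed out as 1, 2, 3, ... up to 65534, and the table is
searched linearly for brevity. What to put in the device field is not
covered.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names throughout; a sketch, not the real cpio code. */
#define MAX_LINK_ID 65534u

struct link_ent { uint32_t dev, ino; uint16_t id; };
struct link_tab { struct link_ent *ents; size_t n, cap; };

/*
 * Value to store in the archive's inode field:
 *   0        for directories and singly-linked files,
 *   1..65534 a per-archive id shared by all names of one file,
 *   -1       if the table is full or memory runs out.
 */
static long archive_ino(struct link_tab *t, uint32_t dev, uint32_t ino,
                        unsigned nlink, int is_dir)
{
    size_t i;

    if (is_dir || nlink <= 1)
        return 0;

    /* Seen this (dev, ino) pair before?  Reuse its id. */
    for (i = 0; i < t->n; i++)
        if (t->ents[i].dev == dev && t->ents[i].ino == ino)
            return t->ents[i].id;

    if (t->n >= MAX_LINK_ID)
        return -1;

    if (t->n == t->cap) {
        size_t ncap = t->cap ? t->cap * 2 : 64;
        struct link_ent *p = realloc(t->ents, ncap * sizeof *p);
        if (p == NULL)
            return -1;
        t->ents = p;
        t->cap = ncap;
    }

    t->ents[t->n].dev = dev;
    t->ents[t->n].ino = ino;
    t->ents[t->n].id = (uint16_t)(t->n + 1);
    return t->ents[t->n++].id;
}
```

Start with a zeroed struct link_tab and call archive_ino() once per
file name while writing the archive; two names for the same file get
the same id, and inodes 70000 and 4464 (same lower 16 bits) get
different ones.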
The SVR2 cpio source is a complete disaster, and should be totally
rewritten from scratch (if it hasn't already been done in SVR3).
There are more bugs in it than you can poke a stick at!
Other-bugs:
Among other things wrong with -this- cpio are:
(I'm not necessarily speaking of the SunOS cpio here, but I know some
of these bugs do apply to it)
- disk errors could easily result in an archive being corrupted
because the file-size is written in the header based
on what stat(2) says, but the copy routine gives up copying
the file when it comes across a bad read from disk,
and doesn't pad the file out on the archive to its
stated size. This means that the header for the file
is incorrect, and reading the archive back will get cpio
very confused.
- somebody truncating a file that is being dumped (eg: via
creat() or truncate()) could result in a corrupt archive,
making cpio absolutely useless for dumping live filesystems
(I suspect this might have been fixed in SVR3)
- cpio will restore the modification time of a destination
file during '-p' or '-i' commands, even if the file was
not successfully restored. This makes it damn difficult
to redo the restore/copy and find out which files weren't
restored (eg: filesystem full, you end up with lotsa
inodes of zero length with the original mod-dates!).
- having errno in an error message is absurd
(this seems to have been eradicated in SVR3, and in
SunOS 3.5)
- many arrays can be indexed out of bounds because of a lack
of bounds checking (eg: you supply more than 100 patterns
on the command line, or a file name that exceeds 256
characters). Both of these situations could cause
a coredump.
- the number of reels of tape is limited by a bug:
/dev/tty is reopened, but never closed, each time input
is requested when changing reels. On many systems,
this limit is around 16, if NOFILE=20.
- there are calls to utime(2) using the stat structure
directly, which is incorrect on many (eg: BSD) systems,
resulting in the inode mtime being reset to the epoch
if the -a option is used. This is because the stat structure
is different.
This bug isn't present in SunOS 3.5 cpio.
- attempting to dump files with inode numbers > 32767 or
NFS inodes will result in a corrupt archive if using -c,
due to sign extension in bintochar() causing the octal
numbers to be written wider than their header fields.
(NFS inodes have dev being negative; i-numbers > 32767
are negative).
This bug isn't present in SunOS cpio.
- writing an archive with files owned by "nobody" (uid = -2)
will create a corrupt archive when using -c.
This bug IS present in SunOS 3.5 cpio.
- opening of /dev/tty isn't even checked for success, resulting
in a probable coredump if you run it from cron and multi-reels
are required. (SVR3 cpio has options to specify a
substitute for /dev/tty anyway, which would allow it to
run from cron ok, and I strongly suspect that this
has been fixed in SunOS 3.5, by the looks of the error
messages in the binary)
- restoring files from an archive into directories that are
read-only in the archive is impossible unless
you don't restore the directories. This is because the
mode of the directory is set before the files are extracted.
This bug IS present in SunOS 3.5 cpio.
(could be tricky to fix, unless you keep a list of
directories and set all the modes for them at the end
of the run; or chmod the directories if you need
to write into them. Tar typically has this problem too,
when you use its 'p' option)
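The sign-extension items above (i-numbers > 32767, negative NFS dev,
uid = -2) can be illustrated with a small sketch. It assumes a 32-bit
int and that each -c header field is written as six octal digits with
something like "%.6o"; that format string and the function names are
my assumptions for the example, not a quote from bintochar().

```c
#include <stdio.h>

/* Illustration only; assumes 32-bit int and a "%.6o" style field. */

/* A negative short is sign-extended to int before formatting, so the
 * field comes out 11 digits wide instead of 6.  Returns the width. */
static int octal_field_buggy(char *buf, size_t len, short v)
{
    return snprintf(buf, len, "%.6o", (unsigned int)(int)v);
}

/* Masking to 16 bits first keeps the field at 6 digits. */
static int octal_field_masked(char *buf, size_t len, short v)
{
    return snprintf(buf, len, "%.6o", (unsigned int)v & 0xFFFFu);
}
```

With v = -2 (the "nobody" uid) the first routine produces
37777777776 and the second 177776; everything after the overlong
field is shifted, which is what corrupts the archive.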
All in all, cpio is a total mess and should be totally rewritten or replaced.
Beware if you use this as your only backup mechanism.
I would be interested if anybody with the "latest" cpio (SVR3?) could
peek at the source and comment on which of the above bugs remain.
(I don't have access to SVR3 sources)
Ian D