sparse files

Chris Torek chris at mimsy.umd.edu
Wed Dec 13 17:45:59 AEST 1989


In article <244 at estinc.UUCP> fnf at estinc.UUCP (Fred Fish) writes:
>The necessary changes to add preservation of sparseness (or creation of
>sparseness from nonsparse files) are fairly trivial and can be probably
>be added to cp, tar, cpio, etc in a matter of a few minutes.  Here is the
>relevant code from BRU (Backup and Restore Utility) with some minor changes
>to simplify variable names:
>
>	if (!allnulls (buffer, nbytes)) {
>	    iobytes = write (fildes, buffer, nbytes);
>	} else {
>	    if (lseek (fildes, nbytes, 1) != -1) {
>		iobytes = nbytes;
>	    } else {
>		bru_message (MSG_SEEK, fname);
>		iobytes = write (fildes, buffer, nbytes);
>	    }
>	}

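This code is not sufficient.  In particular, a file that ends in
an `allnulls' block will come out too short: an lseek() past the end
of a file does not change its size, only a later write() does, so
the final seek is simply lost.  In older Unix systems, the following
is required: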

	while (there are more blocks) {
		read this block, handling any errors;
		if (it is all nulls)
			nullblocks++;
		else {
			if (nullblocks) {
				(void) lseek(fd, nullblocks * blocksize, 1);
				nullblocks = 0;
			}
			write this block, handling any errors;
		}
	}
	if (nullblocks) {
		(void) lseek(fd, nullblocks * blocksize - 1, 1);
		err = write(fd, "", 1) != 1;
		if (err)
			handle error;
	}

On newer systems, the file can be made to end in a hole by using
ftruncate().  If ftruncate() is actually fsetsize(), i.e. it sets the
file size and so can grow a file as well as shrink it (SunOS and some
other systems), the last section can be replaced by

	if (nullblocks) {
		if (ftruncate(fd, lseek(fd, 0L, 1) + nullblocks*blocksize))
			handle error;
	}

If ftruncate() does what its name claims to do (truncate only), the file
can still be made to end in a hole:

	if (nullblocks) {
		long newoff = lseek(fd, nullblocks * blocksize, 1);
		err = write(fd, "", 1) != 1;
		if (err || ftruncate(fd, newoff))
			handle error;
	}
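
If it is not known ahead of time which kind of ftruncate() is present,
the two variants can be combined.  The following is a sketch under my
own assumptions, not code from BRU or any system utility: the name
end_in_hole() is made up, `hole' is the number of trailing null bytes
(nullblocks * blocksize above), and fd is assumed to be a freshly
written regular file positioned at its current end.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Make the file open on fd end in a hole of `hole' bytes starting
 * at the current offset.  Returns 0 on success, -1 on error.
 */
static int
end_in_hole(int fd, off_t hole)
{
	struct stat st;
	off_t newoff;

	if (hole <= 0)
		return 0;
	newoff = lseek(fd, hole, SEEK_CUR);
	if (newoff == (off_t)-1)
		return -1;

	/* First try: an ftruncate() that can grow the file. */
	if (ftruncate(fd, newoff) == 0 && fstat(fd, &st) == 0 &&
	    st.st_size == newoff)
		return 0;

	/*
	 * Fallback for a truncate-only ftruncate(): write one byte
	 * just past the desired end, then cut it off again.
	 */
	if (write(fd, "", 1) != 1)
		return -1;
	return ftruncate(fd, newoff);
}
```

The fstat() check is there because a truncate-only ftruncate() might
return 0 without having grown the file.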

Note, however, that the 4.2BSD and 4.3BSD `restore' programs have the
very same bug that this article is about: if a file ends in a hole,
the restored version of the file will have the trailing hole omitted.
For this reason, the first version---write(fd,"",1)---may be preferable.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris at cs.umd.edu	Path:	uunet!mimsy!chris


