holes in files

Jim Balter jim at segue.segue.com
Thu Dec 27 04:40:28 AEST 1990


In article <11749 at alice.att.com> andrew at alice.att.com (Andrew Hume) writes:
>In article <2809 at cirrusl.UUCP>, dhesi%cirrusl at oliveb.ATC.olivetti.com (Rahul Dhesi) writes:
>actually, from what this thread has
>uncovered, it might be safer to write non-zero data to avoid
>smart filesystems. what scares me more are hyperintelligent
>disk drives that have built in data compression and might be able
>to take 20 blocks of some values but not be able to overwrite them
>because of different compression rates.

Obviously, the worst thing you can do is write zeros.  Write random data.
Better than running a random number generator on the fly is to precompute a
block of data that looks like noise (there are various statistical measures of
randomness, i.e., lack of signal).  While this isn't guaranteed to defeat all
compression schemes, it greatly reduces the likelihood of too few blocks being
allocated.  When the odds of that happening are on a par with the odds that a
plane will crash through the roof and destroy the disk drive, you can sleep
better at night.  Also, if you are writing critical real-time applications,
your hardware and OS are significant parts of the system and should be
carefully specified so that they do not violate your requirements.
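
A minimal sketch of the precomputed-noise approach, in C.  The names
fill_noise and preallocate_noise, the 8192-byte block size, and the xorshift
generator are all assumptions made for illustration; any block that passes
your preferred randomness tests would do as well.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

#define NOISE_BLOCK 8192  /* assumed block size; tune for the target system */

/* Fill buf with pseudo-random bytes from a small xorshift generator.
 * A stand-in for "data that looks like noise", not a vetted source. */
static void fill_noise(unsigned char *buf, size_t len, uint32_t seed)
{
    uint32_t x = seed ? seed : 2463534242u;  /* xorshift state must be nonzero */
    for (size_t i = 0; i < len; i++) {
        x ^= x << 13;
        x ^= x >> 17;
        x ^= x << 5;
        buf[i] = (unsigned char)(x >> 24);
    }
}

/* Write nblocks copies of one precomputed noise block to path.
 * Returns 0 on success, -1 on any error (out of space included). */
static int preallocate_noise(const char *path, size_t nblocks)
{
    static unsigned char block[NOISE_BLOCK];

    assert(path != NULL);
    fill_noise(block, sizeof block, 12345u);  /* computed once, reused below */

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    for (size_t i = 0; i < nblocks; i++) {
        size_t done = 0;
        while (done < sizeof block) {  /* handle short writes */
            ssize_t n = write(fd, block + done, sizeof block - done);
            if (n <= 0) {
                close(fd);
                return -1;
            }
            done += (size_t)n;
        }
    }
    if (fsync(fd) != 0) {  /* push the data out before trusting the result */
        close(fd);
        return -1;
    }
    return close(fd);
}
```

Note that repeating one block many times would not defeat a scheme that
detects duplicate blocks; varying the seed per block is one way around that,
at the cost of generating data on the fly.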


Some people seem to think, though, that it is better to have inefficient disk
drives or archivers than to break programs such as yours.

st_blocks exists so that programs (e.g., du, ls) can determine actual disk
usage.  It isn't for any other purpose; it is silly to try to imagine such
purposes, and it is foolish, if you can think of one, to implement it.
I'm sure that, if the designer had thought of it and had thought it necessary,
s/he would have added something like "st_blocks is not a permanent attribute
of a file; it may, for instance, change if a file is archived or is treated
by a disk compacter." to the documentation.  Pretend that this was said.
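
For reference, a rough sketch of the kind of use st_blocks is meant for,
i.e., reporting how much disk a file occupies right now.  The function names
are hypothetical, and the 512-byte unit is an assumption that holds on most
systems but should be checked against your own documentation.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Bytes of disk backing the file at this moment, as du would report.
 * Assumes st_blocks counts 512-byte units.  Returns -1 if stat fails. */
static long long disk_usage_bytes(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) != 0)
        return -1;
    return (long long)st.st_blocks * 512LL;
}

/* Logical length of the file in bytes, holes included.
 * Returns -1 if stat fails. */
static long long logical_size_bytes(const char *path)
{
    struct stat st;

    assert(path != NULL);
    if (stat(path, &st) != 0)
        return -1;
    return (long long)st.st_size;
}
```

Either number is a snapshot.  The first may differ after the file has been
archived and restored; the second should not.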

Programs that read the disk directly are bypassing file logical structure
and have no right to make any kind of assumption about the persistence of
file attributes.

As a general principle, people and programs care about files with holes
turning into out-of-space conditions, but conversely they have no reason to
object if out-of-space conditions turn into files with holes.

Archivers that restore with holes are doing it right.  They acknowledge that
disk space matters (welcome to the real world) and that the presence of holes
is invisible within UNIX file semantics except for st_blocks, which is a
reported value and not a permanent or persistent attribute of a file
(welcome to conceptual clarity).
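
To make the point concrete, here is a sketch of how a restore might recreate
holes: seek over any block that is all zeros instead of writing it.  The
names copy_sparse and restore_sparse and the 4096-byte block size are
assumptions for illustration, not taken from any particular archiver.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define COPY_BLOCK 4096  /* assumed copy granularity */

static int all_zero(const unsigned char *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        if (buf[i] != 0)
            return 0;
    return 1;
}

/* Copy in to out, seeking over all-zero blocks rather than writing them.
 * Returns 0 on success, -1 on error. */
static int copy_sparse(int in, int out)
{
    unsigned char buf[COPY_BLOCK];
    ssize_t n;

    assert(in >= 0 && out >= 0);
    while ((n = read(in, buf, sizeof buf)) > 0) {
        if (all_zero(buf, (size_t)n)) {
            /* Leave a hole: advance the offset without writing. */
            if (lseek(out, (off_t)n, SEEK_CUR) == (off_t)-1)
                return -1;
        } else {
            size_t done = 0;
            while (done < (size_t)n) {  /* handle short writes */
                ssize_t w = write(out, buf + done, (size_t)n - done);
                if (w <= 0)
                    return -1;
                done += (size_t)w;
            }
        }
    }
    if (n < 0)
        return -1;

    /* If the file ends in a hole, nothing was written there, so set the
     * length explicitly to the current offset. */
    off_t end = lseek(out, 0, SEEK_CUR);
    if (end == (off_t)-1)
        return -1;
    return ftruncate(out, end);
}

/* Path-based wrapper: recreate src at dst with holes where src had zeros. */
static int restore_sparse(const char *src, const char *dst)
{
    int in = open(src, O_RDONLY);
    if (in < 0)
        return -1;
    int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (out < 0) {
        close(in);
        return -1;
    }
    int rc = copy_sparse(in, out);
    close(in);
    if (close(out) != 0)
        rc = -1;
    return rc;
}
```

A reader of the restored file sees the same bytes either way; only the
reported block count can differ.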


