holes in files

Rahul Dhesi dhesi%cirrusl at oliveb.ATC.olivetti.com
Sun Dec 16 07:57:15 AEST 1990


In <1820 at b15.INGR.COM> rob at b15.INGR.COM (Rob Lemley) writes:

>As stated before, when READING a file (ie: via open/read),
>there is NO WAY to determine if a block of zeros constituted an actual hole
>in the file or a disk block full of zeros.

I will make an even stronger statement than that:  There is no
difference between an actual hole and a disk block full of zeros.
There *is* no difference, even if you can detect a difference.  There
is no difference because both are ways of storing zeros.  An operating
system is perfectly free to store zeros in some blocks as 0xff bytes
and store 0xff bytes as zeros, so long as it correctly translates
during reads and writes.  You and I have no business asking what's on
disk.  All that we dare ask is whether we read back what we wrote.
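
To make the "read back what we wrote" point concrete, here is a minimal
sketch in Python.  The file names, the helper functions and the 4096-byte
BLOCK figure are my own assumptions for illustration.  One file is
written by seeking past a gap (which may or may not become a hole,
depending on the filesystem) and the other by writing explicit zero
bytes; the only question asked is whether the two read back the same.

```python
import os
import tempfile

BLOCK = 4096  # assumed block size, purely illustrative


def write_with_gap(path, gap):
    """Write b'A', seek forward over `gap` bytes, then write b'Z'.

    The skipped range may be stored as a hole; that is up to the system.
    """
    with open(path, "wb") as f:
        f.write(b"A")
        f.seek(gap, os.SEEK_CUR)
        f.write(b"Z")


def write_with_zeros(path, gap):
    """Write b'A', then `gap` explicit zero bytes, then b'Z'."""
    with open(path, "wb") as f:
        f.write(b"A")
        f.write(b"\0" * gap)
        f.write(b"Z")


def read_all(path):
    with open(path, "rb") as f:
        return f.read()


with tempfile.TemporaryDirectory() as d:
    holey = os.path.join(d, "holey")   # hypothetical file names
    dense = os.path.join(d, "dense")
    write_with_gap(holey, 16 * BLOCK)
    write_with_zeros(dense, 16 * BLOCK)
    holey_data = read_all(holey)
    dense_data = read_all(dense)
    same = holey_data == dense_data
    print("identical when read back:", same)
```

Nothing in the comparison depends on how either file is laid out on
disk, which is the point.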

We also have no business asking whether each disk block is really
stored with some overhead such as CRC, preamble, postamble, etc., for
the benefit of the read/write hardware.  We have no business asking
whether the block even exists on disk (it might just be in the buffer
cache and not yet written on disk).

Our concern ought to be with data and how fast we can access it, and
how secure it is; not the raw form it's written in.  If we are picky we
can even ask whether our data fits in the space available on disk, and
this is why we might (vaguely) want to be aware that some data storage
schemes (e.g. holes in files) are more efficient than others (e.g. zero
bytes in files).  But for any specific file, at any specific offset in
the file, we should not be asking such a question.
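
For the "does it fit on disk" question, a whole-file comparison is about
as far as one needs to go.  The sketch below (Python again; the file
names and the 1 MiB SIZE are assumptions of mine) reports the logical
size and the allocated block count for a file extended with truncate()
and for one filled with explicit zeros.  Whether the first actually
takes fewer blocks depends on the filesystem, and st_blocks is not
available on every platform, so the code treats it as optional.

```python
import os
import tempfile

SIZE = 1 << 20  # 1 MiB, an arbitrary illustrative size


def usage(path):
    """Return (logical size, allocated block count or None if unavailable)."""
    st = os.stat(path)
    return st.st_size, getattr(st, "st_blocks", None)


with tempfile.TemporaryDirectory() as d:
    holey = os.path.join(d, "holey")   # hypothetical file names
    dense = os.path.join(d, "dense")

    # Extend an empty file to SIZE; many Unix filesystems store this as a hole.
    with open(holey, "wb") as f:
        f.truncate(SIZE)

    # Write SIZE explicit zero bytes.
    with open(dense, "wb") as f:
        f.write(b"\0" * SIZE)

    holey_size, holey_blocks = usage(holey)
    dense_size, dense_blocks = usage(dense)

print("holey: size", holey_size, "blocks", holey_blocks)
print("dense: size", dense_size, "blocks", dense_blocks)
```

Both files report the same size; only the aggregate block count may
differ, and it says nothing about any particular offset.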

Unless we are writing device drivers, of course.  I don't think we are
in this discussion.
--
Rahul Dhesi <dhesi%cirrusl at oliveb.ATC.olivetti.com>
UUCP:  oliveb!cirrusl!dhesi



More information about the Comp.unix.internals mailing list