How do you find the symbolic links to files.

Bob Goudreau goudreau at larrybud.rtp.dg.com
Sat Dec 8 06:24:41 AEST 1990


In article <6647:Dec619:11:3690 at kramden.acf.nyu.edu>, brnstnd at kramden.acf.nyu.edu (Dan Bernstein) writes:
> 
> > it [st_blocks] tells you nothing about the number and location of
> > holes in the file.
> 
> That's quite correct. In the article you're responding to, I wrote
> ``it can just squish the first N holes it finds, and write explicit
> zeros in the remaining zero-filled blocks.'' One might infer from
> this that there is no way to detect the locations of the holes. So
> what?

First you say that the archiver should perform certain actions on "the
holes it finds", then you admit that "there is no way to detect the
locations of the holes".  So how, pray tell, is it supposed to find
them?  The only portable way is to examine the file data looking for
stretches of nulls; but as I mentioned, this makes your program slower
than it has to be.
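
As a rough illustration of the portable approach, here is a minimal
sketch in C of the zero-detection step.  The function names
zero_run_length and chunk_is_all_zero are hypothetical, made up for
this example; the archiver would call them on each buffer it gets
back from read().

```c
#include <stddef.h>

/* Hypothetical helper: length of the run of zero bytes that starts
 * at buf[0], looking at no more than n bytes. */
size_t zero_run_length(const unsigned char *buf, size_t n)
{
    size_t i = 0;

    while (i < n && buf[i] == 0)
        i++;
    return i;
}

/* Hypothetical helper: nonzero if all n bytes of buf are zero.
 * An empty buffer (n == 0) counts as all zero. */
int chunk_is_all_zero(const unsigned char *buf, size_t n)
{
    return zero_run_length(buf, n) == n;
}
```

Note that a run of zeros found this way may be a hole or may be
explicitly written zero blocks; the file data alone cannot tell the
two apart.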

 
> > A truly portable method must use only standard functions (such as the
> > ones defined in POSIX.1) and must assume nothing at all about block
> > sizes or any other aspects of the file system structure.
> 
> Well, if a POSIX system doesn't have st_blocks, then obviously a
> portable program can't figure out that a file has holes, so there's no
> point to figuring out how many holes there are. But every POSIX-based
> system I've seen does have st_blocks.

Broaden your horizons a little.  A vast number of UNIX systems in the
world are not BSD-based and do not have st_blocks.  Since POSIX.1 also
does not require it, any software that relies on st_blocks' presence
will be seriously limiting its claims of portability.  But even that's
beside the point; the real issue is that st_blocks alone gives you
very little useful information.  Given a file's st_blocks and st_size
counts, you can't say for certain that the file doesn't have any holes
unless you also have some knowledge of the underlying file system
format and its allocation mechanism.  (Remember that st_blocks also
counts things like indirect blocks and any blocks that may be allocated
past the end of the file.)  And even if you could determine the number
and size of any holes in the file, st_blocks doesn't tell you where
they are, so you still have to examine the file data anyway.  Since
st_blocks doesn't win much for us unless accompanied by other
information acquired by non-portable means, we might as well forget
about portability and have the archiver munge through the file system
structures directly (a la dump(1M)).
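
To make the asymmetry concrete, here is a hedged sketch in C of the
only test st_blocks supports.  It ASSUMES that st_blocks counts
512-byte units (true on BSD-derived systems, not guaranteed
elsewhere) and that the file system allocates whole blocks.  The
name blocks_suggest_hole is hypothetical.

```c
/* ASSUMPTION: st_blocks is expressed in 512-byte units. */
#define ASSUMED_BLOCK_UNIT 512LL

/* Hypothetical helper.  Returns 1 if fewer bytes are allocated than
 * the file's size, which on a block-allocating file system means
 * some part of the file has no storage behind it.  Returns 0
 * otherwise, which means "unknown", NOT "no holes": indirect blocks
 * and blocks allocated past end-of-file inflate st_blocks and can
 * hide a hole. */
int blocks_suggest_hole(long long size, long long blocks)
{
    return blocks * ASSUMED_BLOCK_UNIT < size;
}
```

On a system that has the field, one would call it as
blocks_suggest_hole(sb.st_size, sb.st_blocks) after a successful
stat().  Even a result of 1 says nothing about where the holes are.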


> > The obvious
> > way to do this is to have the archiver program read() all <st_size>
> > bytes of the file while keeping an eye out for long stretches of
> > 0-valued bytes so that it can store them in a special space-saving
> > manner in its archive.
> 
> This is only slow on files that do have holes, and then only on long
> stretches of zeros.

Er, yes, that's the point, isn't it?  We're discussing how to make an
archiver that wastes neither time nor tape.


> > Unfortunately, while such
> > an approach is portable, its performance will leave something to be
> > desired on files with truly tremendous holes in them; much time will
> > be wasted on read()ing the holes.
> 
> No, there won't be any read() time wasted. There will be CPU time
> wasted. (Tom points out in another article that vectorization helps
> here.)

Yes, there will be read() time wasted; the archiver must read() the
entire file a chunk at a time and then check each chunk for zeros.
For holes, the read()s shouldn't translate into many actual disk reads
(except for the indirect blocks), but you're still making a lot of
read() calls that would be totally unnecessary if you avoided the holes
entirely.
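
A minimal sketch in C of that loop, counting the read() calls it
makes.  The function name scan_for_zero_chunks and its out
parameters are hypothetical, chosen only for this illustration; the
chunk size is the caller's choice and must be nonzero.

```c
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper.  Reads fd from its current offset to
 * end-of-file, at most chunk bytes per read() call.  On return,
 * *nreads is the number of read() calls that returned data and
 * *nzero is how many of those returned nothing but zero bytes.
 * Returns 0 on success, -1 on allocation or read failure. */
int scan_for_zero_chunks(int fd, size_t chunk, long *nreads, long *nzero)
{
    unsigned char *buf = malloc(chunk);
    ssize_t got;

    if (buf == NULL)
        return -1;
    *nreads = 0;
    *nzero = 0;
    while ((got = read(fd, buf, chunk)) > 0) {
        ssize_t i = 0;

        (*nreads)++;
        /* Every byte must be examined, hole or not. */
        while (i < got && buf[i] == 0)
            i++;
        if (i == got)
            (*nzero)++;
    }
    free(buf);
    return got < 0 ? -1 : 0;
}
```

With an 8 KB chunk, a file containing a 1 GB hole costs 131072
read() calls for the hole alone, each of which returns a buffer of
zeros that then has to be scanned.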

----------------------------------------------------------------------
Bob Goudreau				+1 919 248 6231
Data General Corporation		goudreau at dg-rtp.dg.com
62 Alexander Drive			...!mcnc!rti!xyzzy!goudreau
Research Triangle Park, NC  27709, USA


