How do you find the symbolic links to files.

Dan Bernstein brnstnd at kramden.acf.nyu.edu
Mon Dec 10 12:26:42 AEST 1990


In article <1990Dec7.192441.24778 at dg-rtp.dg.com> goudreau at larrybud.rtp.dg.com (Bob Goudreau) writes:
> In article <6647:Dec619:11:3690 at kramden.acf.nyu.edu>, brnstnd at kramden.acf.nyu.edu (Dan Bernstein) writes:
> > > it [st_blocks] tells you nothing about the number and location of
> > > holes in the file.
> > That's quite correct. In the article you're responding to, I wrote
> > ``it can just squish the first N holes it finds, and write explicit
> > zeros in the remaining zero-filled blocks.'' One might infer from
> > this that there is no way to detect the locations of the holes. So
> > what?
> First you say that the archiver should perform certain actions on "the
> holes it finds", then you admit that "there is no way to detect the
> locations of the holes".  So how, pray tell, is it supposed to find
> them?

Sorry. What I meant was that the archiver can just squish the first N
zero-filled blocks it finds into holes, where N is the number of holes
implied by st_blocks. Then it writes explicit zeros into the remaining
zero-filled blocks.
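
For concreteness, here is a rough sketch of that restore step in C. It is
only an illustration under stated assumptions, not code from any actual
archiver: the names restore_squished, blksz and nholes are made up, the
output is assumed to be a seekable regular file, and ftruncate() is
assumed to be available to fix the length when the last block written
was a hole.

```c
#define _XOPEN_SOURCE 700   /* ask strict compilers to declare ftruncate() */
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Return 1 if the n bytes at buf are all zero, else 0. */
static int all_zero(const char *buf, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (buf[i] != 0)
            return 0;
    return 1;
}

/*
 * Copy `in` to `out`, blksz bytes at a time (hypothetical helper).
 * The first `nholes` full zero-filled blocks are skipped with lseek(),
 * which leaves holes; any later zero-filled blocks are written out as
 * explicit zeros.  Returns the number of holes made, or -1 on error.
 */
long restore_squished(int in, int out, size_t blksz, long nholes)
{
    char *buf;
    ssize_t n;
    off_t end;
    long made = 0;

    assert(blksz > 0);
    buf = malloc(blksz);
    if (buf == NULL)
        return -1;

    while ((n = read(in, buf, blksz)) > 0) {
        if ((size_t)n == blksz && made < nholes && all_zero(buf, blksz)) {
            /* Squish: seek past the block instead of writing it. */
            if (lseek(out, (off_t)n, SEEK_CUR) == (off_t)-1)
                goto fail;
            made++;
        } else {
            ssize_t off = 0;

            while (off < n) {
                ssize_t w = write(out, buf + off, (size_t)(n - off));
                if (w <= 0)
                    goto fail;
                off += w;
            }
        }
    }
    if (n < 0)
        goto fail;

    /* A trailing hole does not extend the file by itself; set the length. */
    end = lseek(out, 0, SEEK_CUR);
    if (end == (off_t)-1 || ftruncate(out, end) < 0)
        goto fail;

    free(buf);
    return made;

fail:
    free(buf);
    return -1;
}
```

Whether a skipped block really ends up unallocated depends on the
filesystem and on blksz matching its block size, so this is a best-effort
sketch.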

This suffices to restore st_blocks. It may not restore the locations of
the holes, but there is no (portable BSD) way to detect those locations,
so this isn't a problem.

Do you understand this? The point is to restore as much information as
possible. st_blocks is perfectly portable within BSD, so a BSD archiver
should make every effort to restore it. On the other hand, there is no
portable way within BSD to locate where the holes are, so the archiver
does not need to restore the holes into their original spots. It just
has to get the right number of them.
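
To make the st_blocks side concrete, here is a small C sketch of how a
program might estimate how much of a file is unallocated. It is an
illustration only: missing_bytes and file_missing_bytes are made-up
names, and it assumes st_blocks counts 512-byte units, as on BSD. Since
st_blocks may also include indirect blocks, the result is an estimate,
not an exact hole count.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/stat.h>

#define ST_BLOCK_UNIT 512LL   /* assumed st_blocks unit (BSD convention) */

/*
 * Given a file's size in bytes and its st_blocks value, return how many
 * bytes of the file appear to have no storage behind them (0 if none).
 */
long long missing_bytes(long long size, long long blocks)
{
    long long allocated;

    assert(size >= 0 && blocks >= 0);
    allocated = blocks * ST_BLOCK_UNIT;
    return allocated < size ? size - allocated : 0;
}

/* Same estimate for a named file; returns -1 if stat() fails. */
long long file_missing_bytes(const char *path)
{
    struct stat st;

    if (stat(path, &st) < 0)
        return -1;
    return missing_bytes((long long)st.st_size, (long long)st.st_blocks);
}
```

Dividing the result by the filesystem block size would give the N that
the archiver has to recreate on restore.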

> The only portable way is to examine the file data looking for
> stretches of nulls; but as I mentioned, this makes your program slower
> than it has to be.

Yes, it makes it slower. It does not make it significantly slower.

> > > A truly portable method must use only standard functions (such as the
> > > ones defined in POSIX.1) and must assume nothing at all about block
> > > sizes or any other aspects of the file system structure.
> > Well, if a POSIX system doesn't have st_blocks, then obviously a
> > portable program can't figure out that a file has holes, so there's no
> > point to figuring out how many holes there are. But every POSIX-based
> > system I've seen does have st_blocks.
> Broaden your horizons a little.  A vast number of UNIX systems in the
> world are not BSD-based and do not have st_blocks.

I'm aware of that. I just haven't seen a POSIX system that doesn't have
a BSD-derived filesystem, where struct stat includes st_blocks. And
(once again---no offense, but I feel like I'm talking to a wall) it is
only important to restore the same number of holes IF there is a way for
a program, portable within the environment in question, to figure out
the number of holes. It is not important to restore the number of holes
if there is no equivalent to st_blocks. (This is what I said starting
with ``Well'' above.) In other words, I am talking about a problem
specific to a certain environment, so why can't I talk about portability
within that environment?

> Since POSIX.1 also
> does not require it, any software that relies on st_blocks' presence
> will be seriously limiting its claims of portability.

Once you use some BSD features, you've already limited your claims of
portability. What's wrong with also taking advantage of st_blocks?

> > This is only slow on files that do have holes, and then only on long
> > stretches of zeros.
> Er, yes, that's the point, isn't it?  We're discussing how to make an
> archiver that wastes neither time nor tape.

Er, yes, but sometimes you have to pay time for space. As I said before,
it would be better to have full information about the locations of
holes, but we have to work within the information provided by current
systems.

> > > Unfortunately, while such
> > > an approach is portable, its performance will leave something to be
> > > desired on files with truly tremendous holes in them; much time will
> > > be wasted on read()ing the holes.
> > No, there won't be any read() time wasted. There will be CPU time
> > wasted. (Tom points out in another article that vectorization helps
> > here.)
> Yes, there will be read() time wasted; the archiver must read() the
> entire file a chunk at a time and then check each chunk for zeros.

It has to read the entire file anyway, if it is going to write() it onto
tape. Where are your extra read()s?
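
A sketch of the loop in question, in C, to show where the cost goes. The
names copy_and_scan and pass_stats are made up for illustration, and the
"tape" is just another file descriptor. Each chunk is read once; the zero
check runs over the buffer already in memory, so it costs CPU time but no
additional read() calls.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

struct pass_stats {
    long reads;        /* read() calls that returned data */
    long zero_blocks;  /* chunks found to be entirely zero */
};

/* Return 1 if the n bytes at buf are all zero, else 0. */
static int all_zero(const char *buf, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        if (buf[i] != 0)
            return 0;
    return 1;
}

/*
 * Copy `in` to `out` in chunks of up to blksz bytes, checking each chunk
 * for zeros along the way.  Returns 0 on success, -1 on error.
 */
int copy_and_scan(int in, int out, size_t blksz, struct pass_stats *ps)
{
    char *buf;
    ssize_t n;

    assert(blksz > 0 && ps != NULL);
    ps->reads = 0;
    ps->zero_blocks = 0;

    buf = malloc(blksz);
    if (buf == NULL)
        return -1;

    while ((n = read(in, buf, blksz)) > 0) {
        ssize_t off = 0;

        ps->reads++;
        /* In-memory scan; a real archiver could record a hole here
           instead of writing the zeros out. */
        if (all_zero(buf, (size_t)n))
            ps->zero_blocks++;

        while (off < n) {
            ssize_t w = write(out, buf + off, (size_t)(n - off));
            if (w <= 0) {
                free(buf);
                return -1;
            }
            off += w;
        }
    }

    free(buf);
    return n < 0 ? -1 : 0;
}
```

The number of read() calls is the same as in a plain copy loop with the
same chunk size; only the all_zero() pass is added.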

---Dan



More information about the Comp.unix.internals mailing list