Bash, tar, and broken pipe

heinz at cc.univie.ac.at
Wed May 22 22:51:43 AEST 1991


In <16345 at helios.TAMU.EDU> byron at archone.tamu.edu (Byron Rakitzis) writes:

>Heinz (heinz at cc.univie.ac.at) sent me some personal mail which I could
>not reply to (is there another address I could use to get mail to you,
>Heinz?). However, he raised an interesting point:

Try one of the following: heinz at sophie.pri.univie.ac.at (<-- preferred)
			hh at eacpc1.tuwien.ac.at
			A4424GAF at AWIUNI11.BITNET
			herbeck at rice.edu

>Given a pipeline

>	foo | tar ft -

>it seems clear that tar must read to EOF in order to determine whether
>the tar file that foo writes has come to an end or not. Therefore a
>normal instance of

>	foo | tar ft -

>should not cause a pipe to break, since tar will always terminate after
>foo. I have no clue why tar is exiting prematurely. If anyone can shed
>light on the matter, I think Heinz and I would appreciate it. (Servus,
>Heinz!)

Yep, I do appreciate it. (Servus, Byron ! :)

I looked up the format of a tar-file (tar(5)), which is as follows:

	A ``tar tape'' or file is a series of blocks.  Each block is
	of  size  TBLOCK.  A  file  on  the tape is represented by a
	header block which describes the file, followed by  zero  or
	more blocks which give the contents of the file.  At the end
	of the tape are two blocks filled with binary zeros,  as  an
	EOF indicator.

	The header block looks like:

		#define TBLOCK 512
		#define NAMSIZ 100
		union hblock {
			char dummy[TBLOCK];
			struct header {
				char name[NAMSIZ];
				char mode[8];
				char uid[8];
				char gid[8];
				char size[12];
				char mtime[12];
				char chksum[8];
				char linkflag;
				char linkname[NAMSIZ];
			} dbuf;
		};
(quoted from the man-page)
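
To make the layout a bit more concrete, here is a minimal shell sketch (not
taken from tar itself) that prints the name field of the first header block
and checks for the two trailing zero blocks. It assumes the layout quoted
above (name at offset 0, NAMSIZ = 100, TBLOCK = 512); the function names and
the file argument are made up for illustration.

```shell
#!/bin/sh
# Sketch only: function names and arguments are hypothetical.

TBLOCK=512    # block size from tar(5)
NAMSIZ=100    # length of the name field at the start of a header block

# Print the name stored in the first header block of tar file $1.
# The field is NUL-padded, so the padding is stripped with tr.
first_member_name() {
    dd if="$1" bs=1 count="$NAMSIZ" 2>/dev/null | tr -d '\000'
    echo
}

# Succeed if tar file $1 ends with at least two blocks of binary zeros.
ends_with_two_zero_blocks() {
    trailer=$((2 * TBLOCK))
    size=$(wc -c < "$1" | tr -d ' ')
    [ "$size" -ge "$trailer" ] || return 1
    # Delete every NUL byte from the trailer and count what is left.
    nonzero=$(tail -c "$trailer" "$1" | tr -d '\000' | wc -c | tr -d ' ')
    [ "$nonzero" -eq 0 ]
}
```

Given some archive, `first_member_name archive.tar` should print the first
file name, and `ends_with_two_zero_blocks archive.tar && echo ok` should
succeed for a complete archive. An archive that is padded with more zeros
after the two EOF blocks passes the check as well.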

This proves what was intuitively clear: there's no 'directory' contained in
a tar-file (how would you efficiently maintain a directory on a physical
tape ? :)
So tar has to scan the entire output of the first process in the pipe, and
should therefore terminate only after that process does.

This does not explain the broken pipe, though. I tried the following:

	cat <some_long_file> | more

and killed 'more' by pressing 'q' at the first prompt (so more terminates
first). No 'Broken Pipe'.

Then I tried:

	echo Hallo | (sleep 10; more)	# first process terminates first, since
					# 'Hallo' should fit into the pipe's buffer

No 'Broken Pipe' either.
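
For anyone who wants to repeat this without an interactive pager, here is a
rough sketch of the two situations. A writer can only run into a broken pipe
if it writes after the reader has gone away, so the first function uses a
writer that never stops ('yes') and the second one a writer that is finished
long before the reader starts. The function names are made up, and 'head' and
'cat' stand in for 'more'.

```shell
#!/bin/sh
# Sketch only: the function names are made up for this example.

# Writer keeps writing after the reader has quit. 'yes' produces endless
# output, 'head' exits after one line, so a later write hits a pipe with
# no reader. The writer's exit status is passed out on descriptor 3,
# because the pipeline's own status is that of its last command.
status_of_writer_that_outlives_reader() {
    { { yes; echo "$?" >&3; } 2>/dev/null | head -n 1 >/dev/null; } 3>&1
}

# Writer finishes before the reader even starts reading (as in the
# 'echo Hallo | (sleep 10; more)' experiment, with a shorter sleep).
# The single short line fits in the pipe buffer, so the write succeeds.
status_of_writer_that_finishes_first() {
    { { echo Hallo; echo "$?" >&3; } | { sleep 1; cat >/dev/null; }; } 3>&1
}

echo "writer outlives reader:  $(status_of_writer_that_outlives_reader)"
echo "writer finishes first:   $(status_of_writer_that_finishes_first)"
```

The second status should be 0. The first should be non-zero; when the writer
is killed by SIGPIPE the shell reports 128 plus the signal number, which is
commonly 141, but that depends on the system and on whether SIGPIPE is being
ignored.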

So is this problem specific to tar? I have not encountered it anywhere
else yet. Maybe I should take the time to dig into the source code of bash,
but I'm not sure it's worth the effort.

If anyone has a clue, please let me know. It doesn't really bother me
if a pipe breaks (unless it happens in my bathroom :), but it is something
that shouldn't happen, and I wonder why it does.

Greetings,
HH
--
--------------------------------------------------------------------------------
---/     Heinz M. Herbeck                    /    Trust me, I know    /       /-
--/     heinz at sophie.pri.univie.ac.at       /    what I'm doing !    /       /--
-/     Vienna University, Austria          /    (Sledge Hammer)     /       /---
--------------------------------------------------------------------------------


