Shared Memory --- Parallel filters and piping -- Examples Needed

Tue Mar 1 03:51:00 AEST 1988

   From: Sven-Ove Westberg <sow at cad.luth.se>

   In article <11876 at brl-adm.ARPA> rbj at icst-cmr.arpa (Root Boy Jim) writes:
   |
   |   From: "John S. Robinson" <jsrobin at eneevax.uucp>
   |
   |			      filt3
   |			     /	    \
   |			    /	     \
   |   filt1 < <stream> |filt2 /__filt4___+ filt5 | filt6 | ... filtn > <sink>
   |			   \	     /
   |			    \	    /
   |			     \filt5/
   |
   |My diagram will look something like this:
   |
   |   f1 < <stream> | f2 | parallel 'f3' 'f4 -opts' 'f5a' | f5b ...
   |

   Why not use named pipes? 

Because they don't exist! At least not in *my* UNIX. Besides, as I wrote:

jsr|   How does one handle the case where some of the above filters are to be
jsr|   applied in parallel and then be recombined:
jsr|
rbj|It depends on what you mean by `recombined'. Do you want the output of the
rbj|parallel filters in order, or do you want them to run asynchronously,
rbj|mixing their output? BTW, you have two `filt5's in your diagram.

I think the `@' program previously posted and the idea of using named
pipes are rather elegant. I have often wanted to split a pipe stream
into two, but never wanted to recombine them, although I suppose one use
might be simple collection of output into one file.

Other people have also mentioned using awk to write to two pipes as well,
along with caveats that it might be somewhat buggy in some environments.
I have also heard that the Korn shell can do something like this.
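
For reference, the awk approach might look roughly like the sketch below.
The function name split_pipe and the variables CMD_A and CMD_B are made up
for illustration, and the sketch assumes a POSIX awk with ENVIRON (older
awks may lack it). It is only suitable for line-oriented text, not for a
binary stream such as dump output.

```sh
# split_pipe CMD1 CMD2: copy every input line to two shell commands.
# Hypothetical helper; the names are illustrative, not from the thread.
split_pipe() {
    CMD_A=$1 CMD_B=$2 awk '
        BEGIN { a = ENVIRON["CMD_A"]; b = ENVIRON["CMD_B"] }
        { print | a; print | b }      # one pipe per command, kept open
        END { close(a); close(b) }    # close() waits for each command
    '
}
```

Something like `f2 | split_pipe 'f3 > out3' 'f4 -opts > out4'` would then
feed both filters. If both commands write to the same terminal or file,
the order of their output is not defined.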

The key word here is `recombined'. If you want each line to be processed
by one and only one randomly chosen filter, then previously suggested
programs using named pipes seem the way to go. However, this seems
quite useless, and I wonder what is the real problem we are trying to solve.

If, on the other hand, you want each line to be processed once by each
filter, and then recombined, you need a program to clone the input and
distribute it to each filter. How the output is combined is another
question not specified very well in the original problem either. Do
we want the output randomly assembled, or do we want all the lines
from the first filter before any of the others? If we want the former,
then it is arguable that the output is meaningless; if we want the
latter, we sacrifice most of the parallelism except for short programs.
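
A minimal sketch of the "clone, then recombine in order" variant follows.
It is not the `parallel' or `@' program discussed in this thread; the
name parallel_filters is hypothetical, and it assumes a mktemp that
accepts -d. It needs no named pipes, but it buffers the whole input and
all of the output in temporary files, so nothing comes out until every
filter has finished.

```sh
# parallel_filters 'cmd1' 'cmd2' ...: run each command on a full copy
# of stdin, in parallel, then emit the outputs in argument order.
# Hypothetical helper, for illustration only.
parallel_filters() {
    dir=$(mktemp -d) || return 1
    cat > "$dir/in"                     # clone the input once
    n=0
    for f in "$@"; do
        n=$((n + 1))
        sh -c "$f" < "$dir/in" > "$dir/out.$n" &
    done
    wait                                # all filters run concurrently
    i=1
    while [ "$i" -le "$n" ]; do         # recombine: filter 1 first
        cat "$dir/out.$i"
        i=$((i + 1))
    done
    rm -rf "$dir"
}
```

With this, the pipeline above could be written as
`f1 < stream | f2 | parallel_filters 'f3' 'f4 -opts' 'f5a' | f5b`.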

To illustrate where pipe splitting just might be useful, consider
the following example. I want a list of what gets dumped to tape
when I do backups, but I don't want to read the tape twice. What I
have to do is:

	dump 0u /
	restore tv >& DUMPLIST

What I would like to do is:

	dump 0uf - | tee /dev/rmt8 | restore tvf - >& DUMPLIST

Hey, this almost works! Unfortunately, the magtape is unblocked,
so I would need to do something like:

	dump 0uf - | tee "| dd bs=20b > /dev/rmt8" | restore ...

Hmmm, maybe I will try the awk trick.
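
On a system that does have named pipes, the `tee "| command"' idea above
could be approximated as in the sketch below. The name tee_to_cmd is
hypothetical, and mkfifo and mktemp -d are assumed to be available, which
is not the case on every UNIX.

```sh
# tee_to_cmd 'cmd': pass stdin through to stdout unchanged while also
# feeding a copy to cmd through a named pipe.
# Hypothetical helper, for illustration only.
tee_to_cmd() {
    dir=$(mktemp -d) || return 1
    mkfifo "$dir/fifo" || return 1
    sh -c "$1" < "$dir/fifo" &          # reader side of the named pipe
    tee "$dir/fifo"                     # writer side; also copies to stdout
    wait                                # let the side command finish
    rm -rf "$dir"
}
```

The backup case might then read
`dump 0uf - | tee_to_cmd 'dd obs=20b of=/dev/rmt8' | restore tvf -`.
This is untested against a real tape drive; obs= is used here rather
than bs= on the assumption that dd should gather short reads from the
pipe into full 20-block writes.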

   Sven-Ove Westberg, CAD, University of Lulea, S-951 87 Lulea, Sweden.
   Tel:     +46-920-91677  (work)                 +46-920-48390  (home)
   UUCP:    {uunet,mcvax}!enea!cad.luth.se!sow
   Internet: sow at cad.luth.se

	(Root Boy) Jim Cottrell	<rbj at icst-cmr.arpa>
	National Bureau of Standards
	Flamer's Hotline: (301) 975-5688
	YOW!!!  I am having fun!!!