Unnecessary tar-compress-uuencodes

Bengt Larsson bengtl at maths.lth.se
Sat Jul 14 09:46:26 AEST 1990


In article <3119.269d97ea at mccall.com> tp at mccall.com writes:

>I agree that the VMS_SHARE format is quite good. However, the VMS_SHARE
>format is self unpacking on VMS, just like a unix shar is on unix, with no
>tools that are not part of the OS. The VMS_SHARE is a DCL command procedure
>that contains a TPU program (TPU is the VMS programmable text editor) to
>unpack the files. 

Yes, that's the big advantage: having TPU around as a standard (although
there were some problems with the "standard" going from VMS 4.x to VMS 5.x!).

I must confess that I'm not much of a C-programmer, so I really can't say
if the "unpacker" could be written portably. It just seemed to me that
except for sed, awk, sh etc. the most portable thing on Unix would be a 
small C program. But, as I said, I'm no expert on portable C.

>The problem with a C program, is that it is VERY hard to write a portable
>program with no #ifdef's that will do the job. If you go this route, write
>it strictly as a filter, and invoke it just like you do sed in current
>shar's, with the <<'EOF' input specifier and the > redirection for output.
>And for heaven's sake, ONLY WRITE ONE OF THEM, and use only one name for
>it! Whatever you write, I have to recognize explicitly (I maintain a VMS
>unshar program, and I'm not interested in making it into a full Bourne
>shell.)

Hmmm, I can imagine the problems with writing an "unshar" for VMS
(I'm more familiar with VMS than Unix).

As you say, the most important thing is that whatever it is, it is a
_standard_. A rigidly standardized version of "shar" would do as well.

And maybe it would be useful to use the <<'EOF' method for feeding parts
to the unpacker. Not so pretty, though :-)

>Perhaps this is overkill? Wouldn't it be possible to escape the most
>troublesome characters in such a way that you could still use sed to unpack
>it? Anyone currently unpacking unix shar's has already emulated sed to some
>degree, adding a few more substitute commands couldn't be hard. I don't
>advocate using AWK, while there is a VMS version, it is large and not widely
>installed. I suspect MSDOS or Amiga sites would have similar problems.

Maybe it is overkill. But what about folded long lines? Can those be unpacked
with sed? Substituting the most important characters (for example tab) would
be doable in sed, I think.
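
To make the escaping and folding idea concrete, here is a rough sketch in
Python. It is only an illustration: the escape syntax (a backslash followed by
two hex digits), the trailing backslash as a "line continues" marker, the
72-column limit and all function names are assumptions made for this example,
not the actual VMS_SHARE encoding. It also assumes 8-bit characters.

```python
ESC = "\\"       # assumed escape character
MAX_WIDTH = 72   # assumed maximum line length in the archive (must be >= 4)


def escape(text):
    """Replace the escape character and anything outside printable ASCII by ESC + 2 hex digits."""
    out = []
    for ch in text:
        code = ord(ch)
        if ch == ESC or code < 32 or code > 126:
            out.append("%s%02x" % (ESC, code))
        else:
            out.append(ch)
    return "".join(out)


def encode_line(line, width=MAX_WIDTH):
    """Escape one line (without its newline) and fold it into pieces of at most `width` characters."""
    s = escape(line)
    pieces = []
    while len(s) > width:
        cut = width - 1  # leave room for the continuation marker
        # Never cut in the middle of a 3-character escape sequence.
        while ESC in s[cut - 2:cut]:
            cut -= 1
        pieces.append(s[:cut] + ESC)
        s = s[cut:]
    pieces.append(s)
    return pieces


def decode_lines(pieces):
    """Rejoin folded pieces and undo the escapes, giving back the original line."""
    s = "".join(p[:-1] if p.endswith(ESC) else p for p in pieces)
    out = []
    i = 0
    while i < len(s):
        if s[i] == ESC:
            out.append(chr(int(s[i + 1:i + 3], 16)))
            i += 3
        else:
            out.append(s[i])
            i += 1
    return "".join(out)
```

Because a literal backslash is itself escaped, a backslash at the very end of
an archive line can only be a fold marker, so the decoder needs no lookahead.
The unescaping half is a plain character substitution; the rejoining of folded
lines is the part that goes beyond single-line substitute commands.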

>Final note about the other proposed format, DON'T mung spaces into tabs.

Agreed. Tabs should be preserved.

Anyway, I hope my proposed format provided some food for thought, especially
for Unix people. It would be much more portable to different systems than
the current versions of "shar". VMS_SHARE is certainly something to be
inspired by.


Summary of features in VMS_SHARE not present in "shar"s (at least not all 
of them):

  1. Escaping of all characters which are a) not printable ASCII, or b)
     likely to be munged by Bitnet.
  
  2. Folding of long lines. 
  
  3. Automatic skipping of News headers and such.
  
  4. Checksums as standard (This uses the verb CHECKSUM which comes with
     VMS).
     
  5. Archived files are routinely split between archive parts, to keep each
     part a standard size. This of course also handles files bigger than
     any of the archive parts.
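
As an illustration of point 5, splitting already-encoded archive lines into
parts of bounded size can be sketched as below. This is not how VMS_SHARE
itself does it; the function name and the greedy strategy are assumptions made
for the example. Since the split happens on encoded lines rather than on
files, a file larger than one part simply continues in the next part.

```python
def split_into_parts(lines, max_bytes):
    """Greedily pack lines into parts whose size (one newline per line included) stays within max_bytes."""
    parts = []
    current = []
    size = 0
    for line in lines:
        cost = len(line) + 1  # +1 for the newline
        if current and size + cost > max_bytes:
            parts.append(current)
            current = []
            size = 0
        current.append(line)
        size += cost
    if current:
        parts.append(current)
    return parts
```

A single line longer than max_bytes still ends up in a part of its own, so
this sketch relies on long lines having been folded beforehand.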


Features of my proposed format (advantages relative to "shar"):

  1. Much more portable to different architectures (not just Unix).
  
  2. It's easy to find the file names, since they are on lines starting
     with "file".
  
  3. A standard checksum built in (maybe not a CRC, but something
     more powerful than a character count).
  
  4. Much more protection against character munging through character
     escapes.
  
  5. Automatic skipping of News headers and such when unpacking.
  
  6. Handles splitting of files between archive parts routinely. Handles
     archiving of files bigger than any archive part.
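
A minimal sketch of what an unpacker for such a format might look like,
written in Python for brevity. The concrete syntax is invented for this
example and is not specified in this post: the "begin-archive" and
"end-archive" marker lines, the "file NAME CHECKSUM" layout, the "X" prefix on
content lines, and the 16-bit rotate-and-add checksum are all assumptions.
Escaping and folding are left out to keep it short.

```python
def checksum(data):
    """16-bit rotate-and-add sum over bytes (detects reordering, unlike a character count)."""
    total = 0
    for byte in data:
        total = (total >> 1) | ((total & 1) << 15)  # rotate right by one bit
        total = (total + byte) & 0xFFFF
    return total


def unpack(archive_text):
    """Return {file name: contents}; raise ValueError on a checksum mismatch."""
    files = {}
    expected = {}
    name = None
    inside = False
    for line in archive_text.splitlines():
        if not inside:
            # Skip News/mail headers and anything else before the archive.
            inside = (line == "begin-archive")
            continue
        if line == "end-archive":
            break  # ignore signatures and trailers after the archive
        if line.startswith("file "):
            # Hypothetical layout: file NAME HEXCHECKSUM (no spaces in NAME).
            _, name, hexsum = line.split()
            files[name] = []
            expected[name] = int(hexsum, 16)
        elif line.startswith("X") and name is not None:
            files[name].append(line[1:])
    result = {}
    for fname, lines in files.items():
        body = "".join(l + "\n" for l in lines)
        if checksum(body.encode("ascii")) != expected[fname]:
            raise ValueError("checksum mismatch in " + fname)
        result[fname] = body
    return result
```

Everything outside the two marker lines is ignored, which is what makes the
automatic skipping of headers and signatures work, and the file names can be
listed without unpacking by looking only at the "file" lines.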


Disadvantages relative to "shar":

  1. Slightly less readable, especially if many characters are escaped.
  
  2. You must have an "unpacker" compiled (it may be distributed with
     the archive).

I'm sorry that I'm not much of a C programmer: I will not be implementing 
this myself.

Bengt Larsson.
-- 
Bengt Larsson - Dep. of Math. Statistics, Lund University, Sweden
Internet: bengtl at maths.lth.se             SUNET:    TYCHE::BENGT_L



More information about the Alt.sources.d mailing list