Unnecessary tar-compress-uuencodes

Bengt Larsson bengtl at maths.lth.se
Fri Jul 13 12:22:24 AEST 1990


In article <3124 at psueea.UUCP> kirkenda at eecs.UUCP (Steve Kirkendall) writes:

(a list of valid points concerning a "shar" program)

>Did I miss anything?  Did I get anything wrong?  Does anybody know of an
>existing format that comes close to these specs?

Hmmm, one way to do it would be to write a little "unpacker" program (in C),
and distribute it with the archive (in plain text).

Suggested format for archive: (borrowed heavily from VMS_SHAR, the "shar"
program for VMS)

(unpacker program (optional) in plain text here. Let's call it "unpacker.c"
 For those who don't have it, extract it from here, compile it with 
 "cc -o unpacker unpacker.c" and start unpacking)

-- start part 1 --
file packer.txt 744 23642334
X The filename is on a line started by "file", followed by one space,
X followed by the filename, a space, a (Unix) protection code (octal, 
X like for "uuencode") and a checksum. The filename must not contain a space.
X 
X The archived file is mostly normal text. Control characters are escaped 
X with a backtick followed by three characters with the decimal (octal?) 
X value of the escaped character. Like `009 for a tab. The backtick 
X is itself escaped like `096.
X 
X Long lines are folded. Normal lines start with an "X". Continuation
V lines (like this one) start with a "V" (that is, newlines are to be skipped
V before "V").
X
X Since all lines start with a special character, it is possible to
X archive archives (the archived file ends with a line not starting with
X "X" or "V").
X
X Trailing blanks are escaped, just like control characters. 
X Trailing blanks which result from splitting a long line are also
X escaped. When run through the unpacker, all trailing spaces are 
X stripped first (trailing blanks may have been added somewhere).
X
X This is a line with some trailing spaces...        `032
-- end part 1 --
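
To make the escaping rule from part 1 concrete, here is a rough sketch of
a line decoder (in Python for brevity, although the real unpacker would be
in C). It assumes the three digits are decimal, which is what the examples
`009, `096 and `032 suggest, and that the "X" or "V" prefix is exactly one
character and has already been removed. The name decode_body is made up
for this example.

```python
import re

# A backtick followed by exactly three digits, read here as a decimal code.
ESCAPE = re.compile(r"`([0-9]{3})")

def decode_body(text):
    """Decode the text of one archive line (prefix letter already removed)."""
    # Blanks appended in transit are dropped first; genuine trailing blanks
    # arrive escaped (for example as `032) and so survive this step.
    text = text.rstrip(" \t")
    # Replace every escape with the character it names.
    return ESCAPE.sub(lambda m: chr(int(m.group(1))), text)
```

Stripping before decoding is what lets an escaped trailing blank survive
while blanks added by a mailer disappear.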

Anything may come here (News headers, for example).

We start the next part with a line which starts with "-- start part 2".
Note that the headers etc. may be in the middle of a file. All parts
in the archive have the same length. Archived files are split 
routinely between parts.

All the unpacker has to do is to look for a line starting with 
"-- end part xx" and then skip to a line beginning with "-- start part xx+1".
The unpacker may (should?) check that the "xx" numbers are correct and
in sequence.
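
The skipping between parts could be sketched roughly as below (Python for
brevity). The function name archive_lines is invented for this example,
and it assumes that part numbers start at 1 and that an "end archive" line
found outside a part ends the input. Everything else outside a part is
treated as headers and dropped.

```python
import re

START = re.compile(r"-- start part ([0-9]+)")
END = re.compile(r"-- end part ([0-9]+)")

def archive_lines(lines):
    """Yield only the lines inside parts, checking that parts are in order."""
    expected = 1       # number of the next part we are willing to accept
    inside = False     # True between a start marker and its end marker
    for raw in lines:
        line = raw.rstrip("\n")
        if inside:
            m = END.match(line)
            if m is None:
                yield line                 # ordinary archive content
                continue
            if int(m.group(1)) != expected:
                raise ValueError(f"part {expected} closed as part {m.group(1)}")
            inside = False
            expected += 1
        elif line == "end archive":
            return                         # terminator: ignore what follows
        else:
            m = START.match(line)
            if m is None:
                continue                   # mail or news headers: skip
            if int(m.group(1)) != expected:
                raise ValueError(f"expected part {expected}, found part {m.group(1)}")
            inside = True
```

Because archive content always starts with "X", "V", "file" or
"directory", a content line can never be mistaken for a part marker.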

-- start part 2 --
X 
X Now we can say something about directories. Let's start a new file
X "subdir.txt" in a subdirectory "doc".
directory doc 744
file doc/subdir.txt 744 2353453
X 
X Now we are in the subdirectory. A directory is created by a line
X started by "directory". The subdirectory may already exist (that is no
X error). Anyway, the protection code is specified like for files.
X 
X When files in a subdirectory are specified, directory parts are separated
X by "/" (like in Unix). This should make it possible to write unpackers
X for other environments (for example VMS).
X
X Let's say that the archive should be terminated with a line
X "end archive".
X
-- end part 2 --
end archive

The unpacking program could be run like:

  % unpacker prog.pck.01 prog.pck.02 prog.pck.03 ...
  
  or (Unix)
  
  % cat prog.pck.?? | unpacker

What do you think? The idea was that the "packer" program may be somewhat
complex, but the "unpacker" should be small (could be distributed with
the archive in plain text). The "packer" could accept lots of options
(for example, which characters to escape, the maximum line length, the 
maximum part size, maybe maximum length for filenames etc.). Reasonable
defaults should be provided. 
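
As a rough idea of what the packer does to a single line of text, here is
a sketch in Python (encode_line and its width parameter are invented for
the example, as is the default of 72). It escapes the backtick and ASCII
control characters as three decimal digits, folds long lines into one "X"
line followed by "V" lines, and escapes a blank that ends up last on any
output line. Which characters to escape would really be an option.

```python
def encode_line(text, width=72):
    """Return the archive lines ("X..." then "V..." pieces) for one text line."""
    # Escape the backtick and ASCII control characters as `ddd (decimal).
    pieces = []
    for ch in text:
        if ch == "`" or ord(ch) < 32 or ord(ch) == 127:
            pieces.append("`%03d" % ord(ch))
        else:
            pieces.append(ch)
    # Fold without splitting an escape. One column goes to the prefix and
    # three are held back in case a final blank has to become `032.
    limit = width - 4
    chunks, chunk = [], ""
    for piece in pieces:
        if chunk and len(chunk) + len(piece) > limit:
            chunks.append(chunk)
            chunk = ""
        chunk += piece
    chunks.append(chunk)
    # Protect a blank that ends an output line, then add the prefixes.
    result = []
    for i, chunk in enumerate(chunks):
        if chunk.endswith(" "):
            chunk = chunk[:-1] + "`032"
        result.append(("X" if i == 0 else "V") + chunk)
    return result
```

For example, encode_line("a\tb") gives the single line "Xa`009b", and an
empty input line becomes a bare "X".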

I think the "packer" should default to the "safest" format (escaping
tabs and special characters for Bitnet). If the escaping mechanism
is turned off, this is just a file splitter/extractor (may be used
to split uuencoded GIF files, for example :-)

Bengt Larsson.
-- 
Bengt Larsson - Dep. of Math. Statistics, Lund University, Sweden
Internet: bengtl at maths.lth.se             SUNET:    TYCHE::BENGT_L


