Unnecessary tar-compress-uuencodes

Lars Henrik Mathiesen thorinn at skinfaxe.diku.dk
Fri Jul 13 08:05:53 AEST 1990


dave at galaxia.Newport.RI.US (David H. Brierley) writes:
>In article <1990Jul10.182546.26487 at diku.dk> thorinn at skinfaxe.diku.dk (Lars Henrik Mathiesen) writes:
>>	name	       size		crummy ASCII graphics
>>	----------  -------		---------------------
>>	tar	    4718592	tar	 ------- -60.3% ------>	tar.Z
>>	tar.Z	    1874378	+37.8%				 +37.8%
>>	tar.Z.uu.Z  2229065	tar.uu.Z -------  -6.8% ------>	tar.Z.uu.Z

>1) The compressed-uuencoded-compressed file is almost 20% larger than the
>compressed file, therefore you have *increased* my phone bills by 20%.  I
>do not exactly appreciate this.

1) As I wrote, IF you have to post uuencoded material, it should
probably be compressed first. I also wrote that I agree with all the
other reasons the original poster gave to AVOID posting uuencoded
stuff.

I'm not advocating that people waste your bandwidth by uuencoding
stuff; I'm trying to prevent a mistaken argument from making people
always post uuencoded stuff uncompressed --- because that often uses
even more bandwidth, and almost always uses much more disk space.
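
The percentages under discussion can be rechecked from the three sizes in
the quoted table. The following is only a small sketch of that arithmetic;
the names are mine, and the 45-to-62 figure assumes the usual uuencode line
layout (one length character, 60 data characters and a newline for every 45
input bytes).

```python
# Sizes in bytes, copied from the quoted table.
TAR = 4718592
TAR_Z = 1874378
TAR_Z_UU_Z = 2229065


def pct_change(old, new):
    """Relative size change from old to new, in percent."""
    return 100.0 * (new - old) / old


# Assumed uuencode line layout: 45 input bytes become 62 output characters.
UU_OVERHEAD = pct_change(45, 62)

print(f"tar   -> tar.Z      : {pct_change(TAR, TAR_Z):+.1f}%")
print(f"tar.Z -> tar.Z.uu.Z : {pct_change(TAR_Z, TAR_Z_UU_Z):+.1f}%")
print(f"uuencode overhead   : {UU_OVERHEAD:+.1f}%")
```

The first and last lines reproduce the table's -60.3% and +37.8%; the middle
one comes out at about +18.9%, which is the "almost 20%" in the complaint.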

Compressing before uuencoding often saves 60% on disk and 5-10% on the
wire --- but sometimes it will only save ~5% on disk and _waste_ ~20%
on a compressed link (some Sun run-length-encoded rasterfiles behave
that way). The poster should try to find out how each of his files
behaves, and pack each of them in the cheapest way; as ``bandwidth on
compressed links'' seems to be the most popular cost metric, cheapest
probably means ``smallest after compression''. And then make a shar
archive of the packed files, so people can decide which they want to
unpack.
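
The "find out how each file behaves" step could be sketched roughly as
below. This is an illustration under stated assumptions, not a tool anyone
in this thread used: zlib stands in for compress(1) because the Python
standard library has no LZW compress, so the actual percentages will differ
from the ones quoted here, and the function names are made up for the
example. Only the uuencode body lines are produced (no begin/end lines).

```python
import binascii
import zlib


def uuencode(data: bytes) -> bytes:
    """uuencode body lines only, 45 input bytes per output line."""
    lines = [binascii.b2a_uu(data[i:i + 45]) for i in range(0, len(data), 45)]
    return b"".join(lines)


def link_compress(data: bytes) -> bytes:
    """ASSUMPTION: zlib as a stand-in for compress(1) on a compressed link."""
    return zlib.compress(data)


def cheapest_packing(data: bytes):
    """Return (name of the packing smallest on the wire, all wire sizes)."""
    candidates = {
        "uuencode": uuencode(data),
        "compress-uuencode": uuencode(link_compress(data)),
    }
    # "Wire size" is what is left after the link compresses the posted text.
    wire = {name: len(link_compress(body)) for name, body in candidates.items()}
    return min(wire, key=wire.get), wire


# Hypothetical sample input; substitute the bytes of a real file.
sample = b"int main(void) { return 0; }\n" * 2000
best, wire = cheapest_packing(sample)
print(best, wire)
```

Running this once per file gives the per-file choice described above; the
posted sizes on disk can be compared the same way by taking len() of the
candidates before the final link_compress.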

Another problem with this: The result of compressing a single file may
be very misleading when we really want to know how much larger it
makes a compressed batch of news articles. Compress is a very stateful
algorithm (its string table is built from whatever came earlier in the
input), and in a given batch it may not be able to compress a uuencoded
file nearly as much as when taken alone. So even the worst
rasterfile example may not affect the size of a batch as much as the
numbers lead one to believe. (Normally, compress gets ~13% after any
uuencode; in these examples, it gets ~30% after uuencode, but only the
usual ~13% after compress-uuencode. In the middle of a batch, the
difference might shrink a lot --- possibly to the point where
compress-uuencode wins again because it starts out 5% smaller.)


2) I hope you realize that a tar archive has binary file headers and
cannot be posted without some sort of encoding, so your 20% are not
immediately applicable. However, anybody who uuencodes something which
would have got through news as well without encoding deserves your
scorn and anger (and in my opinion, this includes anybody who posts a
tar archive consisting of ASCII files).

And I don't understand why ASCII/EBCDIC problems should be an excuse
for uuencode, either. The format uses the ASCII characters '!', '['
and ']', which are among those I've most often seen altered in
ASCII->EBCDIC->ASCII translations. If a uuencoded file gets through
unscathed, odds are that any printable ASCII file would. But maybe
somebody wrote a uudecode which takes input in EBCDIC and outputs in
ASCII?
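
To make the character-set point concrete, here is a small sketch that
builds the classic uuencode alphabet and checks the three characters named
above against it. It assumes the traditional mapping of each 6-bit value v
to chr(32 + v); the FRAGILE list is just the three characters from this
post, not a complete survey of what EBCDIC gateways mangle.

```python
# Classic uuencode maps each 6-bit value v (0..63) to chr(32 + v);
# many encoders emit '`' instead of the space for v == 0.
UU_ALPHABET = "".join(chr(32 + v) for v in range(64))

# Illustrative only: the characters this post names as often altered
# in ASCII->EBCDIC->ASCII translation.
FRAGILE = "![]"

at_risk = [c for c in FRAGILE if c in UU_ALPHABET]
print(at_risk)
```

All three characters fall inside the range from space to '_', so a
uuencoded file is exposed to exactly the translations it is supposed to
protect against.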

--
Lars Mathiesen, DIKU, U of Copenhagen, Denmark      [uunet!]mcsun!diku!thorinn
Institute of Datalogy -- we're scientists, not engineers.      thorinn at diku.dk


