Read this if you're having trouble unpacking Tcl

Tom Neff tneff at bfmny0.BFM.COM
Fri Dec 28 13:56:56 AEST 1990


In article <1990Dec28.075123.6114 at zorch.SF-Bay.ORG> xanthian at zorch.SF-Bay.ORG (Kent Paul Dolan) writes:
>Uuencoded files have nice, regular, short lines, free of control
>characters, that transit gateways and news software well. I don't want
>to tell someone with a 132 character wide screen who's trying to decide
>whether it's worth the pain and torment to publish their code for the
>benefit of the net that s/he can only write code in the left 3/5ths or
>so of the screen because the USENet news software is braindead.
>
>Allowing programmers to transport the code in a manner that will survive
>the real world net without a prior hand reformat is a must.


This seems to be where we disagree.  I claim that *starting out* with a
portable source format is far more in the interest of the net than
imposing magic preservation techniques designed to leave non-portable
formats 'intact' for future headscratching.

I don't deny for a minute that uuencoding is the only safe way to pass
someone's 132-wide ^M-delimited C code through netnews.  What I am
saying is that such stuff SHOULDN'T BE PASSED!  It's not generally
useful.  The more disparate the net gets, the LESS useful
platform-specific source formats become.

I also think the burden of portability should be on the shoulders of the
author.  It takes ONE session with a reformatter to render a program
net-portable; it takes THOUSANDS of cumulative sessions, successful or
otherwise, at thousands of user desks worldwide if we make the readers
do the reformatting.  It also promotes variant versions.

>Moreover, uuencoded files of the more modern kind do a line by line
>validity check, much more robust than shar's character count.  I've
>unpacked many damaged source files from the net that had correct
>character counts, but damaged bytes in the files.  This leads to
>subtle and time-consuming debugging, since you can easily get errors
>that don't cause compiler errors by trashing just a byte or two,
>especially if you get lucky and hit an operator and convert it to
>a different operator.

This is true, but again, a validation failure on a compress+uuencode
posting hits everyone like a 16 ton weight!  Nothing useful can be
salvaged until the author reposts all or most of the archive.
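
For the curious, the line-by-line check being referred to works roughly
like this: the first character of every uuencoded line declares how many
raw bytes the line carries, so a decoder knows exactly how long the line
ought to be before it touches the data.  (Newer encoders append a
per-line checksum character on top of that, which I leave out here, and
real decoders are a bit more lenient about stripped trailing blanks.)  A
minimal sketch:

#include <stdio.h>
#include <string.h>

#define DEC(c) (((c) - ' ') & 077)      /* classic uuencode 6-bit value */

/* return 0 if the line's length matches its declared byte count */
int uuline_ok(const char *line)
{
    size_t n = DEC(line[0]);            /* raw bytes this line claims */
    size_t expect = 1 + 4 * ((n + 2) / 3);  /* count + 4 chars per 3 bytes */
    size_t have = strcspn(line, "\r\n");

    return have == expect ? 0 : -1;
}

int main(void)
{
    /* 'M' declares 45 raw bytes, so the line should be 61 characters;
     * this one has been truncated and the check catches it at once. */
    const char *damaged = "M9F]O(&)A<@``";

    printf("%s\n", uuline_ok(damaged) ? "line damaged" : "line ok");
    return 0;
}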

>The transit from ASCII to EBCDIC and back irreversibly destroys some of
>the bracket characters (I forget which ones). This is not a trivial
>problem to fix in the source code. Sending the code with a uuencode
>variant that avoids characters that don't exist in both character sets
>avoids that damage.

This is a problem with porting any C code to EBCDIC environments.
Freeze-drying the original ASCII isn't going to be of particular help to
the EBCDIC end-user.
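
If memory serves, the characters at issue are the very ones ANSI C
provides trigraph spellings for, precisely because some national and
EBCDIC-derived character sets lack them.  On a compiler with trigraph
support you can respell them portably, though nobody would call the
result readable:

??=include <stdio.h>                   /* ??= is the trigraph for '#' */

int main(void)
??<                                    /* ??< and ??> spell { and } */
    int a??(3??) = ??< 1, 2, 3 ??>;    /* ??( and ??) spell [ and ] */
    printf("%d??/n", a??(0??) + a??(2??));   /* ??/ spells backslash */
    return 0;
??>

With trigraphs enabled this compiles to the obvious program and prints
4; the point is that fixing mangled brackets still means editing the
source by hand.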

>The savings of 600Kbytes of spool storage space for tcl as sent means
>about 300 news articles can escape the expire cleaver until that
>distribution expires. On a small system like the home hobbyist system on
>which I have an account, that is a great benefit. 

But there are other options for such home hobbyist systems, including
running an aggressive expire on the source groups, or even doing local
compression in-place on the articles (replacing the original '12345'
files with small pointers to the real 12345.Z).
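
That second option is nothing exotic, by the way.  A rough sketch,
assuming compress(1) is available and that the local news software is
taught to chase the pointer; the pointer format here is invented purely
for illustration:

#include <stdio.h>
#include <stdlib.h>

/* Compress one spool article in place: compress(1) replaces NAME with
 * NAME.Z, then a tiny file is left under the original name so nothing
 * downstream trips over a missing article. */
int squeeze_article(const char *name)
{
    char cmd[1024];
    FILE *ptr;
    int len;

    len = snprintf(cmd, sizeof cmd, "compress -f '%s'", name);
    if (len < 0 || (size_t)len >= sizeof cmd || system(cmd) != 0)
        return -1;

    ptr = fopen(name, "w");
    if (ptr == NULL)
        return -1;
    fprintf(ptr, "Compressed-Article: %s.Z\n", name);
    fclose(ptr);
    return 0;
}

int main(int argc, char **argv)
{
    int i;

    for (i = 1; i < argc; i++)
        if (squeeze_article(argv[i]) != 0)
            fprintf(stderr, "could not squeeze %s\n", argv[i]);
    return 0;
}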

>Your expressed concern is that the files do not meet the "USENet way" of
>distributing source code. This is probably not a surprise to you, but
>we're not just USENet any more; we have subscribers on BITNET, EUnet,
>FidoNet, and many other networks, even CompuServe. 

No no noooo.  We ARE Usenet -- by definition.  Usenet is whoever gets
news.  Don't confuse it with Internet or other specific networks.  We
will always be Usenet.  (Actually, Usenet plus Alternet plus the other
non-core hierarchies, but whatever.)  What's happening is that more and
more disparate kinds of networks are becoming part of Usenet.  They
sport many architectural weirdnesses, but they all benefit from what
should be the Usenet source style: short lines, no control characters,
short hunks, simple (e.g. shar) collection mechanisms, no overloading
lower-layer functions (e.g. compression) onto the basic high-level
message envelope.
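
And "simple collection mechanisms" means simple: a shar is just a shell
script that recreates each file from a here document and then
double-checks it with wc -c.  A stripped-down generator, sketched here
for illustration only (the real shar programs do considerably more):

#include <stdio.h>

void shar_one(const char *name)
{
    FILE *in = fopen(name, "r");
    long count = 0;
    int c, at_bol = 1;

    if (in == NULL) {
        fprintf(stderr, "cannot open %s\n", name);
        return;
    }
    printf("echo x - %s\n", name);
    printf("sed 's/^X//' > %s << 'SHAR_EOF'\n", name);
    while ((c = getc(in)) != EOF) {
        if (at_bol)                 /* guard every line with an X */
            putchar('X');
        putchar(c);
        at_bol = (c == '\n');
        count++;
    }
    if (!at_bol) {                  /* supply a final newline if missing */
        putchar('\n');
        count++;
    }
    printf("SHAR_EOF\n");
    printf("if test %ld -ne \"`wc -c < %s`\"; then\n", count, name);
    printf("\techo '%s: character count is wrong'\n", name);
    printf("fi\n");
    fclose(in);
}

int main(int argc, char **argv)
{
    int i;

    puts("#!/bin/sh");
    puts("# This is a shell archive; unpack it with:  sh this_file");
    for (i = 1; i < argc; i++)
        shar_one(argv[i]);
    puts("exit 0");
    return 0;
}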

>                                                     Getting source
>material intact through all the possible protocols is a non-trivial
>challenge, but the regularity and limited character set of uuencoded
>data sure helps.  Paying a penalty (around 10%) in communication time is
>at least arguably worth it to be able to tie so much bigger a world
>together.

There are two paradigms at work here.  One is someone writing
BLACKJACK.C and wanting as many different people on as many different
systems as possible all over the world to be able to get it, compile it
and run it.  The other is two PC wankers exchanging Sound Blaster
samples across ten thousand miles of interconnecting networks which
happen to know nothing about PC binary format.  Usenet started out
serving the former paradigm, but has been increasingly used to serve the
latter.  Whether it should is a policy matter.  Encoding the source
newsgroups should NOT be done unless ABSOLUTELY necessary.  My concern
is the growing frequency with which it is done UN-necessarily.

-- 
"The country couldn't run without Prohibition.       ][  Tom Neff
 That is the industrial fact." -- Henry Ford, 1929   ][  tneff at bfmny0.BFM.COM


