using awk with records that have over 100 fields

Tom Christiansen tchrist at convex.COM
Thu Jan 3 00:39:11 AEST 1991


From the keyboard of skwu at spot.Colorado.EDU (WU SHI-KUEI), quoting me:
:>Run your awk script through the awk-to-perl translator, a2p, then run perl
:>on the resulting script  ......
:
:No need for 'perl', a boon to the majority of UNIX users who do not use it.
:Simply replace the first whitespace field separator with some otherwise
:unused glyph (e.g. '@') using 'sed' and then set the awk FS to that glyph.
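
For concreteness, that workaround looks something like the following
sketch; the file name is invented, and it assumes space-separated fields
with '@' never occurring in the data.  Since sed without the /g flag
rewrites only the first match on each line, awk then sees just two
fields instead of 100+:

    # Replace only the FIRST run of spaces on each line with '@'
    # (no /g flag), then split on '@': $1 is the first field, $2 is
    # the untouched remainder.  Add a literal tab to the bracket
    # expression if the data can contain tabs.
    sed 's/  */@/' hugefile | awk -F'@' '{ print $1 }'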

While this solution may well suffice for this particular application,
there remain all kinds of internal limits you're going to run into with
awk.  Eventually these will annoy you enough to stop using it for
large and/or complex problems.  For example, if the application were to
build an associative array of word frequencies and you had the tremendously
long lines described by the original poster, then awk wouldn't be able to
handle it, forcing you through brain-twisting and gut-wrenching
contortions to pound the data back into something awk can handle.
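
In perl, by contrast, the word-frequency problem stays a few lines no
matter how long the records get, since there is no limit on line length
or on the number of fields.  A sketch ('hugefile' is again a made-up
name):

    # Autosplit each line into @F (-a), tally every word in an
    # associative array, and dump the counts at end of input.
    perl -ane 'foreach $w (@F) { $freq{$w}++ }
               END { foreach $w (sort keys %freq) {
                         print "$w $freq{$w}\n";
                     } }' hugefile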

Although perl isn't really new anymore, it's still generally perceived to
be so, and the resistance to new, useful tools in the community is so high
that some people will insist on shooting themselves in the foot using old,
limited (and even brain-damaged) software for years to come.  (Yes,
I know it's hard to get things standardized across millions of systems,
but that shouldn't stop us from striving to forge ahead.)  My suspicion is
that this is just a manifestation in Unixdom of a principle familiar to
sociologists and historians.  While the desire to embrace better
technology may be somewhat higher amongst computer users than in the
general populace, there will always be some who wish to live (if you can
call that living) in a totally static environment where nothing ever
changes, where no improvement is ever radically different from previous
practice, and where the JCL scripts from 25 years ago still function.

Use awk while you can.  When you can't, be aware that there's an easy,
portable, freely-available upgrade path that doesn't require recoding
everything in C, and is a lot easier than trying to get AT&T to invest
the time in fixing awk.  You could reasonably argue that there are
actually two such paths, since gawk comes close to meeting these criteria:
it has greatly increased the limits on things like line length and number
of fields.  However, these limits still exist even in gawk, whereas in
perl they're entirely removed, so gawk may not be enough.  It all depends
on the problem.  Different problems are often best solved by employing
different tools, even if perl is the Swiss army chainsaw of UNIX.  
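
For the record, the translation path I mentioned at the top is
mechanical.  A sketch, with invented file names (a2p ships with the
perl distribution):

    # Translate the awk script to perl once, then run the result;
    # hand-tune the generated perl afterward if you care to.
    a2p wordfreq.awk > wordfreq.pl
    perl wordfreq.pl hugefile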

--tom
--
Tom Christiansen		tchrist at convex.com	convex!tchrist
"With a kernel dive, all things are possible, but it sure makes it hard
 to look at yourself in the mirror the next morning."  -me


