Breaking large file into pieces

Larry Wall lwall at jpl-devvax.JPL.NASA.GOV
Thu Sep 13 10:54:35 AEST 1990


In article <26116 at boulder.Colorado.EDU> skwu at spot.Colorado.EDU.Colorado.EDU (WU SHI-KUEI) writes:
: The right tool for the job is NOT perl but 'csplit'.

"Those words fall too easily from your lips."  --Gandalf

Let us attempt to distinguish fact from dogma.

    1)  As far as I can tell, csplit is AT&T proprietary.  I certainly
	don't have it on all my machines, and don't know offhand where
	I'd find the source for it.  The person we were advising may
	well not have it on his machine.  You should at least say "If
	you have csplit..."

    2)	The man page for csplit (in the AT&T universe of a Pyramid, anyway)
	indicates that you can have a maximum of 99 output files.  The
	application in question could easily have more than that, judging
	by how it was specified.  A general tool should not have
	such limitations.

    3)	csplit won't name the files in the way specified--you'd have to
	follow it up with a loopful of mv commands, one process per file.
	And in the naive implementation, you'd have a sed or awk for each
	file to extract out the filename to hand to mv.

    4)	csplit can't recognize patterns across newlines (not that this
	job required that, but a general tool shouldn't have such
	limitations).

    5)	csplit can get confused on lines longer than 255 chars.  It can't
	handle embedded nulls.  A general tool should not have such
	limitations.

    6)	Even if I did manage to find a freely available source for csplit,
	I'd have to worry about recompiling it on all my different
	architectures.  That would be okay (after all, I have to do that
	with Perl too), but I have to do it for 50 blue jillion other little
	"must have" tools too.  I'd much rather compile Perl once on
	each architecture, rewrite csplit in Perl, throw it into my
	/u/scripts directory that's mounted everywhere, and never worry about
	recompiling csplit again.
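
To make points 2 through 5 concrete, here is a minimal sketch of the kind of
script-language splitter being argued for, written in Python rather than Perl.
Everything specific in it is an assumption made for illustration: the function
name split_by_header, the "=== name ===" header convention, and the rule that
each output file is named from the header's first capture group. The original
request's actual naming scheme is not given here. The sketch works on raw
bytes, so long lines and embedded nulls pass through untouched, the pattern may
span newlines, and there is no fixed cap on the number of output files.

```python
import os
import re
import tempfile


def split_by_header(data: bytes, header_re: bytes, outdir: str) -> list:
    """Split data at each header match and write one file per piece.

    Hypothetical convention: capture group 1 of header_re is the output
    file name. Anything before the first header is discarded.
    """
    pattern = re.compile(header_re, re.MULTILINE)
    matches = list(pattern.finditer(data))
    written = []
    for i, m in enumerate(matches):
        # A piece runs from the end of its header to the start of the next one.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(data)
        # basename() keeps a header from writing outside outdir.
        name = os.path.basename(m.group(1).decode("ascii"))
        # Only one file is open at a time, so the piece count is unbounded.
        with open(os.path.join(outdir, name), "wb") as out:
            out.write(data[m.end():end])
        written.append(name)
    return written


if __name__ == "__main__":
    sample = b"=== a.txt ===\nalpha\n=== b.txt ===\nbeta\x00gamma\n"
    with tempfile.TemporaryDirectory() as d:
        print(split_by_header(sample, rb"^=== (\S+) ===\n", d))
```

Because the files are named as they are written, no follow-up loop of mv
commands is needed. The whole input is held in memory, which keeps the sketch
short but would need rethinking for very large files.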

So it's not quite so simple as all that.  You can chop down a tree with
a hatchet, but sometimes you want an industrial strength Swiss Army Chainsaw.
And sometimes not.  There's more than one way to do it.

Larry


