Regular Expression tool

Lars Henrik Mathiesen thorinn at skinfaxe.diku.dk
Wed Jun 13 04:50:41 AEST 1990


lwall at jpl-devvax.JPL.NASA.GOV (Larry Wall) writes:
>In article <1990Jun8.174056.15313 at icc.com> wdm at icc.com (Bill Mulert) writes:
>: Consider the following statements containing regular expressions:
>: ...
>: Fortunately, we have cdecl to help create and decode the C declarations.
>: 
>: I wish there were something similar for regular expressions.

>It's not likely to be too practical, for a couple of reasons.

>...

>Second, your big problem is not so much the regular expressions themselves
>as it is all the quoting you have to put around them because of the paucity of
>quoting mechanisms.

What we really need is a shell script explainer. It would know Bourne
shell syntax; when you run a script through it, any single-command
(one simple shell command) that uses more than one level of quoting
would be explained in excruciating detail. It would also know enough
about expr, sed, egrep etc. to recognize regular expressions, which
would be converted to a standard form (perl's, maybe). (Perl, of
course, is self-explanatory, and much too hard to parse.)

Example of possible output:

echo "`expr \"$1\" : \"^[^=]*=\(.*\)\"`"
#is taken as: echo "@1"
#where @1 is: `@2`
#where @2 is: expr "$1" : "@3"
#where @3 is: ^[^=]*=(.*)		Literal: "^[^=]*=\\(.*\\)"

df_usr=`df | sed -n '/^\/usr[   ]/s/[^)]*):[    ]*\([^  ]*\).*/\1/p'`
#is taken as: df_usr=`@1`
#where @1 is: df | sed -n '@2'
#where @2 is: /@3/s/@4/@5/p
#where @3 is: ^/usr\s			Literal: "^\\/usr[ \t]"
#where @4 is: [^)]*\):\s*(\S*).*	Literal: "[^)]*):[ \t]*\\([^ \t]*\\).*"
#where @5 is: $1			Literal: "\\1"
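
The regexp conversions shown in the @-lines above can be sketched
roughly as follows. This is only an illustration in Python, not an
existing tool: the names bre_to_perl, replacement_to_perl and
BRACKET_SHORTHAND are hypothetical, only the constructs that appear in
the two examples are handled, and \s is used for [ \t] as in the
examples even though perl's \s matches more than space and tab.

```python
import re

# Bracket expressions that the examples abbreviate (assumed mapping).
BRACKET_SHORTHAND = {"[ \t]": r"\s", "[^ \t]": r"\S"}

def bre_to_perl(bre, delimiter=None):
    """Rewrite a sed/expr-style basic regexp in perl-like notation."""
    out = []
    i = 0
    while i < len(bre):
        ch = bre[i]
        if ch == "[":
            # Copy a bracket expression whole; a ']' directly after
            # '[' or '[^' is a literal member, not the terminator.
            j = i + 1
            if j < len(bre) and bre[j] == "^":
                j += 1
            if j < len(bre) and bre[j] == "]":
                j += 1
            end = bre.find("]", j)
            if end == -1:
                end = len(bre) - 1
            bracket = bre[i:end + 1]
            out.append(BRACKET_SHORTHAND.get(bracket, bracket))
            i = end + 1
        elif ch == "\\" and i + 1 < len(bre):
            nxt = bre[i + 1]
            if nxt in "(){}" or nxt == delimiter:
                out.append(nxt)          # \( becomes (, \/ becomes /
            else:
                out.append("\\" + nxt)   # keep other escapes as they are
            i += 2
        elif ch in "(){}|+?":
            out.append("\\" + ch)        # literal in a BRE, special in perl
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)

def replacement_to_perl(text):
    """Turn \\1 .. \\9 in a sed replacement into $1 .. $9."""
    return re.sub(r"\\([1-9])", r"$\1", text)

print(bre_to_perl("^[^=]*=\\(.*\\)"))                 # ^[^=]*=(.*)
print(bre_to_perl("^\\/usr[ \t]", delimiter="/"))     # ^/usr\s
print(replacement_to_perl("\\1"))                     # $1
```

The bracket expressions are copied whole so that a ')' inside [^)] is
not escaped, while the bare ')' after it is.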

The Literal: strings (which I have written as C strings) should be
present whenever an argument to a command contains tabs or control
characters, or whenever it has been converted as a regular expression.
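
As a rough illustration of that rule, the following Python sketch
decides when a Literal: string is wanted and renders it as a C string.
The function names needs_literal and c_string are made up for this
example, and the choice of octal escapes for other control characters
is an assumption; the post only shows \t and \\.

```python
def is_control(ch):
    """True for tabs and other ASCII control characters."""
    return ord(ch) < 0x20 or ord(ch) == 0x7F

def needs_literal(argument, is_regex):
    """Show a Literal: string for regexps and for arguments that
    contain tabs or control characters."""
    return is_regex or any(is_control(ch) for ch in argument)

def c_string(argument):
    """Render an argument as a C string literal."""
    escapes = {"\\": "\\\\", '"': '\\"', "\t": "\\t", "\n": "\\n"}
    out = []
    for ch in argument:
        if ch in escapes:
            out.append(escapes[ch])
        elif is_control(ch):
            out.append("\\%03o" % ord(ch))   # three-digit octal escape
        else:
            out.append(ch)
    return '"' + "".join(out) + '"'

# The first sed regexp from the example above, with a real tab in it.
argument = "^\\/usr[ \t]"
if needs_literal(argument, is_regex=True):
    print("Literal:", c_string(argument))    # Literal: "^\\/usr[ \t]"
```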

The thing doesn't really have to parse the shell language: just cut at
newline, ';', ';;', '|', '||', ... (when unescaped), repeatedly strip
'if', 'for', '{', ... from the beginning of the resulting strings, and
the single-commands are what is left. The ``parsers'' for the regexp
commands just have to find the regexps, so they can probably be just
as simple. The whole thing could probably be implemented in perl
fairly easily.
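
A minimal sketch of that cut-and-strip step, written in Python here
rather than perl. Everything beyond the separators and words named
above is an assumption: the "&&" separator, the extra keywords, the
dropping of closing words such as 'fi', and the simple quote tracking
that treats quoted and backquoted text as unsplittable. The name
split_commands is hypothetical.

```python
import re

# Separators named in the post, plus "&&" as an assumed relative of "||".
# Longer separators come first so ";;" is not read as two ";".
SEPARATORS = (";;", "||", "&&", ";", "|", "\n")

# 'if', 'for' and '{' are from the post; the other words are assumptions.
LEADING = re.compile(r"\s*(?:if|then|elif|else|while|until|do|for|\{)(?=\s|$)")
CLOSING_WORDS = {"fi", "done", "esac", "}"}

def strip_leading_words(command):
    """Repeatedly strip shell keywords from the front of a piece."""
    while True:
        match = LEADING.match(command)
        if not match:
            return command.strip()
        command = command[match.end():]

def split_commands(script):
    """Cut a script at unescaped, unquoted separators."""
    pieces, current = [], []
    quote = None                       # None, or one of ' " `
    i = 0
    while i < len(script):
        ch = script[i]
        if ch == "\\" and quote != "'" and i + 1 < len(script):
            current.append(script[i:i + 2])   # escaped char, never a cut
            i += 2
            continue
        if quote:
            if ch == quote:
                quote = None
            current.append(ch)
            i += 1
            continue
        if ch in "'\"`":
            quote = ch
            current.append(ch)
            i += 1
            continue
        sep = next((s for s in SEPARATORS if script.startswith(s, i)), None)
        if sep:
            pieces.append("".join(current))
            current = []
            i += len(sep)
            continue
        current.append(ch)
        i += 1
    pieces.append("".join(current))
    commands = (strip_leading_words(piece) for piece in pieces)
    return [c for c in commands if c and c not in CLOSING_WORDS]

script = 'if test -f "$1"; then df | sed -n \'/a|b/p\'; fi\n'
print(split_commands(script))
# ['test -f "$1"', 'df', "sed -n '/a|b/p'"]
```

Because backquoted text is kept in one piece, a command substitution
would have to be run through split_commands again to get at the
single-commands inside it, much as the @1, @2 levels in the example
output do.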

--
Lars Mathiesen, DIKU, U of Copenhagen, Denmark      [uunet!]mcsun!diku!thorinn
Institute of Datalogy -- we're scientists, not engineers.      thorinn at diku.dk


