Comment recognition in Lex, again

merlyn at sequent.UUCP
Tue May 8 02:55:39 AEST 1984


> From: rjhurdal at watmath.UUCP
> Message-ID: <7666 at watmath.UUCP>
> Date: Fri, 4-May-84 14:58:05 PDT
> 
> The puzzles in net.unix-wizards are better than the ones in net.puzzle.
> I blew two hours composing this:
> 	"/*"[^*]*"*"+([^*/][^*]*"*"+)*"/" printf("<<<%s>>>", yytext) ;
> and tested it on this input:
> 	asdqwa/*/asaas*/werwer
> 	sdf/**/sdfsdf/***/erwerwer
> 	cvb/*tdfg*/*xcvcv*/werwe
> 	ty/bcvb/*******/***/fssdf
> and it appears to work.  Please let me know if you come up with cases
> where it doesn't work...

I can't seem to break this one (I even spent 20 minutes with a little
"railroad-track" finite-state-machine model).
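
For anyone who wants to repeat the check without Lex at hand, here is a
minimal sketch in Python.  It assumes that rewriting the quoted rule in
Python's "re" syntax (Lex's quoted literals become backslash escapes)
keeps its meaning; the names COMMENT, mark_comments and SAMPLES are made
up for this illustration.  It runs the pattern over the four test lines
from the quoted message and brackets each match the way the printf
action does.

```python
import re

# Assumed Python translation of the quoted Lex rule (not the Lex source itself):
#   "/*"[^*]*"*"+([^*/][^*]*"*"+)*"/"
COMMENT = re.compile(r"/\*[^*]*\*+(?:[^*/][^*]*\*+)*/")


def mark_comments(text):
    """Wrap every matched comment in <<< >>>, mimicking the printf action."""
    return COMMENT.sub(lambda m: "<<<" + m.group(0) + ">>>", text)


# The four test lines from the quoted message.
SAMPLES = [
    "asdqwa/*/asaas*/werwer",
    "sdf/**/sdfsdf/***/erwerwer",
    "cvb/*tdfg*/*xcvcv*/werwe",
    "ty/bcvb/*******/***/fssdf",
]

if __name__ == "__main__":
    for line in SAMPLES:
        print(mark_comments(line))
```

On those four lines the sketch brackets /*/asaas*/, then /**/ and /***/,
then /*tdfg*/, then /*******/, and leaves the rest of each line alone.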

However, it doesn't appear as elegant as a solution sent in a private
message to me from Andrew Klossner @ tektronix:

	"/*"([^*]*"*"+[^/])*"*"*"/"

This one I can't break either.  (I make no claim to be free of human
error, however.)

The solution I submitted with start states (mail me if you didn't get
that one) was preferred in another private communication, for a reason
that I had failed to notice at the time: any lex regular expression
that tries to scarf up a whole comment in one fell swoop can EASILY
overflow the fixed-size "yytext[]" array.  Take, for example, a typical
start-of-file log produced by the RCS $Log$ keyword, or an in-source
manpage (yes, I've seen them).  Ick.  Start states avoid that hassle,
since each rule then matches only a small piece of the comment.
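
The start-state rules themselves are not reproduced in this message, so
what follows is not that solution, only a rough Python sketch of the
idea behind it.  The function name strip_comments and the state names
are assumptions chosen for illustration.  The scanner remembers which
state it is in (INITIAL or COMMENT) and holds at most one character of
lookbehind, so a comment of any length never has to fit in a buffer.

```python
def strip_comments(chars):
    """Yield the characters of `chars` that lie outside /* ... */ comments.

    The scanner keeps a current state, much as a Lex start state would,
    plus a single character of lookbehind; nothing larger is buffered.
    """
    state = "INITIAL"
    prev = ""  # pending character not yet emitted (INITIAL) or last seen (COMMENT)
    for ch in chars:
        if state == "INITIAL":
            if prev == "/" and ch == "*":
                state = "COMMENT"   # saw "/*": enter the comment, drop both chars
                prev = ""
            else:
                if prev:
                    yield prev      # the pending char was ordinary text after all
                prev = ch
        else:  # state == "COMMENT"
            if prev == "*" and ch == "/":
                state = "INITIAL"   # saw "*/": leave the comment
                prev = ""
            else:
                prev = ch
    if state == "INITIAL" and prev:
        yield prev                  # flush the final pending character


if __name__ == "__main__":
    big = "before/*" + "x" * 100000 + "*/after"
    print("".join(strip_comments(big)))
```

Because it is a generator, the sketch can be fed a file one character at
a time; an unterminated comment simply swallows the rest of the input.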

It was pointed out to me in that same private communication that any
lex rules that are not qualified by a start state are STILL active
inside the comments.  Boo.  I forgot about that.  I've learned to
prefix ALL my rules with start states (even if it is just <INITIAL>),
and had forgotten that I do so regularly.
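
To make that pitfall concrete, here is a toy model in Python of how
rules get selected.  It is an assumption-laden sketch, not Lex itself:
the RULES table, the labels and the scan function are all hypothetical.
A rule tagged None stands for a rule with no start-state prefix and is
tried in every state; the longest match wins and the earlier rule wins
ties.

```python
import re

# (start states the rule is restricted to, or None if unqualified; pattern; label)
RULES = [
    (None,        re.compile(r"[a-z]+"),      "WORD"),   # unqualified rule
    ({"INITIAL"}, re.compile(r"/\*"),         "OPEN"),
    ({"COMMENT"}, re.compile(r"\*/"),         "CLOSE"),
    ({"COMMENT"}, re.compile(r".", re.DOTALL), "SKIP"),
    ({"INITIAL"}, re.compile(r".", re.DOTALL), "OTHER"),
]


def scan(text):
    """Return a list of (state, label, lexeme) for each rule that fires."""
    state, pos, fired = "INITIAL", 0, []
    while pos < len(text):
        best = None
        for states, pattern, label in RULES:
            if states is not None and state not in states:
                continue  # rule is qualified and not active in this state
            m = pattern.match(text, pos)
            # keep the longest match; on a tie the earlier rule stays
            if m and (best is None or m.end() > best[0].end()):
                best = (m, label)
        m, label = best  # a catch-all rule exists in both states
        fired.append((state, label, m.group(0)))
        if label == "OPEN":
            state = "COMMENT"
        elif label == "CLOSE":
            state = "INITIAL"
        pos = m.end()
    return fired


if __name__ == "__main__":
    for entry in scan("a/*b*/"):
        print(entry)
```

In this model the "b" inside the comment is claimed by the unqualified
WORD rule rather than by SKIP, which is exactly the surprise described
above.  Restricting WORD to {"INITIAL"} makes SKIP take it instead.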

Randal L. ("no comment") Schwartz, esq. (merlyn at sequent.UUCP)
	(Official legendary sorcerer of the 1984 Summer Olympics)
Sequent Computer Systems, Inc. (503)626-5700 (sequent = 1/quosine)
UUCP: ...!XXX!sequent!merlyn where XXX is one of:
	decwrl nsc ogcvax pur-ee rocks34 shell teneron unisoft vax135 verdix


