LEX

Tom Stockfisch ix426 at sdcc6.ucsd.EDU
Thu Feb 4 14:58:23 AEST 1988


In article <260 at nyit.UUCP> michael at nyit.UUCP (Michael Gwilliam) writes:
>
>....  When I
>was writing the tokenizer using LEX and I got intrigued by a little
>problem.  Is it possible to write a regular expression that will
>transform a /* comment */ into nothing? ....
>So my question is, to all you experienced lex
>users and compiler writers, can this be done?  Or do I need to
>use input() and other lex functions.

[sorry for not emailing, I can't seem to get mail to Michael]

I can't believe how hard this task is in regular expressions, when it is
trivial to code by hand.  I have found a solution which I think is correct,
but it took several tries (see end of this posting).
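
For comparison, here is a minimal sketch of what "by hand" could look like
in C.  The name strip_comments and the buffer-to-buffer interface are my own
assumptions for illustration, not anything from the original question.  It
treats an unterminated comment as ordinary text, which is also what a lex
rule does when it fails to match.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy `in` to `out`, dropping every complete C comment.
 * `out` must not overlap `in` and needs room for strlen(in) + 1 bytes.
 * (Hypothetical helper, written only to illustrate the idea.)
 */
void strip_comments(const char *in, char *out)
{
    assert(in != NULL && out != NULL);

    while (*in != '\0') {
        if (in[0] == '/' && in[1] == '*') {
            /* Look for the first terminator after the opener. */
            const char *end = strstr(in + 2, "*/");

            if (end == NULL) {
                /* No terminator anywhere: the rest is plain text. */
                strcpy(out, in);
                return;
            }
            in = end + 2;       /* skip the whole comment */
        } else {
            *out++ = *in++;     /* ordinary character, pass it through */
        }
    }
    *out = '\0';
}
```

For example, strip_comments("/* */ hello */", buf) leaves " hello */" in buf.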

To convince yourself that a pattern is correct, I think you have to show
two things:
	1.  That the body between the "/*" and "*/" cannot possibly contain
	    a "*/".
	2.  That the body can contain any other sequence of characters.

If you come up with your own solution, be sure it works properly on the
following input.

1.	/*****//hello world */

2.	/* hello /* /* world */

3.	/* */ hello /* */

4.	/**// /* this input should produce "/ \n" for output */

5.	/* */ hello */


The following lex source should "elide" all legal comments and pass
everything else through to stdout.  As requested, it does not use input().

--cut----

okslash	([^*/]"/"+)

%%
"/*""/"*([^/]|{okslash})*"*/"	;

--cut----

Save the source as comment.l and compile using

	lex comment.l; cc lex.yy.c -ll
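
If you want to poke at the pattern without running lex, the following is a
rough sketch that feeds what I believe is the same expression to the POSIX
regcomp()/regexec() routines.  The translation to extended-regex syntax and
the function name comment_prefix_len are my assumptions, and POSIX regex is
not lex, so treat it as a sanity check rather than a proof.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/*
 * Return the length of the comment that starts at s[0],
 * or -1 if `s` does not begin with a complete comment.
 * (Hypothetical helper; the pattern is a hand translation of the lex rule.)
 */
int comment_prefix_len(const char *s)
{
    /* "^" anchors the match at the start; "\\*" is a literal star. */
    static const char pattern[] = "^/\\*/*([^/]|[^*/]/+)*\\*/";
    regex_t re;
    regmatch_t m;
    int len = -1;

    assert(s != NULL);

    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return -1;              /* pattern failed to compile */
    if (regexec(&re, s, 1, &m, 0) == 0)
        len = (int)m.rm_eo;     /* end offset of the whole match */
    regfree(&re);
    return len;
}
```

On test input 1 this should report 7, i.e. only "/*****/" is taken as the
comment and the rest of the line is left alone.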
-- 

||  Tom Stockfisch, UCSD Chemistry   tps at chem.ucsd.edu


