Want a way to strip comments from a

John Rupley rupley at arizona.edu
Mon Mar 20 20:34:57 AEST 1989



In article <3145 at nunki.usc.edu>, jeenglis at nunki.usc.edu (Joe English) writes:
> I made a mistake in the comment-eating program I
> posted yesterday -- it won't handle
> 	/* something like *//* this. */
> Change the line in the '/' case from:
>     if ((ch = getchar()) == '*') { eatcomment(); ch=getchar(); }
> to:
>     if ((ch = getchar()) == '*') { eatcomment(); ch=getchar(); continue; }
> and it will work.  If anyone's interested.

It still doesn't work.  It won't uncomment its own source, or the following line:

	'"' /* hi there */ '"'

Nor will it distinguish a correct string, with escaped newlines,

	"hi\
	/*\*/ /**/\
	there"

from an incorrect string without the escapes.

The point is not _whether_ one can write an ``uncomment'' in C, but how,
and in what language, one can do it most simply.  It is certainly right
to use C if uncommenting is part of a larger design, as in cpp or ctags.
But if the whole aim is to uncomment, then a pattern-handling language,
such as Lex, is more appropriate.  A few lines of Lex source do the job
and, assuming familiarity with regular-expression syntax, they are easy
to write and understand, and the logic is hard to get wrong.  It should
be doable with sed or awk, but probably not as easily, because they see
a file as a stream of lines rather than of characters.  In C, setting up
the switch and flags properly is not trivial, as the previous posting
shows.
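
To make the comparison concrete, here is a minimal sketch of what the C
route seems to require once character constants, escapes and adjacent
comments are taken into account.  It is not Joe English's program: the
function name strip_comments and the state names are my own, chosen for
illustration.  It assumes the caller supplies an output buffer at least
as large as the input, deletes comments outright (as the Lex source
does) rather than replacing them with a blank, and ignores trigraphs and
backslash-newline splicing outside of strings.

```c
#include <stddef.h>

/* Scanner states: which lexical construct the current character is inside. */
enum state {
    CODE,        /* ordinary program text */
    SLASH,       /* saw '/', not yet known whether it opens a comment */
    COMMENT,     /* inside a comment */
    STAR,        /* inside a comment, just saw '*' */
    STRING,      /* inside a string literal */
    STRING_ESC,  /* inside a string literal, just saw a backslash */
    CHARLIT,     /* inside a character constant */
    CHARLIT_ESC  /* inside a character constant, just saw a backslash */
};

/*
 * Copy src to dst with comments removed (hypothetical helper, a sketch).
 * dst must have room for strlen(src) + 1 bytes.
 * Returns the number of bytes written, not counting the final NUL.
 */
size_t strip_comments(const char *src, char *dst)
{
    enum state st = CODE;
    size_t n = 0;

    for (; *src != '\0'; src++) {
        char c = *src;

        switch (st) {
        case CODE:
            if (c == '/') {
                st = SLASH;             /* hold the '/' back for now */
                break;
            }
            if (c == '"')
                st = STRING;
            else if (c == '\'')
                st = CHARLIT;
            dst[n++] = c;
            break;
        case SLASH:
            if (c == '*') {
                st = COMMENT;           /* comment opener: drop the held '/' */
            } else {
                dst[n++] = '/';         /* plain '/': emit it ... */
                st = CODE;
                src--;                  /* ... and rescan c as ordinary code */
            }
            break;
        case COMMENT:
            if (c == '*')
                st = STAR;
            break;
        case STAR:
            if (c == '/')
                st = CODE;              /* comment closed */
            else if (c != '*')
                st = COMMENT;           /* false alarm; a run of stars stays in STAR */
            break;
        case STRING:
        case CHARLIT:
            dst[n++] = c;
            if (c == '\\')
                st = (st == STRING) ? STRING_ESC : CHARLIT_ESC;
            else if (c == '\n' || c == (st == STRING ? '"' : '\''))
                st = CODE;              /* closed, or unterminated at end of line */
            break;
        case STRING_ESC:
        case CHARLIT_ESC:
            dst[n++] = c;               /* escaped character, newline included */
            st = (st == STRING_ESC) ? STRING : CHARLIT;
            break;
        }
    }
    if (st == SLASH)
        dst[n++] = '/';                 /* input ended on a lone '/' */
    dst[n] = '\0';
    return n;
}
```

Eight states and a one-character pushback, and every one of them is a
place to slip.  Fed the three troublesome inputs above, this sketch
removes both of the adjacent comments, leaves the two '"' constants
intact with only the comment between them gone, and passes the string
with escaped newlines through unchanged.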

A Lex source for uncommenting is attached (which I hope does not belie
the remark above about its being hard to get the logic wrong :-).
 

John Rupley
 uucp: ..{uunet | ucbvax | cmcl2 | hao!ncar!noao}!arizona!rupley!local
 internet: rupley!local at megaron.arizona.edu
--------------------------------------------------------------------
%{
/* UNCOMMENT- */
/*	regexp for comment recognition based on usenet posting by: */
/*	Chris Thewalt; thewalt at ritz.cive.cmu.edu */
%}
STRING		\"(\\(.|\n)|[^\\"\n])*\"
COMMENTBODY	([^*\n]|"*"+[^*/\n])*
COMMENTEND	([^*\n]|"*"+[^*/\n])*"*"*"*/"
QUOTECHAR	\'[^\\]\'|\'\\.\'|\'\\[x0-9][0-9]*\'
ESCAPEDCHAR	\\.
%START	COMMENT
%%
<COMMENT>{COMMENTBODY}		;
<COMMENT>{COMMENTEND}		BEGIN 0;
<COMMENT>.|\n			;
"/*"				BEGIN COMMENT;
{STRING}			ECHO;
{QUOTECHAR}			ECHO;
{ESCAPEDCHAR}			ECHO;
.|\n				ECHO;
---------------------------------------------------------------------------


