C comment stripper shell script? -> use sed pipeline

Jim G jim at bilpin.UUCP
Thu Mar 30 21:46:55 AEST 1989


    #{ v_langC.2 }
    IN ARTICLE <2216 at solo8.cs.vu.nl>, maart at cs.vu.nl (Maarten Litmaath) WRITES:
>   jim at bilpin.UUCP (Jim G) [**THAT'S ME, FOLKS!**] writes:
>   \#{ zapcom.sh }
>   \#  Remove comments from a C program
>   \#  sed removes comment strings which begin and end on the same line
>   \#  awk removes comment strings which extend across multiple lines
>   \#  sed/awk both handle nesting of comments within their context
    [small but perfectly formed awk/sed script deleted]
>   
>   Aha! You're using a SHELL script! Well, in that case there's another word
>   for my `sed approach' :-)
>   No awk necessary. This pipeline is reasonably fast too!
    [immense sed script deleted]

    Although I don't dispute the efficacy of the supplied script ( I haven't
    checked it out, though ), I think that this m-iii-ght be taking a
    preference for sed a m-iii-te too far. My 3 line sed + 13 line awk
    script has been replaced by a 101 line script with 66 lines of sed -
    hmmm. Although awk is undoubtedly slower than sed, I prefer it for
    editing problems which can be defined on a field basis, as I find it
    much easier to conceptualise solutions that way; I do not
    find the sed syntax or operation conducive to an intuitive
    problem/solution association ( obviously some peculiarity in how my
    brain, errrm, works ).

    I aimed for conciseness and a simple, balanced structure in the code
    (rather than maximum efficiency, or universal application), as this is
    easier for people (including me) to understand, and therefore
    alter/improve, if they wish; especially for novice users, who would
    probably feel safe in tinkering with zapcom.sh, but would probably have
    to be restrained and sedated after seeing Cstrip :-)

    Also, zapcom.sh is not universally applicable: it requires comment
    delimiters to be themselves delimited by white space/EOL (so awk can
    treat them as individual fields), and it won't correctly handle
    comment delimiters embedded in quotes. There obviously comes a point
    where the effort required to handle a special case outweighs the benefit
    achieved; I considered these cases to fall into that category.
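
    To make those two limitations concrete, here is a minimal sketch of
    the field-based idea. It is written in Python rather than awk and is
    NOT the original zapcom.sh (which is not reproduced here); the name
    strip_comment_fields and the depth counter used for nesting are
    assumptions made purely for illustration.

```python
def strip_comment_fields(lines):
    """Drop whitespace-delimited fields that lie between '/*' and '*/'.

    Hypothetical helper illustrating a field-based comment stripper.
    Assumptions: delimiters count only when they stand alone as a field,
    and a depth counter tracks nesting. Whitespace (including
    indentation) is collapsed, and lines left empty are dropped.
    """
    depth = 0          # > 0 while inside a comment, possibly across lines
    out = []
    for line in lines:
        kept = []
        for field in line.split():
            if field == "/*":
                depth += 1
            elif field == "*/" and depth > 0:
                depth -= 1
            elif depth == 0:
                kept.append(field)
        if kept:
            out.append(" ".join(kept))
    return out


if __name__ == "__main__":
    # Delimiters set off by white space: handled, even across lines.
    print(strip_comment_fields(["int x; /* counter */",
                                "/* spans",
                                "two lines */ int y;"]))
    # Delimiters glued to the text: not seen as fields, so left alone.
    print(strip_comment_fields(["int z; /*glued*/"]))
    # Delimiters inside a string literal: wrongly treated as a comment.
    print(strip_comment_fields(['s = " /* not a comment */ ";']))
```

    The second and third calls show the two special cases: a glued
    delimiter passes through untouched, and a quoted one causes part of
    the string literal to be removed. Handling either properly means
    scanning character by character and tracking quote state, which is
    exactly the extra effort weighed up above.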

    We have now had enough constructive postings on this subject to give
    all interested parties a good set of approaches from which to choose.
    Thank you and goodnight ...
-- 
	   <Path: mcvax!ukc!icdoc!bilpin!jim> <UUCP: jim at bilpin.uucp>
  Programmers' maxim : If it's not aesthetically pleasing, it's probably wrong.


