regular expression survey

Thu Dec 11 07:25:53 AEST 1986

I am writing a lex(1) program and in the process I have been playing
around with dfa's.  Usually, most searching algorithms (like the one
in emacs) simulate an nfa instead of going to a dfa.  Actually this is
not true either.  The \digit construct in the search pattern is a form
of infinate memory which calls for a more powerful machine than a dfa.
For example, you can say \(a.*c\)\1 which matches strings like acac,
abbcabbc, etc.  If the pattern between the \(\)'s was a finite length
then it would be possible (although extremely expensive) to do this
with a dfa.

The ability to have \(\) in the search pattern so that you can have
\digit in the replacement pattern is very powerfull and used
frequently.  However, as I was playing with my lex program, I think I
have found a way to cope with \(\) and still use a dfa.  The \digit
can not possibly be done with a dfa as stated.

So the question is rather anyone uses the \digit construct in the
search pattern, how often, and could you live without it?  The point
is that if a dfa could be used, the search speed should be extremely
fast.  Would the increase in speed justify the loss in flexibility?

(send e-mail only please)
-- 
Perry Smith
{convex!ctvax,{ti-csl,infotel}!pollux}!bobkat!pedz