Pattern matching with awk

Lou Kates louk at tslwat.UUCP
Wed Mar 6 12:55:10 AEST 1991


In article <1991Mar04.051048.5864 at convex.com> tchrist at convex.COM (Tom Christiansen) writes:
>From the keyboard of lin at CS.WMICH.EDU (Lite Lin):
>:  I'm trying to identify all the email addresses in email messages, i.e.,
>:patterns with the format user@node.  Now I can use grep/sed/awk to find
>:those lines containing user@node, but I can't figure out from the manual
>:how or whether I can have access to the matching pattern (it can be
>:anywhere in the line, and it doesn't have to be surrounded by spaces,
>:i.e., it's not necessarily a separate "field" in awk).  If there is no
>:way to do that in awk, I guess I'll do it with lex (yytext holds the
>:matching pattern).
>
>Well, I wouldn't try to do it in awk, but that doesn't mean we have to 
>jump all the way to a C program!  
>
>    perl -ne 's/([-.\w]+@[-.\w]+)/print "$1\n"/ge;'

The following awk program looks for expressions of the form
word@word, where word contains only letters, numbers and dots, and
the field separator is any run of characters other than letters,
numbers, dots and @. You can change the regular expressions to vary
the effect:

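BEGIN {
	# Fields are separated by runs of characters that cannot appear in an address.
	FS = "[^.a-zA-Z0-9@]+"
	word = "[.a-zA-Z0-9]+"
	# A field counts only if it is exactly word@word.
	addr = "^" word "@" word "$"
}
# Print every field that matches the anchored address pattern.
{ for (i = 1; i <= NF; i++) if ($i ~ addr) print $i }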
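For comparison, here is a rough sketch of another way to get at the matched text itself, assuming an awk that provides match() along with RSTART and RLENGTH (new awk, gawk, or any POSIX awk). It leaves FS alone and repeatedly pulls the leftmost match out of each line. The function name extract_addrs is just a made-up wrapper for illustration, and the pattern is the same word@word as above:

```shell
# Hypothetical helper: print every word@word substring read from stdin,
# one per line.
extract_addrs() {
    awk '
    BEGIN { addr = "[.a-zA-Z0-9]+@[.a-zA-Z0-9]+" }
    {
        line = $0
        # match() returns 0 when nothing matches; otherwise it sets
        # RSTART and RLENGTH to the start and length of the leftmost match.
        while (match(line, addr)) {
            print substr(line, RSTART, RLENGTH)
            # Continue scanning after the text just printed.
            line = substr(line, RSTART + RLENGTH)
        }
    }'
}
```

It could be run as `extract_addrs < message.txt`, where message.txt stands for any saved message. Note that this is not quite equivalent to the field-based program: since the pattern is unanchored, a token such as a@b@c would yield a@b here, whereas the anchored test above rejects the whole token.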

Lou Kates, Teleride Sage Ltd., louk%tslwat at watmath.waterloo.edu


