lex/yacc questions from a novice...

Jeffrey W Percival jwp at larry.sal.wisc.edu
Wed Aug 23 02:41:14 AEST 1989


I am trying to use lex and yacc to help me read a dense, long,
machine-produced listing of some crappy "special purpose" computer
language.  I have a listing of the "rules" (grammar?) governing
the format of each line in the listing.

I believe lex and yacc are the right tools, because the set of rules
I have seems to match the spirit of the examples I read in the lex and yacc
papers by Lesk and Schmidt (Lex) and Johnson (yacc).  For example:

digit:		[0-9]
integer:	{digit}+

and so on to the more complicated

command definition: {command introducer} {statement}+ {command terminator}

My first question is how one trades off work between lex and yacc.
Should lex do more than just return characters?  There are all sorts of
keywords in my language that a lexical analyzer could recognize, and
just return tokens for them.
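
To make the question concrete, here is a minimal sketch in C of one possible
split: lex matches any identifier-like word with a single rule, and a small
table lookup decides whether that word is a keyword.  The token names
(TOK_GROUP and friends), their numeric values, and the function
keyword_token() are hypothetical, made up for illustration; in a real build
the token codes would come from yacc's %token declarations.  Only a handful
of the keywords are shown.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical token codes (assumption: yacc would normally generate
   these from %token declarations in y.tab.h). */
enum token {
    TOK_WORD = 258,   /* any identifier-like word that is not a keyword */
    TOK_SMSHDR,
    TOK_ENDSMS,
    TOK_GROUP,
    TOK_IF,
    TOK_ELSE,
    TOK_ENDIF
};

/* One table entry per keyword: its spelling and its token code. */
struct keyword {
    const char *text;
    enum token  code;
};

static const struct keyword keywords[] = {
    { "SMSHDR", TOK_SMSHDR },
    { "ENDSMS", TOK_ENDSMS },
    { "GROUP",  TOK_GROUP  },
    { "_IF",    TOK_IF     },
    { "_ELSE",  TOK_ELSE   },
    { "_ENDIF", TOK_ENDIF  }
};

/* Return the token code for a keyword, or TOK_WORD if the text is not
   in the table.  The comparison is case-sensitive. */
int keyword_token(const char *text)
{
    size_t i;

    for (i = 0; i < sizeof keywords / sizeof keywords[0]; i++) {
        if (strcmp(text, keywords[i].text) == 0)
            return keywords[i].code;
    }
    return TOK_WORD;
}
```

With something like this, the action for the single word rule could be
"return keyword_token(yytext);", and yacc would see one token per keyword
without lex needing a separate definition for each.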

Along these lines, a problem I am having is getting the message "too
many definitions" from lex, when all I have are a few keywords and
ancillary definitions (lex file included below for illustration).  Is
lex truly this limited in the number of definitions?  Can I increase
this limit?  Or am I using lex for too much, and not using yacc for
enough?

SMSHDR		"SMSHDR"
ENDSMS		"ENDSMS"
CP224		"CP224"
GROUP		"GROUP"
PRT		"PRT"
RTS		"RTS"
SAFING		"SAFING"
BEGINDATA	"BEGINDATA"
ENDDATA		"ENDDATA"
_IF		"_IF"
_ELSE		"_ELSE"
_ENDIF		"_ENDIF"
_MESSAGE	"_MESSAGE"
_SET		"_SET"
_DELETE		"_DELETE"
INCLUDE		"INCLUDE"
LETTER		[A-Za-z]
DIGIT		[0-9]
HEX_DIGIT	[0-9A-F]
OCT_DIGIT	[0-7]
BIN_DIGIT	[0-1]
SPECIAL		[_%#@]
STRING		({DIGIT}|{LETTER}|{SPECIAL})+
WORD		{LETTER}({DIGIT}|{LETTER}|{SPECIAL})*
OCT_MNEMONIC	("_"{STRING})|({WORD})
LABEL		{STRING}":"
LABEL_REF	"'"{STRING}"'"
TEXT_STRING	"'"[ -~]"'"
HEX_INT		'{HEX_DIGIT}+'X
OCT_INT		'{OCT_DIGIT}+'O
BIN_INT		'{BIN_DIGIT}+'B
U_INT		{DIGIT}+
S_INT		[+-]?{U_INT}
U_REAL		{U_INT}"."{U_INT}
S_REAL		[+-]?{U_REAL}
FLOAT		({S_REAL}|{S_INT})([ED]{S_INT})?
YY		{U_INT}"Y"
DD		{U_INT}"D"
HH		{U_INT}"H"
MM		{U_INT}"M"
SS		({U_INT}|{U_REAL})"S"
REL_TIME	[+-]?((({HH})?({MM})?({SS}))|(({HH})?({MM})({SS})?)|(({HH})({MM})?({SS})?))
UTC_TIME	{YY}?{DD}{REL_TIME}
DEL_TIME	({U_INT}C)|({REL_TIME})
ORB_REL_TIME	"ORB,"{U_INT}","{WORD}(","[+-]?{REL_TIME})?
ORB_TIME	"("{ORB_REL_TIME}")"
MFS_TIME	"("({UTC_TIME}|{ORB_REL_TIME})",MFSYNC"(","[+-]?{REL_TIME})?")"
SOI_OFFSET	[+-](({HEX_DIGIT}+"%X")|({U_INT})|({OCT_DIGIT}+"%O"))
SOI		"'"{WORD}({SOI_OFFSET})?"'"[ND]
EOL		"\n"
%%
-- 
Jeff Percival (jwp at larry.sal.wisc.edu)


