A complete example (long) (was Re: yacc & lex - cupla questions)

Gilles Courcoux gilles at pase60.Convergent.Com
Sat Jul 28 08:08:14 AEST 1990


In article <1990Jul26.175831.1216 at uicbert.eecs.uic.edu> you write:

>1.)  how does one redefine the i/o in a yacc/lex piece of code?  i.e.
>the code which is generated defaults to stdin and stdout for input and
>output, respectively.  i'd like to redefine these defaults w/o having 
>to hack on the intermediate c-code, since this is a live production 
>project; i'd like to be able to update and modify the program simply by 
>saying "make". 

Bad answer:
-----------
You may patch a C-source file automagically from within a makefile
by writing a sequence of shell commands like :

cat <<-EndOfScript | ed - file.c
	edCommandNumber1
	edCommandNumber2
	edCommandNumber3
	w
EndOfScript

It works well.

Better answer:
--------------
The generated code does not use stdin and stdout directly but goes
through yyin and yyout, which are global variables (initialized to
stdin and stdout) visible from both the YACC and LEX source files.

Remember the YACC source file format: you may define routines in it.
So you may define a main program entry point which will fopen(3)
the wanted input and output files and assign the returned values to
the global variables yyin and yyout. Clean!
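
As a minimal sketch of that idea, the routine below opens the two files
and points yyin and yyout at them. The name redirect_io and its return
convention are made up for illustration. In a real build the two
variables come from the LEX generated file and you would declare them
extern; they are defined here only so the fragment compiles on its own.

```c
#include <stdio.h>

/* Normally defined by the LEX generated C file (declare them extern
 * there); defined here only to keep the sketch self-contained. */
FILE *yyin;
FILE *yyout;

/* Hypothetical helper: point the scanner at the named files.
 * A NULL name leaves the corresponding stream untouched.
 * Returns 0 on success, -1 if a file cannot be opened. */
int redirect_io(const char *in_name, const char *out_name)
{
    if (in_name != NULL) {
        FILE *in = fopen(in_name, "r");
        if (in == NULL)
            return -1;
        yyin = in;
    }
    if (out_name != NULL) {
        FILE *out = fopen(out_name, "w");
        if (out == NULL)
            return -1;
        yyout = out;
    }
    return 0;
}
```

You would call it from main() before yyparse(), for instance
redirect_io(argv[1], argv[2]), and give up if it returns -1.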

Note)
Look at the generated C source code. It won't take you a lot of time
and will give you a more precise feeling of how to do things (it's
sometimes hard to catch some points just by reading a book or guessing).

Book)
Try the big blue book : 'Compiler construction under UNIX' published by
Prentice Hall. I don't have the author's names now but if you are
interested I will mail the full reference.

>2.)  how can one get the automagically-defined #defines, which can
>normally be created from yacc with the -d flag, to come out when you
>use a makefile?  i.e. suppose i have lex.l and yacc.y lex and yacc
>source files, respectively, and i have object files defined in my makefile 
>called lex.o and yacc.o such that "make" follows default rules to create 
>these from the aforementioned source files.   

A2)
It is a make(1) problem. Look at the docs.
Remember the YFLAGS macro you see when you type:
	$ make -pn	# 'list macros' flag and 'do nothing' flag

It contains the flags that the implicit rule .y.o: will use when
updating your yacc.o file from the yacc.y file. So just enter the
following line in your makefile:
YFLAGS = -d
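
With -d in YFLAGS, yacc writes a header (y.tab.h on traditional
implementations) each time the .y.o rule runs, and the LEX source
includes it to learn the token codes. As a rough idea of what to
expect, for a grammar declaring a %union with a string and a value
member and the tokens WORD and NUMBER (like the example later in this
post), the header looks something like the sketch below. The layout and
the numbers are an assumption based on classic yacc; other
implementations number tokens differently, so never hard-code them.

```c
/* Approximate y.tab.h as produced by "yacc -d" (illustrative only). */

/* The %union becomes the type of yylval, shared by parser and scanner. */
typedef union {
    char *string;
    int   value;
} YYSTYPE;
extern YYSTYPE yylval;

/* One #define per %token; the values are assumed, not guaranteed. */
#define WORD   257
#define NUMBER 258
```

The only thing the scanner should rely on is the names: it sets a
member of yylval and returns WORD or NUMBER.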

>3.)  if i have a yacc construct such as:
> 
>line3	: 	A B C
>		{  yacc action sequence }
>
>
>which indicates that the construct line3 is composed of the 3 tokens
>A B and C, in that order ...
> 
>how can i now assign the values of A, B, and C into local vars of my
>choice?  the problem lies in the fact that each of A B and C represent
>three calls to lex, and if i pass back a pointer to yytext[] from lex, 
>i only retain the value of the last token in the sequence, in this case C, 
>when i get to the action sequence in my yacc code.  what if i want to 
>be able to select the EXACT ascii tokens for each of A B and C above in 
>my yacc code.  how do i do that?

A3)
You may assign a type to tokens returned by LEX.
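
The reason you only see the last token is that LEX reuses the yytext
storage for every token, so a pointer to it goes stale as soon as the
next token is read. The sketch below imitates that with an ordinary
static buffer and shows the usual cure, which is to give each token its
own copy. The names scan_buf, next_token_shared and next_token_copied
are made up for illustration and are not part of LEX.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-in for yytext: one buffer reused for every token. */
static char scan_buf[64];

/* Hypothetical scanner step: store the token text in the shared buffer
 * and return a pointer into it, as an action returning yytext would. */
const char *next_token_shared(const char *text)
{
    strncpy(scan_buf, text, sizeof scan_buf - 1);
    scan_buf[sizeof scan_buf - 1] = '\0';
    return scan_buf;
}

/* Same step, but hands back a private copy that the caller must free(). */
char *next_token_copied(const char *text)
{
    const char *tok = next_token_shared(text);
    char *copy = malloc(strlen(tok) + 1);

    if (copy != NULL)
        strcpy(copy, tok);
    return copy;
}
```

Two pointers obtained from next_token_shared() are the same address and
both show the latest text, while the copies keep their own contents
until they are freed.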

Rather than a lot of words, let me draw the picture of a simple parser
that splits an input file into blank-separated words, and that shows
you the redirection features of LEX at work.

The sample input file is tst.input, the obtained output is
tst.output as shown in the makefile command line corresponding to
the all: target.

Hope this helps you, and that it is self-explanatory.

---------------------------------- cut here ------------------------------------
============================== makefile ==============================
YFLAGS = -d
OBJECTS = y.o l.o
LDFLAGS = -ly -ll

all: tst tst.input
	./tst tst.input tst.output

tst: $(OBJECTS)
	cc -o $@ $(OBJECTS) $(LDFLAGS)
============================== y.y ==============================
%union {
	char *	string;
	int	value;
}
%token <string> WORD
%token <value> NUMBER

%%

input
	: /* nothing */
		{
			fprintf(yyout, "Nothing to read\n");
		}
	| tokens
		{
			fprintf(yyout, "End of file detected\n");
		}
	;

tokens
	: token
	| token tokens
	;

token
	: word
	| number
	| numberWord
	;

word	: WORD
		{
			fprintf(yyout, "word: %s\n", $1);
			free ($1);
		}
	;

number	: NUMBER
		{
			fprintf(yyout, "number: %d\n", $1);
		}
	;

numberWord
	: NUMBER WORD
		{
			fprintf(yyout, "received: <%d %s>\n", $1, $2);
			free ($2);
		}
	;

%%

#include <stdio.h>
#include <varargs.h>
extern FILE*	yyin;		/* imported from LEX */
extern FILE*	yyout;		/* imported from LEX */
static char*	progName;

	void
quit(format, va_alist)
	char *	format;
	va_dcl
{
	va_list argp;

	va_start(argp);
	fprintf(stderr, "%s: ", progName);
	vfprintf(stderr, format, argp);
	va_end(argp);
	exit (1);

} /* quit */

	int
main(argc, argv)
	int argc;
	char * argv[];
{
	progName = argv[0];

/* already done by LEX generated C-code:
 *	yyin = stdin;
 *	yyout = stdout;
 */

	switch (--argc) {

		/* 2nd argument is output file */
	case 2:
		yyout = fopen(argv[2], "w");
		if (! yyout)
			quit("Cannot access output file %s\n", argv[2]);
		/* fall through !!! */

		/* 1st argument is input file */
	case 1:
		yyin = fopen(argv[1], "r");
		if (! yyin)
			quit("Cannot access input file %s\n", argv[1]);
		break;

		/* no argument given: use standard path */
	case 0:
		break;

	default:
		quit("Usage is\n\t%s inputFile outputFile\n", argv[0]);
	}

		/* input and output are initialized */
	while (yyparse())
		;
	return 0;

} /* main */
============================== l.l ==============================
 /* the leading white space is IMPORTANT: LEX copies any such line,
 WITHOUT the white space, to the front of the resulting C file
 */

 # include <stdlib.h>
 # include <string.h>
 # include "y.tab.h"

space      [ \t\r\n\f]
digit      [0-9]
nonDigit   [^0-9 \t\r\n\f]
nonSpace   [^ \t\r\n\f]
number     {digit}{digit}*
word       {nonDigit}{nonSpace}*

%%

{number}   {
	int value = atoi (yytext);
	yylval.value = value;
	return NUMBER;
}

{word}     {
	yylval.string = malloc(strlen(yytext)+1);
	strcpy(yylval.string, yytext);
	return WORD;
}

{space}    ;
============================== tst.input ==============================
yacc lex
			4sale big2
	23423
					yard4feet
============================== tst.output ==============================
word: yacc
word: lex
received: <4 sale>
word: big2
received: <23423 yard4feet>
End of file detected
---------------------------------- cut here ------------------------------------

Gilles Courcoux                        E-mail: sun!pyramid!ctnews!pase60!gilles
Unisys Network Computing Group         Phone:  (408) 435-7692


