
Using literal character tokens when designing lexers and parsers.


Thursday January 01, 1970

May 11, 2014

Sometimes, while exploring the source code of various free software Flex lexers and Bison parsers, I see named token declarations for single-character tokens. Here are some of the characters that I will use later for demonstration: "+", "-", "*", "/", "=", "|", "(", ")". Software architects usually declare named tokens for these characters.

I believe there is no need to declare literal character tokens unless we need to declare the type of their semantic values.
Rather than giving every token a name, it is possible to use a single quoted character as a token, with the ASCII value of the character being the token number (Bison starts the numbers for named tokens at 258, so there is no risk of collisions).
By convention, a literal character token represents an input token consisting of that same character; for example, the token '+' represents the input token +. In practice they are used only for punctuation and operators.
A common idiom is to handle all single-character operators with one lexer rule that returns yytext[0], the character itself, as the token.
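To make the numbering concrete, here is a minimal sketch in plain C of a hand-written scanner that follows the same idea. It is not Flex output: the function `next_token` and the token values assigned to `NUMBER` and `NAME` are assumptions made for illustration (a real Bison-generated header assigns the values itself). Named tokens get numbers from 258 upward, and every other character is returned as its own character code.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical named tokens. Bison numbers named tokens from 258 upward,
   so they cannot collide with character codes in the range 0..255. */
enum { NUMBER = 258, NAME = 259 };

/* Hypothetical stand-in for a generated yylex(): reads one token from
   *input, advances *input past it, and returns the token number.
   Returns 0 at end of input, as a Bison parser expects. */
static int next_token(const char **input)
{
    const char *p;

    assert(input != NULL && *input != NULL);
    p = *input;

    while (*p == ' ' || *p == '\t')
        p++;                              /* skip blanks */

    if (*p == '\0') {
        *input = p;
        return 0;                         /* end of input */
    }
    if (isdigit((unsigned char)*p)) {
        while (isdigit((unsigned char)*p))
            p++;
        *input = p;
        return NUMBER;                    /* named token */
    }
    if (isalpha((unsigned char)*p)) {
        while (isalnum((unsigned char)*p))
            p++;
        *input = p;
        return NAME;                      /* named token */
    }

    /* Any other single character: the character itself is the token,
       which mirrors returning yytext[0] from a Flex rule. */
    *input = p + 1;
    return (unsigned char)*p;
}
```

Scanning the string `x = (1 + 2) * 3` with repeated calls to `next_token` yields `NAME`, `'='`, `'('`, `NUMBER`, `'+'`, and so on, so a caller can compare a returned token directly against a character constant.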



Here is a code snippet of a simple Flex lexer that uses this common idiom:

    %%
    ... more lexer rules ...
    "+" |
    "-" |
    "*" |
    "/" |
    "=" |
    "|" |
    "(" |
    ")"     { return yytext[0]; }
    ... more lexer rules ...
    %%

A Bison parser can likewise use the literal character tokens directly, as single quoted characters, in its BNF grammar rules.
Here is a small code snippet with a grammar rule that describes an expression in a programming language:

    %%
    ... more grammar rules ...
    exp : exp '+' exp           { $$ = new_ast_node ('+', $1, $3); }
        | exp '-' exp           { $$ = new_ast_node ('-', $1, $3); }
        | exp '*' exp           { $$ = new_ast_node ('*', $1, $3); }
        | exp '/' exp           { $$ = new_ast_node ('/', $1, $3); }
        | '|' exp               { $$ = new_ast_node ('|', $2, NULL); }
        | '(' exp ')'           { $$ = $2; }
        | '-' exp %prec UMINUS  { $$ = new_ast_node ('M', $2, NULL); }
        | NUMBER                { $$ = new_ast_number_node ($1); }
        | NAME                  { $$ = new_ast_symbol_reference_node ($1); }
        | NAME '=' exp          { $$ = new_ast_assignment_node ($1, $3); }
        | NAME '(' ')'          { $$ = new_ast_function_node ($1, NULL); }
        | NAME '(' exp_list ')' { $$ = new_ast_function_node ($1, $3); }
        ;
    ... more grammar rules ...
    %%
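The grammar actions pass the literal character straight through as the node type, which suggests that the same character can also drive the code that walks the tree. The sketch below is one possible shape for that, not the definitions behind the snippet above: the `struct ast` layout, the `'K'` tag for constants, the `double` value type, the reading of `'|'` as absolute value, and the helpers `eval_ast` and `free_ast` are all assumptions made for illustration. Only the names `new_ast_node` and `new_ast_number_node` come from the grammar.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node layout; the real definitions are not shown in the post. */
struct ast {
    int type;             /* '+', '-', '*', '/', '|', 'M' (unary minus),
                             or 'K' (constant; an assumed tag) */
    double value;         /* meaningful only when type == 'K' */
    struct ast *left;
    struct ast *right;    /* NULL for unary nodes and constants */
};

static struct ast *new_ast_node(int type, struct ast *left, struct ast *right)
{
    struct ast *n = malloc(sizeof *n);
    assert(n != NULL);    /* a sketch: real code should report the failure */
    n->type = type;
    n->value = 0.0;
    n->left = left;
    n->right = right;
    return n;
}

static struct ast *new_ast_number_node(double value)
{
    struct ast *n = new_ast_node('K', NULL, NULL);
    n->value = value;
    return n;
}

/* Walk the tree, switching on the same characters the grammar used. */
static double eval_ast(const struct ast *n)
{
    double v;

    assert(n != NULL);
    switch (n->type) {
    case 'K': return n->value;
    case '+': return eval_ast(n->left) + eval_ast(n->right);
    case '-': return eval_ast(n->left) - eval_ast(n->right);
    case '*': return eval_ast(n->left) * eval_ast(n->right);
    case '/': return eval_ast(n->left) / eval_ast(n->right);
    case '|': v = eval_ast(n->left); return v < 0 ? -v : v;  /* assumed: absolute value */
    case 'M': return -eval_ast(n->left);
    default:  abort();    /* unknown node type */
    }
}

static void free_ast(struct ast *n)
{
    if (n == NULL)
        return;
    free_ast(n->left);
    free_ast(n->right);
    free(n);
}
```

With these definitions, the tree built for `(1 + 2) * -3` is `new_ast_node('*', new_ast_node('+', new_ast_number_node(1), new_ast_number_node(2)), new_ast_node('M', new_ast_number_node(3), NULL))`, and the `switch` in `eval_ast` needs no separate enumeration of operator kinds.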