How can I handle characters with context-sensitive meanings, such as when a single quote is both a postfix operator (a complete token) and the string delimiter (part of a token)?

Monty Zukowski

This sounds like a very difficult problem. The first thing to do is figure out how you would do it by hand, because ANTLR strives to automate what you would do by hand. From the information you give, these seem to be the rules:

If a single quote is preceded by a space, then it must be starting a string,

Else it is a postfix operator.
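Done by hand, those two rules might look something like the sketch below. It is a minimal, hypothetical tokenizer in plain Java and does not use ANTLR at all; the class name, the "TYPE:text" token format, the ID and OTHER token kinds, and the choice to treat the start of input as if it followed whitespace are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical hand-written tokenizer; none of these names come from ANTLR.
final class QuoteLexerSketch {
    /** Returns tokens rendered as "TYPE:text"; whitespace itself is skipped. */
    static List<String> tokenize(String in) {
        List<String> out = new ArrayList<>();
        // Assumption: the start of input counts as "preceded by whitespace".
        boolean precedingWasWS = true;
        int i = 0;
        while (i < in.length()) {
            char c = in.charAt(i);
            if (Character.isWhitespace(c)) {
                precedingWasWS = true;   // remember it for the next token
                i++;
                continue;
            }
            if (c == '\'') {
                if (precedingWasWS) {
                    // Quote after whitespace: it opens a string.
                    int end = in.indexOf('\'', i + 1);
                    if (end < 0) {
                        throw new IllegalArgumentException("unterminated string at " + i);
                    }
                    out.add("STRING:" + in.substring(i, end + 1));
                    i = end + 1;
                } else {
                    // Quote directly after another token: postfix operator.
                    out.add("POSTFIX:'");
                    i++;
                }
            } else if (Character.isLetterOrDigit(c)) {
                int start = i;
                while (i < in.length() && Character.isLetterOrDigit(in.charAt(i))) {
                    i++;
                }
                out.add("ID:" + in.substring(start, i));
            } else {
                out.add("OTHER:" + c);
                i++;
            }
            precedingWasWS = false;      // a non-whitespace token was just emitted
        }
        return out;
    }
}
```

With this sketch, the input `a' 'b c'` comes out as `ID:a`, `POSTFIX:'`, `STRING:'b c'`: the first quote follows `a` directly and is taken as the operator, while the second follows a space and starts a string.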

I would solve this by keeping a flag stating whether the last token matched was whitespace (including newlines). Then my lexer rule for single quote would check the flag, via a semantic predicate, to decide whether the quote is a postfix operator or the start of a string:

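SINGLE_QUOTE:
    // not preceded by whitespace: a lone quote is the postfix operator
    {!preceedingWasWS()}? '\''
    // otherwise the quote opens a string; keep the called rule's token type
    | s:SINGLE_QUOTE_STRING {$setType(s.getType());}
    ;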

Note that ANTLR always sets the type of the token at the very end of the rule unless you set it yourself, so without the action the token would come out as a SINGLE_QUOTE. To preserve the token type of SINGLE_QUOTE_STRING you must set the type explicitly. Best practice is to set it to the type returned by the called rule, in case that rule can return more than one type.
