Bug in lexer generation for 2.7.2

Posted By:   Anonymous
Posted On:   Monday, June 28, 2004 11:41 PM


Consider the following simplified lex rules:

			
ID : ('a'..'z'|'A'..'Z') ('a'..'z'|'A'..'Z'|'0'..'9'|'_')* ;
protected SCHEME : ( "http" | "file" | "pl" ) ;
URL : SCHEME ':' (~('\n'))* ;


Clearly there is lexical non-determinism between ID and URL unless k > 4, but the generated lexer code will also fail to match valid IDs such as 'hile', 'htle', 'htte', ... for two reasons:

  • the generated if guarding the mURL() call only tests as many characters as there are in the shortest SCHEME string, and
  • the code tests to see if the ith character matches any of the ith characters of each of the SCHEME strings


What does this mean? It means that the test doesn't look for the ':', which is the real disambiguator, and that it tries to call mURL() on strings that mix, say, the ith letter of "http" with the jth letter of "file" (i != j).
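To make the two points concrete, here is a rough Java model of the guard as I've described it. This is not the code ANTLR 2.7.2 actually emits; the class GuardSketch and the methods faultyGuard and strictGuard are made-up names for illustration, and the only assumption is the behaviour described in the two bullets above.

```java
public class GuardSketch {
    // The SCHEME alternatives from the grammar above.
    static final String[] SCHEMES = {"http", "file", "pl"};

    // Hypothetical model of the described guard: it looks only as deep as the
    // shortest alternative, and at each position accepts a character that
    // appears at that position in ANY alternative.
    static boolean faultyGuard(String input) {
        int depth = Integer.MAX_VALUE;
        for (String s : SCHEMES) {
            depth = Math.min(depth, s.length());
        }
        if (input.length() < depth) {
            return false;
        }
        for (int i = 0; i < depth; i++) {
            boolean any = false;
            for (String s : SCHEMES) {
                if (s.charAt(i) == input.charAt(i)) {
                    any = true;
                }
            }
            if (!any) {
                return false;
            }
        }
        return true;
    }

    // Hypothetical stricter prediction: one whole SCHEME followed by ':'.
    static boolean strictGuard(String input) {
        for (String s : SCHEMES) {
            if (input.startsWith(s + ":")) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        String[] samples = {"hile", "htle", "htte", "hello", "http://x"};
        for (String s : samples) {
            System.out.println(s + " faulty=" + faultyGuard(s)
                    + " strict=" + strictGuard(s));
        }
    }
}
```

With this model, 'hile', 'htle' and 'htte' all pass faultyGuard (so mURL() would be tried and fail on them), while strictGuard rejects them and accepts only input that starts with a full scheme and a ':'.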

michael

PS Is there somewhere to formally report bugs?

PPS I couldn't find any detailed list of bugfixes that have been made for versions later than 2.7.2 so I've no idea if this has already been fixed.