Posted By:
Don_McClean
Posted On:
Tuesday, January 29, 2002 01:44 PM
I am using ANTLR to parse a large, verbose data-structure language that is defined by an external standard. The language is line oriented. Many of its 400 unique keywords are very similar; for example, the following keywords all start with START_, so I have left-factored START_ into its own protected rule:
protected
START_ : "START_";
START_BARCODE : (START_ "BARCODE")! DATA_VALUE_NL ;
START_DATA_MANIPULATION : (START_ "DATA_MANIPULATION")! DATA_VALUE_NL ;
.
.
START_TITLE : (START_ "TITLE")! DATA_VALUE_NL ;
START_VENDOR_INFO : (START_ "VENDOR_INFO")! DATA_VALUE_NL ;
Even with this factoring and a lookahead of 5, I still have a problem: the code generated in 'nextToken' does not distinguish between rules such as the following two, even though I left-factored the common prefix:
else if ((LA(1)=='S') && (LA(2)=='T') && (LA(3)=='A') && (LA(4)=='R') && (LA(5)=='T')) {
    mSTART_DEFECT_DEFINITION(true);
    theRetToken=_returnToken;
}
else if ((LA(1)=='S') && (LA(2)=='T') && (LA(3)=='A') && (LA(4)=='R') && (LA(5)=='T')) {
    mSTART_DEFECT_MEASUREMENTS(true);
    theRetToken=_returnToken;
}
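To show why a lookahead of 5 (or even 7) cannot separate these two branches, here is a small sketch that measures how many leading characters the two rule names share. It assumes the keyword text matches the rule names as they appear in the generated method names; the class and method names are only illustrative.

```java
public class SharedPrefix {
    // Returns the length of the longest common prefix of two strings.
    public static int commonPrefixLength(String a, String b) {
        int n = Math.min(a.length(), b.length());
        int i = 0;
        while (i < n && a.charAt(i) == b.charAt(i)) {
            i++;
        }
        return i;
    }

    public static void main(String[] args) {
        // Keyword spellings assumed from the generated method names above.
        int shared = commonPrefixLength("START_DEFECT_DEFINITION",
                                        "START_DEFECT_MEASUREMENTS");
        System.out.println("shared prefix length: " + shared);
        // The first differing character is one position past the shared prefix.
        System.out.println("first differing position: " + (shared + 1));
    }
}
```

If that assumption holds, the two keywords only diverge after the shared "START_DEFECT_" prefix, so a character-by-character test in 'nextToken' would presumably need a much deeper lookahead than 5 or 7 to tell them apart.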
If I set the lookahead to 7, the Java compiler or JRE apparently cannot handle the generated code, because I get this error at run time:
java.lang.VerifyError: (class: com/ti/rts/parser/SemiFileLexer, method: nextToken signature: ()Lantlr/Token;) Illegal instruction found at offset 409
at com.ti.rts.parser.RunParser.main(RunParser.java:23)
Has anyone seen anything like this, or does anyone have suggestions?
Should I perhaps write my own lexer? Since this is a line-oriented language and it is simple to recognize the keyword at the beginning of each line, I could pass the token to the parser myself.
I would appreciate any suggestions.
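To make the hand-written lexer idea concrete, here is a minimal sketch of what I have in mind. It is not ANTLR code: the class name, the token type numbers, and the assumption that whitespace separates the keyword from the rest of the line are all hypothetical. The real token types would have to match the ones the parser expects.

```java
import java.util.HashMap;
import java.util.Map;

public class LineKeywordLexer {
    // Hypothetical token type numbers; real values would come from the parser.
    public static final int UNKNOWN = 0;
    public static final int START_BARCODE = 1;
    public static final int START_TITLE = 2;

    // One scanned line: the token type, the keyword text, and the rest of the line.
    public static final class LineToken {
        public final int type;
        public final String keyword;
        public final String value;

        LineToken(int type, String keyword, String value) {
            this.type = type;
            this.keyword = keyword;
            this.value = value;
        }
    }

    private final Map<String, Integer> keywords = new HashMap<>();

    public LineKeywordLexer() {
        // Only two of the ~400 keywords are registered here, for illustration.
        keywords.put("START_BARCODE", START_BARCODE);
        keywords.put("START_TITLE", START_TITLE);
    }

    // Splits a line at the first whitespace (an assumption about the format)
    // and looks the leading word up in the keyword table.
    public LineToken scan(String line) {
        String trimmed = line.trim();
        int end = 0;
        while (end < trimmed.length() && !Character.isWhitespace(trimmed.charAt(end))) {
            end++;
        }
        String word = trimmed.substring(0, end);
        String rest = trimmed.substring(end).trim();
        Integer type = keywords.get(word);
        return new LineToken(type == null ? UNKNOWN : type, word, rest);
    }
}
```

A table lookup like this does one hash probe per line regardless of how many keywords share a prefix, so it would avoid the long chain of character comparisons in the generated 'nextToken'.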