Why do syntactic predicates sometimes lead to bizarre or distant error messages?

Sinan Karasu

To illustrate the problem, let's take a subset of Fortran. Say the lexer returns one character at a time and discards the whitespace.

    :  D O (DIGIT)+ id EQ num COMMA num
    |  id EQ expr
    |  C O N T I N U E
This requires a lot of lookahead, so I put in a couple of syntactic predicates.
    :  (D O (DIGIT)+ id EQ num COMMA num)=> ...
    |  (id EQ expr)=> ...
    |  C O N T I N U E
And now we throw the following statement at it:
    d o 50 i= j.k        (notice that it is a period, not a comma)
and get
syntax error : CONTINUE expected.
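That is because both predicates fail, so the parser runs out of guarded alternatives and falls through to the last one, which has no predicate; the error is reported from there. Putting a syntactic predicate on the last alternative makes no sense, since there is nothing further to try after "C O N T I N U E".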
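
The fall-through can be sketched with a toy model in Python. This is not ANTLR-generated code, just an illustration of the mechanism. It assumes the input has already been grouped into token kinds (DO, LABEL, ID and so on are made-up names) instead of single characters, and the names match, predicate and statement are hypothetical.

```python
# Toy model of predicate-guarded alternatives; not ANTLR-generated code.
# Token kinds and function names are assumptions made for illustration.

class SyntaxErr(Exception):
    pass

def match(tokens, pattern):
    """Match the expected token kinds in order, or raise at the first mismatch."""
    for i, expected in enumerate(pattern):
        found = tokens[i] if i < len(tokens) else "EOF"
        if found != expected:
            raise SyntaxErr(f"syntax error: {expected} expected, found {found}")

def predicate(tokens, pattern):
    """Syntactic predicate: try the match speculatively and swallow the error."""
    try:
        match(tokens, pattern)
        return True
    except SyntaxErr:
        return False

DO_STMT = ["DO", "LABEL", "ID", "EQ", "NUM", "COMMA", "NUM"]
ASSIGN = ["ID", "EQ", "NUM"]
CONTINUE = ["CONTINUE"]

def statement(tokens):
    if predicate(tokens, DO_STMT):
        match(tokens, DO_STMT)
        return "do"
    if predicate(tokens, ASSIGN):
        match(tokens, ASSIGN)
        return "assign"
    # Last alternative: no predicate, so any error surfaces from here.
    match(tokens, CONTINUE)
    return "continue"

def error_for(tokens):
    """Return the error message for a statement, or None if it parses."""
    try:
        statement(tokens)
        return None
    except SyntaxErr as err:
        return str(err)

# A DO statement with a period where the comma should be.
bad = ["DO", "LABEL", "ID", "EQ", "NUM", "PERIOD", "NUM"]
print(error_for(bad))  # syntax error: CONTINUE expected, found DO
```

The mismatch inside the first predicate (PERIOD where COMMA was expected) is swallowed by the speculative match, so the only error the user ever sees is the one raised by the unguarded last alternative.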

One way to get around this is to micromanage and use predicates in the subclauses. But then the whole grammar gets messy.

Another method is to add a last alternative that will never be satisfied (for example, one that matches a nonexistent token such as NoSuchToken) and report "unclassifiable statement" there. However, it is debatable whether this really is an improvement.
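
As a rough illustration of the catch-all idea, here is a toy Python model (again, not ANTLR code). It assumes pre-grouped token kinds with made-up names, and classify and report are hypothetical helpers. Every real alternative is guarded, and the fall-through stands in for the alternative that can never match.

```python
# Toy model only: token kinds and names are illustrative, not ANTLR output.

class SyntaxErr(Exception):
    pass

ALTERNATIVES = [
    ("do", ["DO", "LABEL", "ID", "EQ", "NUM", "COMMA", "NUM"]),
    ("assign", ["ID", "EQ", "NUM"]),
    ("continue", ["CONTINUE"]),
]

def classify(tokens):
    # Every real alternative is guarded now, CONTINUE included.
    for name, pattern in ALTERNATIVES:
        if tokens[:len(pattern)] == pattern:
            return name
    # Stand-in for the final alternative that can never be satisfied.
    raise SyntaxErr("unclassifiable statement")

def report(tokens):
    """Return the statement kind, or the error message if none applies."""
    try:
        return classify(tokens)
    except SyntaxErr as err:
        return str(err)

print(report(["CONTINUE"]))  # continue
print(report(["DO", "LABEL", "ID", "EQ", "NUM", "PERIOD", "NUM"]))  # unclassifiable statement
```

The message no longer points at CONTINUE, but it also says nothing about the period that caused the failure, which is why the improvement is debatable.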

TJP adds: Also, if you happen to be using lexical input filtering (i.e., you have a rule that gobbles up bogus lexical structures), then an effective way to commit to a particular alternative is to use the setCommitToPath() method.