What is a "protected" lexer rule?
What token definitions result in token objects that get sent to the parser? The answer you'd expect or the one you're used to is, "You get a Token object for every lexical rule in your lexer grammar." This is indeed the default case for ANTLR's lexer grammars.
What if you want to break up the definition of a complicated rule into multiple rules? Surely you don't want every rule to result in a complete Token object in this case. Some rules are only around to help other rules construct tokens. To distinguish these "helper" rules from rules that result in tokens, use the protected modifier. This overloading of the access-visibility Java term occurs because if the rule is not visible, it cannot be "seen" by the parser.
I now recognize this approach as a mistake. I have made a number of other proposals to fix it, but none seems to satisfy everyone.
For example, in the following lexer INT is an ordinary token rule and DIGIT is a protected helper rule:

class L extends Lexer;

/** This rule is "visible" to the parser
 *  and a Token object is sent to the
 *  parser when an INT is matched.
 */
INT : (DIGIT)+ ;

/** This rule does not result in a token
 *  object that is passed to the parser.
 *  It merely recognizes a portion of INT.
 */
protected
DIGIT : '0'..'9' ;
By definition, all lexical rules return Token objects (ANTLR optimizes away many of these object creations, however), but only the Token objects of non-protected rules get pulled out of the lexer itself.
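
To make the distinction concrete, here is a minimal hand-written sketch in Java of the same idea. It is not the code ANTLR generates, and every name in it (ProtectedRuleSketch, digit, intRule, tokenize) is hypothetical. The point it tries to show is only that the token-producing entry point dispatches to the non-protected rule, while the helper merely consumes characters on its behalf.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; hypothetical names, not ANTLR's generated lexer. */
class ProtectedRuleSketch {
    private final String input;
    private int pos = 0;

    ProtectedRuleSketch(String input) {
        this.input = input;
    }

    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    /** Plays the role of the protected DIGIT rule: consumes one digit, emits no token. */
    private void digit() {
        if (!isDigit(input.charAt(pos))) {
            throw new IllegalStateException("expected a digit at offset " + pos);
        }
        pos++;
    }

    /** Plays the role of the non-protected INT rule: (DIGIT)+ */
    private String intRule() {
        int start = pos;
        digit(); // the "+" requires at least one digit
        while (pos < input.length() && isDigit(input.charAt(pos))) {
            digit();
        }
        return "INT(" + input.substring(start, pos) + ")";
    }

    /** Only non-protected rules are reachable from here. Returns null at end of input. */
    String nextToken() {
        while (pos < input.length() && input.charAt(pos) == ' ') {
            pos++; // skip blanks between tokens (an assumption of this sketch)
        }
        if (pos >= input.length()) {
            return null;
        }
        return intRule();
    }

    /** Collects every token the "parser" would see for the given input. */
    static List<String> tokenize(String s) {
        ProtectedRuleSketch lexer = new ProtectedRuleSketch(s);
        List<String> out = new ArrayList<>();
        for (String t = lexer.nextToken(); t != null; t = lexer.nextToken()) {
            out.add(t);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("12 345")); // prints [INT(12), INT(345)]
    }
}
```

Note that no DIGIT entry ever appears in the result: the digits are matched by the helper, but only the enclosing INT surfaces as a token.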