 What does LL(k) means?

John Zukowski

The first letter, L for left to right or R for right to left, denotes the order in which to scan the source text.

The second letter denotes the derivation of the constructed parse tree. At each step in a left (L) derivation of a parse tree, the leftmost non-terminal in the tree is replaced by a terminal. In a right (R) derivation, the rightmost non-terminal is replaced. Top-down parses that scan input from left to right producing a left derivation. Bottom up parses that scan the symbols left to right produces a right derivation.

The k denotes the number of lookahead symbols needed by the parser. With k = 1, only the next symbol is needed.

Terence adds: those are the correct technical definitions. In practice LL(k) means you are building a recursive-descent parser. Recursive-descent parsers have a method for each rule in the grammar and proceed are predictive in that each rule has to predict which alternative will match. The parse proceeds from a start symbol and routes down into the various other rules depending on the lookahead. LL(k) implies up to k symbols of lookahead are visible to the parser for some fixed integer k. [Note that syntactic predicates allow you to effectively modulate k from one symbol to the entire rest of the input.]

Check out a simple LL(1) rule:

a : A
| B
;

Here, rule 'a' (or method a()) must guess which alternative will be succesful. Using the lookahead (which will be either A or B), it decides.

a() {
if ( lookahead==A ) { match(A); }
else { match(B); }
}

LL(1) parsers may be generalized to LL(k) for k>1. For example, the following grammar is nondeterministic upon token A:

a -> A B C
a -> A D E

where A, B, C, D, and E are some vocabulary tokens. Because both productions have a common prefix of A, an LL(1) parser could not determine which production was going to successfully match. However, if the parser could see ahead to both the A and what followed A on the input stream, the parser could determine which production was going to match. An LL(2) parser is such a creature; hence, rule a is unambiguous in the LL(2) sense. A grammar for which a deterministic LL(k) parser can be built is LL(k). A language for which an LL(k) grammar exists is LL(k).