How do you restrict assignments to semantically valid statements? In other words, how can I ignore syntactically valid, but semantically invalid sentences?

Ian Kaplan

For a complex language like C, semantic checking must be done either after the statement is recognized in the parser or in a later pass. Semantic checking usually involves building a symbol table. The semantic check verifies that assignment to an object of that type is allowed. For example, in the C code below a value is assigned to a typedef name. The assignment is syntactically legal but semantically illegal, since a type name is not an object that can hold a value. The fact that mystruct is a typedef name would be resolved from the symbol table, and a semantic error would be reported, perhaps in a semantic checking pass.

typedef struct {
  int x, y, z;
} mystruct;

   mystruct = 42;
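
As a rough illustration of the symbol table lookup described above, here is a minimal Python sketch. The names Symbol, check_assignment, and the two symbol kinds are hypothetical and exist only for this example; a real C front end would track far more (scopes, types, storage classes).

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical symbol kinds for this sketch.
VARIABLE = "variable"
TYPEDEF = "typedef"

@dataclass
class Symbol:
    name: str
    kind: str

def check_assignment(target: str, symbols: dict) -> Optional[str]:
    """Return an error message if `target` cannot be assigned to, else None."""
    sym = symbols.get(target)
    if sym is None:
        return f"'{target}' is not declared"
    if sym.kind != VARIABLE:
        return f"cannot assign to {sym.kind} name '{target}'"
    return None

# Symbol table as it might look after the declarations have been processed.
symbols = {
    "mystruct": Symbol("mystruct", TYPEDEF),
    "count": Symbol("count", VARIABLE),
}

print(check_assignment("count", symbols))     # None: the assignment is allowed
print(check_assignment("mystruct", symbols))  # error for: mystruct = 42;
```

The parser itself never needs to know what mystruct is. It only has to hand the assignment target to a check like this one, either right after the statement is recognized or in a later pass.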

Although C is simple enough that a parser could be created with embedded semantic checking, this is not practical for a language like Java, which requires at least one separate semantic pass.

Another part of this question touches on the limits of a parser generated from a grammar of a manageable size. The grammar may recognize assignment statements that are in fact syntactically illegal. By expanding the grammar, the parser could catch these errors. However, as the grammar gets larger it becomes more difficult to understand and maintain. If the grammar is expanded to catch all syntactically illegal statements, it will become unmanageable, at least for complicated languages like Java or C++.

For example, a Java grammar might accept valid syntax like

        x = y + z


        p += q

But it might also accept a statement like

         ((val) ? x : y) += z

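which is syntactically incorrect in Java, since the left-hand side of an assignment must be a variable, not a conditional expression. When the grammar accepts such statements, the errors must be caught in the semantic analysis pass.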
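
One way such a check might look in a later pass is sketched below in Python. The node classes (Name, FieldAccess, ArrayAccess, Conditional, Assign) and the function check_assign are hypothetical stand-ins for whatever tree the parser builds, and the sketch assumes that parentheses have already been dropped from the tree.

```python
from dataclasses import dataclass

# Hypothetical AST node classes for this sketch.
@dataclass
class Name:
    ident: str

@dataclass
class FieldAccess:
    obj: object
    field: str

@dataclass
class ArrayAccess:
    array: object
    index: object

@dataclass
class Conditional:
    test: object
    then: object
    otherwise: object

@dataclass
class Assign:
    op: str
    target: object
    value: object

def is_assignable(node) -> bool:
    """Only names, field accesses, and array accesses denote variables here."""
    return isinstance(node, (Name, FieldAccess, ArrayAccess))

def check_assign(node: Assign) -> list:
    """Return a list of error messages for one assignment node."""
    if is_assignable(node.target):
        return []
    kind = type(node.target).__name__
    return [f"left-hand side of '{node.op}' must be a variable, found {kind}"]

ok = Assign("+=", Name("p"), Name("q"))
bad = Assign("+=", Conditional(Name("val"), Name("x"), Name("y")), Name("z"))

print(check_assign(ok))   # no errors for: p += q
print(check_assign(bad))  # one error for: ((val) ? x : y) += z
```

Keeping this test in the tree walker lets the grammar use a single general expression rule on both sides of the assignment operator, which is what keeps the grammar small.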

[Semantic validation should be done in a separate phase for "real" languages. However, semantic predicates can often be used within a grammar to reject blatantly ridiculous semantic constructs, saving you some effort later. For small languages, you might even get away with doing all semantic validation with semantic predicates. -Terence]
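
To make the idea of a predicate-gated rule concrete, the following Python sketch imitates it by hand for a toy rule of the form ID = NUMBER ; and is not generated parser code. The function parse_assignment, the ASSIGNMENT pattern, and the set of variable names are all assumptions made for this example; a regular expression stands in for the syntactic part of the rule.

```python
import re
from typing import Optional

# Syntactic shape of the toy rule:  assignment : ID '=' NUMBER ';'
ASSIGNMENT = re.compile(r"\s*([A-Za-z_]\w*)\s*=\s*(\d+)\s*;\s*")

def parse_assignment(text: str, variables: set) -> Optional[str]:
    """Return None if the statement is accepted, else an error message."""
    match = ASSIGNMENT.fullmatch(text)
    if match is None:
        return "syntax error"
    target = match.group(1)
    # Stand-in for a semantic predicate: the rule is accepted only when the
    # symbol information says the target is an assignable variable.
    if target not in variables:
        return f"'{target}' is not an assignable variable"
    return None

variables = {"count"}  # names known to be variables (hypothetical)

print(parse_assignment("count = 42;", variables))     # accepted
print(parse_assignment("mystruct = 42;", variables))  # rejected by the predicate
print(parse_assignment("= 42;", variables))           # rejected by the syntax
```

For a small language this may be all the validation that is needed. For a larger one, the same lookup would more naturally live in a separate semantic pass.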