# context clash

## context clash

(grammar)When a parser cannot tell which alternative
production of a syntax applies by looking at the next
input token ("lexeme").

E.g. given syntax

C -> A | b c

A -> d | b e

If you're parsing non-terminal C and the next token is 'b', you don't know whether it's the first or second alternative of C since they both can start with b.

To discover whether a grammar has a context clash:

For each non-terminal, N, with multiple alternatives, look at the first symbol of each alternative's right-hand side, call it s. If s is the empty string, then find the set FOLLOWER(N) otherwise find the set FIRST*(s). If any of the sets for N's alternatives intersect then there will be a context clash when parsing N. If the next input symbol is one of those in the intersection of two sets then you won't know which of the alternatives applies.

FIRST(s) is the set of symbols with which s can start, including s itself. If s is a non-terminal then FIRST(s) also includes the first symbol of each alternative right-hand side of s. The '*' in FIRST*(s) means the "transitive closure" of FIRST which means keep applying FIRST to each element of the result until the result doesn't change. I.e. start with just the set R = s, then for each non-terminal x in R, add FIRST(x) to R. Keep doing this until nothing new is added. (We are really only interested in the terminals in FIRST*(s) but some definitions include the non-terminals).

FOLLOWER(N) is the set of symbols which can come after N in a sentence. Find each occurrence of N on the right-hand side of a rule, e.g.

M -> ... | ... N ... | ...

If there is a symbol s immediately following N then add FIRST*(s) to the result (again, we're only interested in the terminal symbols in FIRST*(s)) if there is no symbol after N in the alternative then add FOLLOWER(M) to the result (i.e. if N can be the last symbol in an M then anything that can follow M can also follow N).

If a grammar can generate the same sentence in multiple different ways (with different parse tress) then it is ambiguous. An ambiguity must start with a context clash (but not all context clashes imply ambiguity). The context clash occurs when trying to parse the first token of the phrase with multiple parses - you will not be able to tell which alternative to take. To see if a context clash is also a case of ambiguity you would need to follow the alternatives involved in each context clash to see if they can generate the same complete sequence of tokens.

E.g. given syntax

C -> A | b c

A -> d | b e

If you're parsing non-terminal C and the next token is 'b', you don't know whether it's the first or second alternative of C since they both can start with b.

To discover whether a grammar has a context clash:

For each non-terminal, N, with multiple alternatives, look at the first symbol of each alternative's right-hand side, call it s. If s is the empty string, then find the set FOLLOWER(N) otherwise find the set FIRST*(s). If any of the sets for N's alternatives intersect then there will be a context clash when parsing N. If the next input symbol is one of those in the intersection of two sets then you won't know which of the alternatives applies.

FIRST(s) is the set of symbols with which s can start, including s itself. If s is a non-terminal then FIRST(s) also includes the first symbol of each alternative right-hand side of s. The '*' in FIRST*(s) means the "transitive closure" of FIRST which means keep applying FIRST to each element of the result until the result doesn't change. I.e. start with just the set R = s, then for each non-terminal x in R, add FIRST(x) to R. Keep doing this until nothing new is added. (We are really only interested in the terminals in FIRST*(s) but some definitions include the non-terminals).

FOLLOWER(N) is the set of symbols which can come after N in a sentence. Find each occurrence of N on the right-hand side of a rule, e.g.

M -> ... | ... N ... | ...

If there is a symbol s immediately following N then add FIRST*(s) to the result (again, we're only interested in the terminal symbols in FIRST*(s)) if there is no symbol after N in the alternative then add FOLLOWER(M) to the result (i.e. if N can be the last symbol in an M then anything that can follow M can also follow N).

If a grammar can generate the same sentence in multiple different ways (with different parse tress) then it is ambiguous. An ambiguity must start with a context clash (but not all context clashes imply ambiguity). The context clash occurs when trying to parse the first token of the phrase with multiple parses - you will not be able to tell which alternative to take. To see if a context clash is also a case of ambiguity you would need to follow the alternatives involved in each context clash to see if they can generate the same complete sequence of tokens.

Want to thank TFD for its existence? Tell a friend about us, add a link to this page, or visit the webmaster's page for free fun content.

Link to this page: