# Class 14: Parsing, concluded

Held Wednesday, September 30, 1998

Notes

## LR parsing, concluded

### Construction of LR(0) automata, revisited

• Recall that we build LR(0) automata using the following guidelines
• A state in an LR(0) automaton is a collection of LR(0) items
• An LR(0) item is a production augmented by a "position marker" that indicates where in the production we might be.
• The grammar is augmented with a new start symbol, S', and a new production,
```
S' ::= S $
```
• The initial state is the closure of `S' ::= . S $`
• The closure of a state fills in the other possible items. For example, if we're waiting to see an N (indicated by a position marker before N in some item), then we need to include all the items for "how to parse an N, given that you've seen nothing of the right-hand side".
• The edge labels of the automaton are given by the nonterminals and terminals of the grammar.
• The edges of the automaton correspond to seeing one more symbol in the right-hand sides. After moving the position marker appropriately, one computes the closure of the new items.
• The final states are the states in which the position marker has reached the end of the right-hand side.
• On reaching a final state (in some contexts), one does the reduction given by the final state: return to the state that was in effect when we started matching the right-hand side, then restart the automaton from that state by following the edge labeled with the left-hand side.
• A stack keeps track of the state history.
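The closure and edge rules above can be sketched in a few lines of Python. This is only an illustration, not the construction from the book: the toy grammar and the names `Item`, `GRAMMAR`, `closure`, and `goto` are assumptions made up for this example.

```python
from typing import NamedTuple

class Item(NamedTuple):
    """An LR(0) item: a production plus a position marker (dot)."""
    lhs: str
    rhs: tuple
    dot: int

# Hypothetical toy grammar (not from the notes):
#   S' ::= S $      S ::= ( S )      S ::= x
GRAMMAR = {
    "S'": [("S", "$")],
    "S": [("(", "S", ")"), ("x",)],
}

def closure(items):
    """Add, for every nonterminal right after a dot, all of its
    productions with the dot at the far left."""
    state = set(items)
    work = list(items)
    while work:
        item = work.pop()
        if item.dot < len(item.rhs):
            sym = item.rhs[item.dot]
            for rhs in GRAMMAR.get(sym, []):  # terminals have no productions
                new = Item(sym, rhs, 0)
                if new not in state:
                    state.add(new)
                    work.append(new)
    return frozenset(state)

def goto(state, symbol):
    """Follow the edge labeled `symbol`: move the dot past it in every
    item that is waiting for it, then take the closure."""
    moved = [Item(i.lhs, i.rhs, i.dot + 1)
             for i in state
             if i.dot < len(i.rhs) and i.rhs[i.dot] == symbol]
    return closure(moved)

# The initial state is the closure of S' ::= . S $
start = closure([Item("S'", ("S", "$"), 0)])
```

Here `goto(start, "x")` yields the single-item state `S ::= x .`, which is a final state in the sense used above.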

### Conflicts in LR automata

• At times, there are conflicts in LR automata. What kinds of conflicts?
• A state may include a "final" item (one in which the position marker is at the end) and a nonfinal item.
• This is called a shift-reduce conflict.
• A state may include two different "final" items.
• This is called a reduce-reduce conflict.
• Can we have reduce-reduce conflicts in unambiguous grammars?
• Can we have a shift-shift conflict?
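As a rough illustration of the two definitions above, the following sketch classifies the conflicts in a single LR(0) state. The representation of items as `(lhs, rhs, dot)` triples and the function name `conflicts` are assumptions made for this example.

```python
def conflicts(state):
    """Classify LR(0) conflicts in a state.

    `state` is an iterable of (lhs, rhs, dot) triples, where rhs is a
    tuple of symbols and dot is the position marker.
    """
    items = set(state)
    # A "final" item has its position marker at the end of the right-hand side.
    final = [it for it in items if it[2] == len(it[1])]
    nonfinal = [it for it in items if it[2] < len(it[1])]
    found = set()
    if final and nonfinal:
        found.add("shift-reduce")
    if len(final) > 1:
        found.add("reduce-reduce")
    return found

# A state from a hypothetical expression grammar:
#   E ::= T .        (final)
#   T ::= T . * F    (nonfinal)
example = [("E", ("T",), 1), ("T", ("T", "*", "F"), 1)]
print(conflicts(example))
```

The check follows the definitions in these notes literally; a full table generator would instead report conflicts per table entry, that is, per state and token.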

### SLR Automata

• You may have noted (e.g., from our example yesterday) that LR(0) automata can be overly aggressive in choosing to reduce.
• Such automata typically reduce whenever we've reached the end of a right-hand side.
• SLR automata only reduce when the next token is in `Follow` of the left-hand side of the reduction.
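A minimal sketch of the SLR rule, assuming the `Follow` sets are already known. The state, the `FOLLOW` table (written out by hand for the usual expression grammar with `E ::= E + T | T` and `T ::= T * F | F`), and the function name `slr_actions` are all assumptions for illustration.

```python
# Hand-written Follow sets for the assumed expression grammar.
FOLLOW = {
    "E": {"+", ")", "$"},
    "T": {"+", "*", ")", "$"},
}

def slr_actions(state, next_token, follow):
    """Return the set of actions an SLR automaton allows in `state`
    when the next input token is `next_token`.

    `state` is an iterable of (lhs, rhs, dot) items.
    """
    actions = set()
    for lhs, rhs, dot in state:
        if dot == len(rhs):
            # Final item: reduce only if the token can follow the left-hand side.
            if next_token in follow[lhs]:
                actions.add(("reduce", lhs, rhs))
        elif rhs[dot] == next_token:
            # Nonfinal item waiting for exactly this token: shift.
            actions.add(("shift", next_token))
    return actions

# The state { E ::= T . ,  T ::= T . * F } has an LR(0) shift-reduce conflict.
STATE = [("E", ("T",), 1), ("T", ("T", "*", "F"), 1)]
```

With these tables, `slr_actions(STATE, "*", FOLLOW)` contains only the shift, because `*` is not in `Follow(E)`, while `slr_actions(STATE, "+", FOLLOW)` contains only the reduction by `E ::= T`. An LR(0) automaton would reduce in both cases.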

### LR(1) Automata

• LR(1) automata require a more complicated construction process, one that involves lookahead.
• In effect, instead of using the Follow table (as SLR automata do), LR(1) automata build more specific follow tables that correspond to the possible follow symbols according to a particular context.
• Each LR(1) item contains not just an augmented production, but also a token that can follow the nonterminal when we've reached the current state.
• The tokens are inserted by the closure routine.
• If we have `N ::= alpha . M beta`, then when we insert the M items, we indicate that each of them can be followed by the tokens in `First(beta)`.
• If beta is nullable, then the M items can also be followed by whatever can follow N in the given LR(1) item (that is, by that item's own lookahead token).
• See the book and the errata sheets for more details.
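The lookahead rule for the closure routine can be sketched as follows. This is an illustration under stated assumptions, not the algorithm as given in the book: the toy grammar, the hand-computed `FIRST` and `NULLABLE` tables, the placeholder lookahead `"?"` on the initial item, and all function names are made up for this example.

```python
# Hypothetical toy grammar; B has an empty production, so B is nullable.
#   S' ::= S $      S ::= A B      A ::= a      B ::= b | (empty)
GRAMMAR = {
    "S'": [("S", "$")],
    "S": [("A", "B")],
    "A": [("a",)],
    "B": [("b",), ()],
}
# Hand-computed for this grammar (assumed, not derived by the code).
FIRST = {"S'": {"a"}, "S": {"a"}, "A": {"a"}, "B": {"b"}}
NULLABLE = {"B"}

def first_of_sequence(symbols, lookahead):
    """Tokens that can begin `symbols`; if all of `symbols` is nullable,
    the item's own lookahead can follow as well."""
    result = set()
    for sym in symbols:
        if sym in GRAMMAR:            # nonterminal
            result |= FIRST[sym]
            if sym not in NULLABLE:
                return result
        else:                         # terminal
            result.add(sym)
            return result
    result.add(lookahead)
    return result

def closure(items):
    """LR(1) closure over items of the form (lhs, rhs, dot, lookahead)."""
    state = set(items)
    work = list(items)
    while work:
        lhs, rhs, dot, la = work.pop()
        if dot < len(rhs) and rhs[dot] in GRAMMAR:
            m, beta = rhs[dot], rhs[dot + 1:]
            for tok in first_of_sequence(beta, la):
                for prod in GRAMMAR[m]:
                    new = (m, prod, 0, tok)
                    if new not in state:
                        state.add(new)
                        work.append(new)
    return frozenset(state)

# Initial state: closure of [S' ::= . S $, ?]; the "?" is never consulted
# because the $ in the production always supplies the lookahead.
start = closure([("S'", ("S", "$"), 0, "?")])
```

In `start`, the item `A ::= . a` appears twice: once with lookahead `b` (from `First(B)`) and once with lookahead `$` (because `B` is nullable, so whatever follows `S` in its item can also follow `A`).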

