# Class 15: Shift-Reduce Parsing

Back to Predictive Parsing, Continued. On to Shift-Reduce Parsing, Continued.

Held Friday, February 23, 2001

Summary

Today we consider an alternate form of parsing, shift-reduce parsing. Shift-reduce parsing can be applied to many more grammars than can predicitve parsing.

Notes

• A number of you have asked for extensions on homework 1. It's okay with me if it's okay with Rachel.
• I'm sure I'll hear from Rachel if I just put too much pressure on her.
• The yacc lab is due next Thursday.

Overview

• Recursive-descent parsing, conclude
• Building parsers from tables
• Problems with prediciate parsers
• Alternatives?
• Shift-reduce parsers
• An example
• LR(0) parsers: basic shift-reduce parsers

## Building a recursive descent parser, revisited

• We can now use these tables to build recursive descent parsers for selected languages.
```for each nonterminal N, build the method parseN as follows
for each production, N ::= Stuff
for each symbol in first(Stuff)
add a case for that symbol, using the steps to parse Stuff
end for
if nullable(Stuff) then
for each symbol in Follow(N)
add a case for that symbol, using the steps to parse Stuff
-- Note that we still parse Stuff, even thoush we
-- ``know'' it nullifies
end for
end if
end for each production
end for each nonterminal
```

## Problems with Predictive Parse Tables

• As you may know, there are a number of problems with this strategy.
• In particular, in building the parse function for N we may
• Assign two right-hand-sides to the same symbol because both have that symbol in their first sets
• Assign two right-hand-sides to the same symbol because one has that symbol in its first set and the other is nullable and that symbol appears in N's Follow set.
• Assign two right-hand-sides to the same symbol because both are nullable and that symbol appears N's Follow set.
• Do some combination of the above.

### Left-recursive grammars

• Left-recursive grammars are a particular problem. These are grammars which have a rule of the form
```N ::= N beta
```
• Consider
```A ::= A a
A ::= b
```
• When we see a b, what should we do?
• Note that any left-recursive grammar will have a similar problem, since anything that can begin a string derivable form the nonterminal can begin the left-recursive right-hand side.
• We can make such grammars right-recursive, although they then become less intuitive.
• In this case, the grammar represents one `b` followed by zero or more `a`'s, so we can rewrite that as
```A ::= b A'
A' ::=
|  a A'
```

### Left Factoring

• At times, we have two right-hand-sides that begin with the same symbols.
• For example,
```N ::= a A
|  a B
```
• Often, we can factor out the common part to allow us to choose between the rules.
• For example,
```N ::= a N'
N' ::= A
|  B
```
• The problem gets harder when we have different nonterminals that can begin with the same letter.
```N ::= A A
|  B B
A ::= a A
|  c
B ::= a
|  b B
```

## Alternative Techniques

• As we've just seen, there are some significant disadvantages to recursive descent parsers.
• Some common grammars seem to require an arbitrary amount of lookahead (e.g., the standard expression grammar).
• Recursive descent often requires rewriting the original grammar, the rewritten grammars may be significantly less natural.
• Hence, compiler writers often look to other techniques.
• Note that recursive descent parsers are, in effect, top-down (you start with the start symbol and attempt to derive the string).
• We can gain some power by starting at the bottom and working our way up.

## Shift-Reduce Parsing

• The most common bottom-up parsers are the so-called shift-reduce parsers. These parsers examine the input tokens and either shift them onto a stack or reduce the top of the stack, replacing a right-hand side by a left-hand side.
• For example, suppose we have the expression 2 + 3 * 4 + 5.
• We shift 2 onto the stack:
[2]
• We reduce the 2 on the stack to a Factor:
[Factor]
• We reduce the Factor on the stack to a Term:
[Term]
• We reduce the Term on the stack to an Expression:
[Expression]
• We shift the plus onto the stack
[Expression PLUS]
• We shift the 3 onto the stack:
[Expersion PLUS 3]
• We reduce the 3 to a Factor:
[Expression PLUS Factor]
• We reduce the Factor to a Term:
[Expression PLUS Term]
• We shift the multiplication symbol onto the stack:
[Expression PLUS Term TIMES]
• We shift the 4 onto the stack:
[Expression PLUS Term TIMES 4]
• We reduce the 4 to a Factor:
[Expression PLUS Term TIMES Factor]
• We reduce the Term TIMES Factor to a Term:
[Expression PLUS Term]
• We reduce the Expression PLUS Term to an Expression:
[Expression]
• We shift the plus onto the stack:
[Expression PLUS]
• We shift the number onto the stack
[Expression PLUS 5]
• We reduce the 5 to a Factor:
[Expression PLUS Factor]
• We reduce the Factor to a Term:
[Expression PLUS Term]
• We reduce the Expression PLUS Term to an Expression:
[Expression]
• We reduce the expression to the start symbol.
• Note that it seems that we made different decisions in the same context. But the context is not the same. Decisions are based on what is on the stack and what the next input tokens are.
• In some ways, these parsers can be viewed as finite automata that use a stack (also known as push-down automata).
• Shift reduce parsing is traditionally done with LR(k) parsers. The first L stands for ``left-to-right traversal of the input'', the next R stands for ``rightmost derivation'' and the k stands for ``number of characters of lookahead''.
• You should be able to understand the traversal and lookahead.
• The rightmost refers to the types of derivations that the grammar represents. In a rightmost derivation from the start symbol, you always replace the rightmost nonterminal.
• While LR parsers are bottom up, they simulate rightmost top-down derivations

## History

Monday, 22 January 2001

• Created as a blank outline.

Friday, 23 February 2001

Back to Predictive Parsing, Continued. On to Shift-Reduce Parsing, Continued.

Disclaimer: I usually create these pages on the fly. This means that they are rarely proofread and may contain bad grammar and incorrect details. It also means that I may update them regularly (see the history for more details). Feel free to contact me with any suggestions for changes.

This page was generated by Siteweaver on Mon Apr 30 10:51:56 2001.
This page may be found at `http://www.cs.grinnell.edu/~rebelsky/Courses/CS362/2001S/outline.15.html`.