# Class 11: Predictive parsing

Held Monday, September 21, 1998

Notes

• New assignments: assignment 5: parser and assignment 6: parsing
• Tomorrow at 11am is the cool physics talk.
• Tomorrow at 3pm they'll be demonstrating SMARTboards in the MathLAN. While the demonstration is primarily for faculty, you are welcome to come, see, and comment.
• Next Tuesday at 4:15 Yuriy will be presenting a cool talk on his summer research.
• Sunday the 27th at noon is the math/cs picnic. Come and have good food.
• The legendary Putnam exam will be given on December 5 in two three-hour parts, one in the morning and one in the afternoon. The exam is constructed to test originality and technical competence and to appeal to students who like to do challenging math problems. If you're interested in learning more about the exam, contact Dr. Adelberg.
• Reminder: there is no class this Friday.
• I've been given some brochures about student memberships for the Association for Computing Machinery. Student memberships are relatively cheap (\$35/year) and well worth it. At least one brochure claims that they can help with career advice and internships. It might also behoove us to form a student chapter.
• I've updated the due date on assignments 3 and 4. Any questions on those assignments?

## Recursive descent parsing, revised

• Let us return to the question of how you turn a grammar into a recursive descent parser.
• Basically, each nonterminal gets changed into a method with the following form
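```
function parseN(TokenList tl) {
  // Choose a right-hand side by looking at the next token in tl
  switch (the next token in tl) {
    case S1:
      code for rhs one;
      break;
    case S2:
      code for rhs two;
      break;
    ...
    default:
      error();
  } // switch
} // parseN
```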
• In general, the `S`'s should be the symbols that start the right-hand-sides.
• But what if a right-hand-side is epsilon?
• Then it's safe to choose the epsilon right-hand-side (and consume no input) if the next token is something that can come after our nonterminal
• And what if a right-hand-side begins with a nonterminal?
• Then we need to do more sophisticated analyses to determine what can begin the right-hand-side
• And what if two right-hand-sides can begin with the same symbol?
• Then we need to change our grammar
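• As an illustration, here is a minimal Python sketch of a recursive descent parser for a small, hypothetical grammar (not one from this course): `List ::= "(" Items ")"`, `Items ::= Item Items | epsilon`, `Item ::= "id" | List`. The class and method names, the `"$"` end marker, and the use of nested Python lists as the result are all assumptions made for the example. Note how `parse_items` handles both an epsilon right-hand-side and a right-hand-side that begins with a nonterminal.

```python
class ParseError(Exception):
    """Raised when the next token matches no right-hand side."""


class Parser:
    def __init__(self, tokens):
        # "$" is an assumed end-of-input marker, not part of the grammar.
        self.tokens = list(tokens) + ["$"]
        self.pos = 0

    def peek(self):
        """Return the next token without consuming it."""
        return self.tokens[self.pos]

    def expect(self, token):
        """Consume the next token, which must equal `token`."""
        if self.peek() != token:
            raise ParseError(f"expected {token!r}, saw {self.peek()!r}")
        self.pos += 1

    def parse_list(self):
        # List ::= "(" Items ")"   (only one right-hand side, so no choice)
        self.expect("(")
        items = self.parse_items()
        self.expect(")")
        return items

    def parse_items(self):
        # Items ::= Item Items | epsilon
        if self.peek() in ("id", "("):
            # The right-hand side begins with the nonterminal Item,
            # so we test for the tokens that can begin an Item.
            item = self.parse_item()
            return [item] + self.parse_items()
        elif self.peek() == ")":
            # ")" can come after Items, so choosing epsilon is safe.
            return []
        else:
            raise ParseError(f"unexpected {self.peek()!r} in Items")

    def parse_item(self):
        # Item ::= "id" | List
        if self.peek() == "id":
            self.expect("id")
            return "id"
        elif self.peek() == "(":
            return self.parse_list()
        else:
            raise ParseError(f"unexpected {self.peek()!r} in Item")


def parse(tokens):
    """Parse a complete token list as a List."""
    parser = Parser(tokens)
    result = parser.parse_list()
    parser.expect("$")
    return result
```

• For example, `parse(["(", "id", "(", "id", ")", ")"])` returns `["id", ["id"]]`, while `parse(["(", "id"])` raises `ParseError` because the end marker can neither begin an `Item` nor follow `Items`.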

## First, Follow, and Nullable

• All of these analyses are aided by three tables and corresponding functions, known as First (and first), Follow, and Nullable (and nullable).
• The tables and functions are recursively constructed.
• First maps nonterminals to sets of tokens that can begin strings derived from those nonterminals.
• first maps sequences of nonterminals and terminals to the sets of tokens that can begin strings derived from those sequences
• Follow maps nonterminals to sets of tokens that can follow those nonterminals in sequences derivable in the grammar
• Nullable indicates which nonterminals can derive the empty string
• nullable operates on sequences of symbols (nonterminals and terminals), returning true only when the sequence can derive epsilon

### Building the First table and first function

• How do we define the first function?
• We know that if a sequence begins with a token, then it can only derive strings that begin with that token.
• But what if the sequence begins with a nonterminal?
• Then the sequence can derive strings that begin with any token that begins strings that the nonterminal can derive.
• But what if the nonterminal can derive the empty string?
• Then we need to look further in the original sequence.
• Putting it all together
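```
function first(s1 ... sn) {
  // The empty sequence derives only the empty string,
  // which begins with no token
  if (n == 0) {
    return emptySet();
  }
  if (isToken(s1)) {
    return setOf(s1);
  }
  else {
    tmp = First(s1);
    // If s1 can vanish, tokens that begin the rest can also come first
    if (Nullable(s1)) {
      return union(tmp, first(s2 ... sn));
    }
    else {
      return tmp;
    }
  }
} // first
```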
• Note that this relies on the First table and the Nullable table.
• How do we build the First table? Using the first function
```
for each nonterminal N, First(N) = emptySet();
repeat
  for each production N ::= RHS
    First(N) = union(First(N), first(RHS))
  end for
until no First(N) changed during the last pass
```
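• The interplay between the first function and the First table can be tried out with a short Python sketch. The grammar here (`S ::= A b`, `A ::= a A | epsilon`) is a hypothetical example, the Nullable table is assumed to have been computed already, and all names (`GRAMMAR`, `NULLABLE`, `is_token`, `build_first_table`) are illustrative choices rather than anything prescribed above.

```python
# Hypothetical grammar: each nonterminal maps to a list of right-hand sides,
# and each right-hand side is a list of symbols ([] stands for epsilon).
GRAMMAR = {
    "S": [["A", "b"]],
    "A": [["a", "A"], []],
}

# Assumed to be precomputed: A can derive the empty string, S cannot.
NULLABLE = {"S": False, "A": True}


def is_token(symbol):
    """Anything that is not a nonterminal of GRAMMAR is treated as a token."""
    return symbol not in GRAMMAR


def first(sequence, first_table):
    """Tokens that can begin strings derived from a sequence of symbols."""
    if not sequence:
        return set()              # the empty sequence begins with nothing
    head, rest = sequence[0], sequence[1:]
    if is_token(head):
        return {head}
    result = set(first_table[head])
    if NULLABLE[head]:
        # head may vanish, so look further along the sequence
        result |= first(rest, first_table)
    return result


def build_first_table():
    """Repeat passes over the productions until no First set grows."""
    first_table = {nonterminal: set() for nonterminal in GRAMMAR}
    changed = True
    while changed:
        changed = False
        for nonterminal, productions in GRAMMAR.items():
            for rhs in productions:
                new_tokens = first(rhs, first_table)
                if not new_tokens <= first_table[nonterminal]:
                    first_table[nonterminal] |= new_tokens
                    changed = True
    return first_table
```

• With this grammar, the first pass finds only `b` for `S` (because `First(A)` is still empty); the second pass adds `a`. That is why the loop must repeat until nothing changes.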

### Building the Nullable table and the nullable function

• The nullable function is relatively easy to define
```
function nullable(s1 ... sn) {
  // The empty string can derive the empty string
  if (n == 0) {
    return true;
  }
  // If we begin with a token, we're not nullable
  if (isToken(s1)) {
    return false;
  }
  // Otherwise, we begin with a nonterminal.  That nonterminal must
  // be nullable, as must the rest of the sequence.
  else {
    return Nullable(s1) &&
           nullable(s2 ... sn);
  }
} // nullable
```
• And, once again, we can define the Nullable table in terms of the nullable function.
```
for each nonterminal N, Nullable(N) = false
repeat
  for each production N ::= RHS
    if nullable(RHS) {
      Nullable(N) = true;
    }
  end for
until no Nullable(N) changed during the last pass
```
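• The same construction can be sketched in Python. The grammar below is a hypothetical example chosen so that the answer takes more than one pass, and the names (`GRAMMAR`, `is_token`, `build_nullable_table`) are illustrative assumptions.

```python
# Hypothetical grammar: each nonterminal maps to a list of right-hand sides,
# and each right-hand side is a list of symbols ([] stands for epsilon).
GRAMMAR = {
    "S": [["A", "B"]],
    "A": [["a"], []],
    "B": [["b"], ["A"]],
    "C": [["c", "A"]],
}


def is_token(symbol):
    """Anything that is not a nonterminal of GRAMMAR is treated as a token."""
    return symbol not in GRAMMAR


def nullable(sequence, nullable_table):
    """True only when every symbol in the sequence can derive epsilon."""
    if not sequence:
        return True               # the empty sequence derives the empty string
    head, rest = sequence[0], sequence[1:]
    if is_token(head):
        return False              # a token always contributes a character
    return nullable_table[head] and nullable(rest, nullable_table)


def build_nullable_table():
    """Repeat passes over the productions until no entry changes."""
    nullable_table = {nonterminal: False for nonterminal in GRAMMAR}
    changed = True
    while changed:
        changed = False
        for nonterminal, productions in GRAMMAR.items():
            for rhs in productions:
                if not nullable_table[nonterminal] and nullable(rhs, nullable_table):
                    nullable_table[nonterminal] = True
                    changed = True
    return nullable_table
```

• Here `A` and `B` are found to be nullable on the first pass, but `S` is only discovered on the second pass, once both `A` and `B` are marked. `C` never becomes nullable because its only right-hand-side begins with a token.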


History

• Created Monday, September 21, 1998. Parts of the body were taken from the outline of the previous class.
• On Wednesday, September 23, 1998, parts of the body were removed and moved into the subsequent outlines.
• On Monday, September 28, 1998, the section on the Follow table was removed.

Disclaimer Often, these pages were created "on the fly" with little, if any, proofreading. Any or all of the information on the pages may be incorrect. Please contact me if you notice errors.