# Class 16: Predictive Parsing (2)


Held: Wednesday, 25 February 2004

Summary: Today we continue our exploration of predictive parsing by looking at steps for automating the construction of predictive parsers.

Notes:

• Interesting panel tomorrow in south lounge from 11:00-12:45. Free lunch!
• Cool parallel computing talk today at 4:30.
• Are there questions on the project?
• Upon reflection, there is no midterm! (There will, however, be a written assignment.)

Overview:

• Helpful tables: First, Follow, and Nullable
• Building the `First` table.
• Building the `Nullable` table.
• Building the `Follow` table.

## Analyzing the Grammar

• As you can tell, we need to make a number of decisions when we convert a grammar into a predictive parser.
• Given multiple right-hand sides that start with a nonterminal, which one do you choose?
• If a nonterminal derives epsilon, how do you decide to apply that derivation?
• What if a nonterminal derives a nonterminal that derives epsilon? Consider what to do when you see a `b` when matching an `S` in
```
S ::= X b
S ::= c
X ::= x X
   |  epsilon
```
• All of these analyses are aided by three tables (and corresponding functions) known as First (and first), Follow, and Nullable (and nullable).
• The tables and functions are mutually recursively constructed. That is, the functions may be defined in terms of the tables and the tables may be defined in terms of the current versions of the functions.
• First maps each nonterminal to the set of tokens that can begin strings derived from that nonterminal.
• first maps each sequence of nonterminals and terminals to the set of tokens that can begin strings derived from that sequence.
• Follow maps each nonterminal to the set of tokens that can immediately follow that nonterminal in sequences derivable in the grammar.
• There is no follow function, as it is neither interesting nor useful.
• Nullable indicates which nonterminals can derive the empty string.
• nullable operates on sequences of symbols (nonterminals and terminals), returning true only when the sequence can derive epsilon.
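• To make the `b` question concrete, here is a minimal hand-written sketch in Python of a predictive recognizer for the small grammar above. The function name `accepts`, the helper names, and the `$` end-of-input marker are assumptions made for illustration; they are not part of the notes. The token sets tested in `parse_S` were worked out by hand, which is exactly the work the tables are meant to automate.

```python
# Hand-written predictive recognizer for the example grammar:
#   S ::= X b | c
#   X ::= x X | epsilon
# Tokens are single characters. "$" is a hypothetical end-of-input marker,
# so the input text is assumed not to contain "$" itself.

def accepts(text: str) -> bool:
    tokens = list(text) + ["$"]
    pos = 0

    def peek() -> str:
        return tokens[pos]

    def expect(token: str) -> None:
        nonlocal pos
        if tokens[pos] != token:
            raise SyntaxError(f"expected {token!r}, saw {tokens[pos]!r}")
        pos += 1

    def parse_X() -> None:
        # Choose X ::= x X only when the next token is x; otherwise
        # apply X ::= epsilon and consume nothing.
        if peek() == "x":
            expect("x")
            parse_X()

    def parse_S() -> None:
        # x can begin an X, and b can follow an X that derived epsilon,
        # so both tokens select S ::= X b.
        if peek() in ("x", "b"):
            parse_X()
            expect("b")
        elif peek() == "c":
            expect("c")
        else:
            raise SyntaxError(f"unexpected {peek()!r}")

    try:
        parse_S()
        expect("$")
        return True
    except SyntaxError:
        return False
```

• In this sketch, seeing a `b` while matching an `S` selects `S ::= X b`, and `parse_X` then applies the epsilon production because the next token is not `x`.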

## Building the First table and first function

• How do we define the first function?
• We know that if a sequence begins with a token, then it can only derive strings that begin with that token.
• But what if the sequence begins with a nonterminal?
• Then the sequence can derive strings that begin with any token that begins strings that the nonterminal can derive.
• But what if the nonterminal can derive the empty string?
• Then we need to look further in the original sequence.
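• Putting it all together:
```
TokenSet first(SymbolList string) {
  // The empty sequence begins no strings.
  if (string.isEpsilon()) {
    return new TokenSet();
  }
  Symbol s1 = string.car();
  // A token can only begin strings that start with that token.
  if (isToken(s1)) {
    return new TokenSet(s1);
  }
  // Otherwise, s1 is a nonterminal.
  TokenSet tmp = First[s1];
  if (Nullable[s1]) {
    // s1 may derive epsilon, so look further in the sequence.
    return union(tmp, first(string.cdr()));
  } // if s1 is nullable
  return tmp;
} // first
```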
• Note that this relies on the First table and the Nullable table.
• How do we build the First table? Using the first function.
```
for each nonterminal N, First[N] = emptySet()
repeat
  for each production N ::= RHS
    First[N] = union(First[N], first(RHS))
  end for
until no First[N] changed during the pass
```
• How do we know this terminates?
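• As a rough illustration, the first function and the First table can be sketched together in Python for the small grammar from the start of these notes. The names `PRODUCTIONS`, `NONTERMINALS`, `NULLABLE`, and `build_first` are assumptions made for this sketch, and the Nullable table is filled in by hand here rather than computed.

```python
# Productions of the example grammar; an empty list stands for epsilon.
PRODUCTIONS = [
    ("S", ["X", "b"]),
    ("S", ["c"]),
    ("X", ["x", "X"]),
    ("X", []),
]
NONTERMINALS = {lhs for lhs, _ in PRODUCTIONS}
# Assumption: Nullable is already known (hand-derived). Only X can vanish.
NULLABLE = {"S": False, "X": True}


def first(seq, first_table):
    """Tokens that can begin strings derived from seq, per the current table."""
    if not seq:
        return set()  # the empty sequence begins no strings
    head, rest = seq[0], seq[1:]
    if head not in NONTERMINALS:
        return {head}  # a token begins only strings that start with it
    result = set(first_table[head])
    if NULLABLE[head]:
        # head may derive epsilon, so look further in the sequence
        result |= first(rest, first_table)
    return result


def build_first():
    """Repeat passes over the productions until no First set changes."""
    table = {n: set() for n in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in PRODUCTIONS:
            new = table[lhs] | first(rhs, table)
            if new != table[lhs]:
                table[lhs] = new
                changed = True
    return table
```

• In this sketch, each pass can only add tokens to a set, and there are finitely many tokens and nonterminals, so the loop must eventually make a pass with no changes and stop.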

## Building the Nullable table and the nullable function

• The nullable function is relatively easy to define:
```
boolean nullable(SymbolList string) {
  // The empty string can derive the empty string
  if (string.isEpsilon()) {
    return true;
  }
  Symbol s1 = string.car();
  // If we begin with a token, we're not nullable
  if (isToken(s1)) {
    return false;
  }
  // Otherwise, we begin with a nonterminal.  That nonterminal must
  // be nullable, as must the rest of the sequence.
  else {
    return Nullable[s1] &&
           nullable(string.cdr());
  }
} // nullable
```
• And, once again, we can define the Nullable table in terms of the nullable function.
```
for each nonterminal N, Nullable[N] = false
repeat
  for each production N ::= RHS
    if nullable(RHS) {
      Nullable[N] = true;
    }
  end for
until no Nullable[N] changed during the pass
```

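• The same idea can be sketched in Python. The names `PRODUCTIONS`, `NONTERMINALS`, and `build_nullable` are assumptions made for illustration, and the production `Y ::= X X` is a hypothetical addition to the example grammar, included only to show nullability propagating from one nonterminal to another.

```python
# Example grammar, with an empty list standing for epsilon.
# Y ::= X X is a hypothetical extra production, not from the notes.
PRODUCTIONS = [
    ("Y", ["X", "X"]),
    ("S", ["X", "b"]),
    ("S", ["c"]),
    ("X", ["x", "X"]),
    ("X", []),
]
NONTERMINALS = {lhs for lhs, _ in PRODUCTIONS}


def nullable(seq, table):
    """True when every symbol in seq can derive epsilon, per the current table."""
    if not seq:
        return True  # the empty sequence derives the empty string
    head = seq[0]
    if head not in NONTERMINALS:
        return False  # a token can never vanish
    return table[head] and nullable(seq[1:], table)


def build_nullable():
    """Repeat passes over the productions until no entry changes."""
    table = {n: False for n in NONTERMINALS}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in PRODUCTIONS:
            if not table[lhs] and nullable(rhs, table):
                table[lhs] = True
                changed = True
    return table
```

• With the productions in this order, `X` becomes nullable at the end of the first pass and `Y` only on the second, which is why a single pass over the productions is not enough.
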
This document may be found at `http://www.cs.grinnell.edu/~rebelsky/Courses/CS362/2004S/Outlines/outline.16.html`.