Compilers (CSC-362 98F)


Solutions to the Final Examination

This page may be found online at http://www.math.grin.edu/~rebelsky/Courses/CS362/98F/Handouts/finalsoln.html

These are my solutions to the various problems on the final examination, along with some notes as to grading and observations during grading of your examinations. Like many of my exams, this was perhaps longer than it should have been. However, it is also likely that many of you would have done worse on a shorter exam, since a shorter exam would probably have consisted only of problem 3.

While I am unwilling to provide a list of grades for the individual exams, here are the grades for the various problems.

Because there was a wider numeric range than normal, I did scale the grades to provide letter grades (and it is the letter grade that gets averaged in for your final grade).


Problem 1: Shorter Questions [40 points]

Answer any four of the following questions. Indicate which ones you have answered. Most of these are relatively short questions (particularly as compared to my typical questions), especially if you have done some of the recommended extra work I've assigned during the semester.

If you choose to answer more than four, I will give you a small amount of extra credit for correct answers to extra problems, but you must indicate which ones I should grade for primary credit.

Notes

As you can tell from the grade summaries above, some of these were easier (or at least more doable by those who chose to do them). In most ``select N of M'' problems, part of the goal is selecting the ones that you'll do better on.

Problem 1.A: Optimizing DFAs

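We've seen two strategies for optimizing deterministic finite automata: merging (joining) equivalent states, and starting from a coarse grouping of states and splitting any group whose members differ.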

Will these algorithms perform identically (i.e., will they give the same results on all DFAs)? Is one better? If so, how?

You may want to consider the following automaton.

States: q0, q1, q2, q3
Start state: q0
Final states: q1, q3
Transitions:
  q0,0 -> q1
  q0,1 -> q3
  q1,0 -> q2
  q1,1 -> q0
  q2,0 -> q3
  q2,1 -> q1
  q3,0 -> q0
  q3,1 -> q2

An Answer

The two algorithms are not identical, as the example automaton illustrates. The merge algorithm does not successfully optimize in every case, while the split algorithm does. For example, it may be the case that there are two states that are identical, but we only realize that by noting that two other states are identical. In this sample automaton, the merge algorithm cannot do any merging, because no two states are identical. However, the split algorithm creates the sets {q0,q2} and {q1,q3} and then needs to split neither set.

With some hand-waving, we might say that the split algorithm is optimal since it only adds a state (by splitting) when it is necessary that we do so.
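To make the comparison concrete, here is a minimal sketch in Python that runs both ideas on the sample automaton. The function names (`split_partition`, `identical_pairs`) are my own, and `identical_pairs` assumes that "identical" means "same finality and the same target state on every symbol", which is how the answer above uses the word.

```python
# Sketch only: the sample automaton from the problem, plus two helpers whose
# names and representation are illustrative assumptions.
STATES = ["q0", "q1", "q2", "q3"]
FINALS = {"q1", "q3"}
ALPHABET = ["0", "1"]
TRANSITIONS = {
    ("q0", "0"): "q1", ("q0", "1"): "q3",
    ("q1", "0"): "q2", ("q1", "1"): "q0",
    ("q2", "0"): "q3", ("q2", "1"): "q1",
    ("q3", "0"): "q0", ("q3", "1"): "q2",
}


def split_partition(states, finals, alphabet, delta):
    """Start with {non-final, final} and split groups until nothing changes."""
    groups = [g for g in (frozenset(s for s in states if s not in finals),
                          frozenset(s for s in states if s in finals)) if g]
    while True:
        def group_of(state):
            return next(i for i, g in enumerate(groups) if state in g)

        new_groups = []
        for g in groups:
            buckets = {}
            for s in sorted(g):
                # Two states stay together only if every symbol leads
                # them into the same group.
                signature = tuple(group_of(delta[(s, a)]) for a in alphabet)
                buckets.setdefault(signature, set()).add(s)
            new_groups.extend(frozenset(b) for b in buckets.values())
        if len(new_groups) == len(groups):
            return sorted(sorted(g) for g in new_groups)
        groups = new_groups


def identical_pairs(states, finals, alphabet, delta):
    """Pairs a naive merge could join: same finality and same targets."""
    return [(p, q) for i, p in enumerate(states) for q in states[i + 1:]
            if (p in finals) == (q in finals)
            and all(delta[(p, a)] == delta[(q, a)] for a in alphabet)]
```

On the sample automaton, splitting leaves the two groups {q0,q2} and {q1,q3} intact, while the naive merge finds no pair to join.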

Notes

It's likely that the example helped most of you, since almost everyone used it as a separating case.

Few of you attempted to suggest anything about more general cases, particularly whether either algorithm was optimal, or what kinds of automata they might fail on.

Interestingly, I saw two opposite arguments in terms of coding. One of you claimed that the ``join equivalent states'' algorithm would be easier to implement. Another claimed that the ``split different states'' would be easier. I'd guess that they're about equivalent, depending on how much skill you have at implementing appropriate types of sets.

Problem 1.B: Cyclic Type Definitions

The Tiger language definition states that every cycle of type definitions must go through a record or array. But if the compiler forgets to check for this error, nothing terrible will happen. Explain why. If you believe that something terrible will happen, you can instead suggest what will happen and why it is terrible.

This is a modified version of problem 5.3 from Appel (p. 129 of the Red book, p. 124 of the Green book).

An Answer

If there is a cycle of type definitions, such as A being a name for B and B being a name for A, then there will be no base type for any of the types in the cycle. Without a base type, there is no way we can initialize variables in the type, so we can never declare variables.

There is a concern that during type checking we will follow cycles infinitely. In the example above, if we declare a variable a of type A and assign it the value 1, we will check whether A is equivalent to integer, following its indirection link, giving us B. When checking whether B is equivalent to integer, we will follow its indirection link, giving us A. Around and around we'll go, when we'll stop, nobody knows.
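The looping concern is easy to see in a sketch. The Python below assumes a hypothetical environment in which each type name maps either to another name (an alias) or to a base type; this is not Appel's actual data structure. It follows alias links as the type checker would, but remembers the names it has visited so that a cycle is reported rather than followed forever.

```python
# Sketch only: `aliases` is a hypothetical name -> name mapping.
def actual_type(name, aliases, base_types=("int", "string")):
    """Follow alias links to a base type; raise on an alias-only cycle."""
    seen = set()
    while name not in base_types:
        if name in seen:
            raise ValueError("illegal cycle of type definitions through " + name)
        seen.add(name)
        name = aliases[name]
    return name
```

With `{"A": "B", "B": "int"}` the lookup of `A` ends at `int`; with `{"A": "B", "B": "A"}` it raises instead of going around and around.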

Problem 1.C: Commutes

Recall that statement s and expression e commute if rearranging the order of evaluation of s and e does not affect the outcome of either operation. In chapter eight, Appel suggests some simple criteria that permit us to determine some of the cases in which s and e commute. Give at least two others.

Appel's Original

In Appel's original version, s and e commute if

An Answer

Notes

This was by far the worst of the problems (at least in terms of your answers). I would guess that is, in part, because Appel does not carefully define commute. (I also gave a less-than-perfect description; I probably should have said ``rearranging the order does not affect the outcome of the program''.)

A number of you used the word variable in your descriptions. There are no variables in intermediate representation trees, only memory locations and temporaries. As we discussed in class, determining when two memory locations are equivalent may not be possible at this time. For example, is M[T100] the same as M[T23]?

Few of you considered deep trees. You assumed that both parts had only a few nodes. However, if all you say is ``has a BINOP at the root'', then you have not eliminated the possibility that the BINOP has fairly complex arguments.

Many of you had problems with LABEL. It is extremely dangerous to move a LABEL, as it means that a jump to the label will act differently. I did not penalize you for neglecting to mention LABEL. However, if you mentioned it and got it wrong, then you lost points.
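As an illustration of the points above (no variables, deep trees, and caution about memory), here is a conservative sketch in Python. It is not the answer from the exam. IR trees are written as nested tuples, which is my own representation. The first two cases assume that the simple criteria are "s is a no-op" and "e is a constant or a label name"; the third case is an extra criterion I am assuming for illustration: s writes only a temporary that e never reads, and neither side contains a CALL or ESEQ anywhere in its tree.

```python
# Sketch only: trees are nested tuples such as ("MEM", ("TEMP", "t1")).
SIDE_EFFECTS = {"CALL", "ESEQ"}


def contains(tree, kinds, temp=None):
    """True if any node of the (possibly deep) tree has a kind in `kinds`,
    or is the temporary named `temp`."""
    if not isinstance(tree, tuple):
        return False
    if tree[0] in kinds:
        return True
    if temp is not None and tree[0] == "TEMP" and tree[1] == temp:
        return True
    return any(contains(child, kinds, temp) for child in tree[1:])


def commute(stm, exp):
    """Conservative test: True only when reordering is clearly safe."""
    # s does nothing at all.
    if stm[0] == "EXP" and stm[1][0] == "CONST":
        return True
    # e is a constant or a label, so nothing s does can change it.
    if exp[0] in ("CONST", "NAME"):
        return True
    # Assumed extra case: s writes only a temporary, e never reads that
    # temporary, and neither tree contains a call.
    if stm[0] == "MOVE" and stm[1][0] == "TEMP":
        temp = stm[1][1]
        return (not contains(stm[2], SIDE_EFFECTS)
                and not contains(exp, SIDE_EFFECTS, temp))
    # Anything else (stores to memory, labels, jumps) is treated as unsafe.
    return False
```

Note that a store to M[T100] followed by a read of M[T23] is rejected, since the sketch cannot tell whether the two locations are the same.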

Problem 1.D: Instruction Selection

Consider a machine with the following instruction

mult const1(src1), const2(src2), dst3

with the meaning

r3 <- M[r1 + const1] * M[r2 + const2]

On this machine, r0 is always 0 and M[1] always contains 1.

a. Draw all of the tree patterns that correspond to this instruction and its special cases.

b. Pick one of the bigger patterns and show how to write an if statement (in Java, C, or Tiger) that matches it. Make sure to show the IRT representation of the statement.

This problem is slightly modified from problem 9.2 of Appel (p. 220 of the Red book, p. 217 of the Green book).

Some Answers

There are more answers here than most of you got. I did not penalize you for getting fewer, as long as you caught most of the interesting cases.

Note that the @ signs are ``register holes'' in the pattern and the underscores are ``number holes''.

Here's the full statement.

MOVE(@, BINOP(*, MEM(BINOP(+,@,CON(_))), MEM(BINOP(+,@,CON(_)))))
    MOVE
   /    \
  @    BINOP
      /  |  \
     * MEM   MEM
        |      \
    BINOP       BINOP
    / | \      /  |  \
   +  @ CON   +   @  CON
         |            |
         _            _

Note that we can swap the arguments to the lower BINOPs, giving three rearrangements of the tree.

MOVE(@, BINOP(*, MEM(BINOP(+,CON(_),@)), MEM(BINOP(+,@,CON(_)))))
    MOVE
   /    \
  @    BINOP
      /  |  \
     * MEM   MEM
        |      \
    BINOP       BINOP
    / | \      /  |  \
   + CON @    +   @  CON
      |               |
      _               _
and
MOVE(@, BINOP(*, MEM(BINOP(+,@,CON(_))), MEM(BINOP(+,CON(_),@))))
    MOVE
   /    \
  @    BINOP
      /  |  \
     * MEM   MEM
        |      \
    BINOP       BINOP
    / | \      /  |  \
   +  @ CON   +  CON  @
         |        |     
         _        _

MOVE(@, BINOP(*, MEM(BINOP(+,CON(_),@)), MEM(BINOP(+,CON(_),@))))
    MOVE
   /    \
  @    BINOP
      /  |  \
     * MEM   MEM
        |      \
    BINOP       BINOP
    / | \      /  |  \
   + CON @    +  CON  @ 
      |           |
      _           _

The constants can be 0, so we can also do

MOVE(@,BINOP(*,MEM(@),MEM(@)))
    MOVE
   /    \
  @    BINOP
      /  |  \
     * MEM   MEM
        |      \
        @       @
and
MOVE(@, BINOP(*, MEM(@), MEM(BINOP(+,@,CON(_)))))
    MOVE
   /    \
  @    BINOP
      /  |  \
     * MEM   MEM
        |      \
        @       BINOP
               /  |  \
              +   @  CON
                      |
                      _
and
MOVE(@, BINOP(*, MEM(BINOP(+,@,CON(_))), MEM(@)))
    MOVE
   /    \
  @    BINOP
      /  |  \
     * MEM   MEM
        |      \
    BINOP       @
    / | \      
   +  @ CON   
         |   
         _  

We can make either multiplicand be 1 by using MEM(R(0)+CON(1)). This then becomes a simple load instruction (expressed in a more complicated format).

MOVE(@, MEM(BINOP(+,@,CON(_))))
    MOVE
   /    \
  @     MEM
         |
       BINOP
      /  |  \
     +   @  CON
             |
             _
By making the constant 0, we can do a different load.
MOVE(@, MEM(@))
    MOVE
   /    \
  @     MEM
         |
         @

If we want, we can also do ``set to 1'' by using 1 for both.

MOVE(@, CONST(1))
    MOVE
   /    \
  @    CONST
         |
         1

We can also work with particular locations in memory by using R(0).

MOVE(@, MEM(CONST(_)))
    MOVE
   /    \
  @     MEM
         |
       CONST
         |
         _
We can also multiply values at two locations using the same trick.
MOVE(@, BINOP(*, MEM(CONST(_)), MEM(CONST(_))))
    MOVE
   /    \
  @    BINOP
      /  |  \
     *  MEM MEM
         |   |
        CON CON
         |   |
         _   _

The creation of the conditional was a little bit more subtle. My impression was that Appel wanted a conditional whose natural translation was into one of these trees, not which contained one of these trees. Nonetheless, I accepted one that just contained the trees.
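The multiplication patterns above can also be checked mechanically. Here is a minimal sketch in Python, assuming trees are written as nested tuples and that a matched tile is emitted as a string in the `mult const1(src1), const2(src2), dst3` format. The helper names are mine. It covers the patterns whose source is a product of two MEM nodes, using a constant of 0 when there is no offset and r0 when there is no register; it does not cover the load and ``set to 1'' special cases.

```python
# Sketch only: nested-tuple trees, e.g. ("MEM", ("TEMP", "t1")).
def match_address(addr):
    """Return (const, reg) if addr fits the const(reg) form, else None."""
    kind = addr[0]
    if kind == "TEMP":
        return (0, addr[1])            # constant taken to be 0
    if kind == "CONST":
        return (addr[1], "r0")         # r0 is always 0
    if kind == "BINOP" and addr[1] == "+":
        left, right = addr[2], addr[3]
        if left[0] == "TEMP" and right[0] == "CONST":
            return (right[1], left[1])
        if left[0] == "CONST" and right[0] == "TEMP":
            return (left[1], right[1])
    return None


def match_mult(stm):
    """Return the mult instruction for MOVE(temp, MEM(..) * MEM(..)), or None."""
    if stm[0] != "MOVE" or stm[1][0] != "TEMP":
        return None
    src = stm[2]
    if src[0] != "BINOP" or src[1] != "*":
        return None
    if src[2][0] != "MEM" or src[3][0] != "MEM":
        return None
    first = match_address(src[2][1])
    second = match_address(src[3][1])
    if first is None or second is None:
        return None
    return "mult %d(%s), %d(%s), %s" % (first[0], first[1],
                                        second[0], second[1], stm[1][1])
```

For example, a MOVE into t3 of MEM(t1 + 4) * MEM(8 + t2) matches and yields `mult 4(t1), 8(t2), t3`.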

Problem 1.E: Register Allocation

Find a graph that, for some K >= 3, is K colorable without spilling, but requires spilling in some invocation of the Optimistic Graph Coloring algorithm given in Section 11.1.

Note that you can interpret ``color'' and ``select a node to spill'' in any legal way you deem appropriate.

An Answer

Note that we did one of these in class. (Or at least one fairly similar.) The example takes advantage of our ability to color with any color we want.

Edges: (A,B), (A,C), (A,D), (A,E), (B,C), (C,D), (D,E), (B,E)

     A ----+
   / | \   |
  B--C--D--E
  |        |
  +--------+

Three coloring:

On to the algorithm:
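One way to check the claim for this graph, as a sketch rather than the worked answer itself, is to simulate it in Python with K = 3. The code below first finds a proper three-coloring by brute force. It then runs the select phase on a stack that I am assuming the earlier phases produce: every node has degree at least 3, so nothing can be simplified at first; A is pushed as a potential spill; B, C, D, and E then have degree 2 and are simplified in that order. The `UNLUCKY` table is a hypothetical sequence of legal color choices that forces A to become an actual spill.

```python
from itertools import product

# Sketch only: the graph from the answer above; K is the number of colors.
EDGES = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"),
         ("B", "C"), ("C", "D"), ("D", "E"), ("B", "E")]
NODES = ["A", "B", "C", "D", "E"]
K = 3


def neighbors(node):
    return {b if a == node else a for a, b in EDGES if node in (a, b)}


def is_proper(coloring):
    return all(coloring[a] != coloring[b] for a, b in EDGES)


def find_coloring():
    """Brute-force search for any proper K-coloring."""
    for colors in product(range(K), repeat=len(NODES)):
        coloring = dict(zip(NODES, colors))
        if is_proper(coloring):
            return coloring
    return None


def select(stack, choose):
    """Pop nodes off the stack, coloring each with choose(node, legal).
    A node with no legal color becomes an actual spill."""
    coloring, spilled = {}, []
    for node in reversed(stack):
        used = {coloring[n] for n in neighbors(node) if n in coloring}
        legal = [c for c in range(K) if c not in used]
        if legal:
            coloring[node] = choose(node, legal)
        else:
            spilled.append(node)
    return coloring, spilled


# Assumed push order: A (potential spill), then B, C, D, E.
STACK = ["A", "B", "C", "D", "E"]
UNLUCKY = {"E": 0, "D": 1, "C": 2, "B": 1}   # each pick is legal when made


def unlucky(node, legal):
    return UNLUCKY[node] if UNLUCKY.get(node) in legal else legal[0]


def lucky(node, legal):
    return legal[0]
```

With `unlucky`, the neighbors of A end up using all three colors, so A spills; with `lucky`, the same stack colors every node.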

Problem 1.F: Student Presentations

Write a short question and answer based on another group's presentation. You will be graded on the quality of your question (which should be of comparable difficulty to the other short questions) as well as the quality of your answer. Note that I do not like simple memorization/definition questions.


Problem 2: Back End [20 points]

Consider the following program written as a series of IRT-like instructions in which the conditional branch has two destinations. Assume that readInt has been defined elsewhere and that it reads an integer and returns the integer read in register RV.

CALL printInt 10
LABEL L40
CALL readInt
MOVE T(100) RV
JUMP L(L50)
LABEL L51
CALL readInt
MOVE T(102) RV
CALL readInt
MOVE T(103) RV
CJUMP LT T(102) T(103) L(L52) L(L53)
LABEL L52
JUMP L(L53)
LABEL L50
CALL readInt
MOVE T(101) RV
JUMP L(L51)
LABEL L53
MOVE T(200) MAXINT	# The maximum integer
MOVE T(201) MAXINT	# The maximum integer
LABEL L200
MOVE T(300) T(100)
MOVE T(555) L(L201)
JUMP L(L300)
LABEL L201
MOVE T(300) T(101)
MOVE T(555) L(L202)
JUMP L(L300)
LABEL L202
MOVE T(300) T(102)
MOVE T(555) L(L203)
JUMP L(L300)
LABEL L203
MOVE T(300) T(103)
MOVE T(555) L(L204)
JUMP L(L300)
LABEL L204
JUMP L(L400)
LABEL L300
CJUMP LT T(300) T(201) L(L301) L(L302)
LABEL L301
MOVE T(201) T(300)
CJUMP LT T(201) T(200) L(L303) L(L302)
LABEL L303
MOVE T(50) T(201)
MOVE T(201) T(200)
MOVE T(200) T(50)
LABEL L302
JUMP T(555)	# This can only be L201, L202, L203, or L204
LABEL L400
BINOP PLUS T(200) T(201)
MOVE T(500) ACC
BUILTIN printInt T(500)
CJUMP LT T(500) 10 L(L500) L(L501)
LABEL L500
JUMP L(L40)
LABEL L501

Problem 2.A: Basic Blocks

Identify the basic blocks in the program.

An Answer

I've used LX### for newly introduced labels. Blocks are numbered starting with B001.

B001:
  LABEL LX001       # Added
  CALL printInt 10
  JUMP L(L40)       # Added
  
B002:
  LABEL L40
  CALL readInt
  MOVE T(100) RV
  JUMP L(L50)
  
B003:
  LABEL L51
  CALL readInt
  MOVE T(102) RV
  CALL readInt
  MOVE T(103) RV
  CJUMP LT T(102) T(103) L(L52) L(L53)

B004:
  LABEL L52
  JUMP L(L53)
  
B005:
  LABEL L50
  CALL readInt
  MOVE T(101) RV
  JUMP L(L51)
  
B006:
  LABEL L53
  MOVE T(200) MAXINT	# The maximum integer
  MOVE T(201) MAXINT	# The maximum integer
  JUMP L(L200)          # Added

B007:
  LABEL L200
  MOVE T(300) T(100)
  MOVE T(555) L(L201)
  JUMP L(L300)
  
B008:
  LABEL L201
  MOVE T(300) T(101)
  MOVE T(555) L(L202)
  JUMP L(L300)

B009:
  LABEL L202
  MOVE T(300) T(102)
  MOVE T(555) L(L203)
  JUMP L(L300)
  
B010:
  LABEL L203
  MOVE T(300) T(103)
  MOVE T(555) L(L204)
  JUMP L(L300)

B011:
  LABEL L204
  JUMP L(L400)
  
B012:
  LABEL L300
  CJUMP LT T(300) T(201) L(L301) L(L302)
  
B013:
  LABEL L301
  MOVE T(201) T(300)
  CJUMP LT T(201) T(200) L(L303) L(L302)
  
B014:
  LABEL L303
  MOVE T(50) T(201)
  MOVE T(201) T(200)
  MOVE T(200) T(50)
  JUMP L(L302)            # Added

B015:
  LABEL L302
  JUMP T(555)	# This can only be L201, L202, L203, or L204

B016:
  LABEL L400
  BINOP PLUS T(200) T(201)
  MOVE T(500) ACC
  BUILTIN printInt T(500)
  CJUMP LT T(500) 10 L(L500) L(L501)

B017:
  LABEL L500
  JUMP L(L40)
  
B018:
  LABEL L501
  END
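
The partitioning above follows a simple rule: every block starts with a LABEL and ends with a JUMP, CJUMP, or END, and labels and jumps are added wherever the program falls through. A minimal sketch of that rule in Python, assuming instructions are plain strings in the format used in this problem (the function name is mine):

```python
# Sketch only: instructions are strings such as "MOVE T(100) RV".
def basic_blocks(instructions):
    """Split a list of instructions into basic blocks (lists of strings)."""
    blocks, current, fresh = [], [], 0
    for ins in instructions:
        opcode = ins.split()[0]
        if opcode == "LABEL":
            if current:
                # The previous block fell through: add an explicit jump.
                current.append("JUMP L(%s)" % ins.split()[1])
                blocks.append(current)
            current = [ins]
            continue
        if not current:
            # A block with no label gets a newly introduced one.
            fresh += 1
            current = ["LABEL LX%03d" % fresh]
        current.append(ins)
        if opcode in ("JUMP", "CJUMP", "END"):
            blocks.append(current)
            current = []
    if current:
        current.append("END")
        blocks.append(current)
    return blocks
```

Run on the first few instructions of the program, this produces the same first two blocks as above, including the added LABEL LX001 and the added JUMP L(L40).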

Problem 2.B: Organize Basic Blocks

Order the basic blocks so that (1) each CJUMP only uses the true label (and falls through to the false code) and (2) as many JUMPs as possible are eliminated.

An Answer

There are, of course, many ways to organize these basic blocks. I've tried to eliminate a number of JUMPs, sometimes by changing the test in CJUMP.

# Start of first trace
B001:
  LABEL LX001
  CALL printInt 10
  # JUMP L(L40) eliminated
B002:
  LABEL L40
  CALL readInt
  MOVE T(100) RV
  # JUMP L(L50) eliminated
B005:
  LABEL L50
  CALL readInt
  MOVE T(101) RV
  # JUMP L(L51) eliminated
B003:
  LABEL L51
  CALL readInt
  MOVE T(102) RV
  CALL readInt
  MOVE T(103) RV
  CJUMP LT T(102) T(103) L(L52) # L(L53) is now default
  # More careful analysis would tell us to eliminate the CJUMP
B006:
  LABEL L53
  MOVE T(200) MAXINT	# The maximum integer
  MOVE T(201) MAXINT	# The maximum integer
  # JUMP L(L200) eliminated
B007:
  LABEL L200
  MOVE T(300) T(100)
  MOVE T(555) L(L201)
  # JUMP L(L300) eliminated
B012:
  LABEL L300
  CJUMP GE T(300) T(201) L(L302) # Rearranged, L(L301) is now default
B013:
  LABEL L301
  MOVE T(201) T(300)
  CJUMP GE T(201) T(200) L(L302) # Rearranged, L(L303) is now default
B014:
  LABEL L303
  MOVE T(50) T(201)
  MOVE T(201) T(200)
  MOVE T(200) T(50)
  # JUMP L(L302) eliminated
B015:
  LABEL L302
  JUMP T(555)	# This can only be L201, L202, L203, or L204
# End of first trace, start of second trace 
B008:
  LABEL L201
  MOVE T(300) T(101)
  MOVE T(555) L(L202)
  JUMP L(L300)
# End of second trace, start of third trace
B009:
  LABEL L202
  MOVE T(300) T(102)
  MOVE T(555) L(L203)
  JUMP L(L300)
# End of third trace, start of fourth trace
B010:
  LABEL L203
  MOVE T(300) T(103)
  MOVE T(555) L(L204)
  JUMP L(L300)
# End of fourth trace, start of fifth trace
B011:
  LABEL L204
  # JUMP L(L400) eliminated.  Note that L204 and L400 are equiv.
B016:
  LABEL L400
  BINOP PLUS T(200) T(201)
  MOVE T(500) ACC
  BUILTIN printInt T(500)
  CJUMP LT T(500) 10 L(L500) # Dropped jump to L(L501)
B018:
  LABEL L501
  END
B017:
  LABEL L500	
  JUMP L(L40)     # Note that a jump to L500 should be a jump to L40
# End of fifth trace, start of sixth trace
B004:
  LABEL L52
  JUMP L(L53)   # Note that a jump to L52 should be a jump to L53
# End of sixth trace

Problem 2.C: Liveness Analysis

Compute the lifetime of each temporary used in the program. You may find it helpful to number the lines of the program to help with this, but you are not required to do so.

An Answer

I have annotated most instructions with their in, out, def, and use sets. If def or use is not explicitly stated, it is empty. Note that I did not annotate labels, since they provide no additional information. I have not written the lifetimes separately, since they are given by the in and out set.

01 LABEL LX001
   
    in:  Nothing
02 CALL printInt 10
    def: RV A0
    out: Nothing
   
03 LABEL L40
   
    in:  Nothing
04 CALL readInt
    def: RV
    out: RV
   
    in:  RV
05 MOVE T(100) RV
    def: T(100)
    use: RV
    out: T(100)
   
06 LABEL L50
   
    in:  T(100)
07 CALL readInt
    def: RV
    out: T(100) RV
   
    in: T(100) RV
08 MOVE T(101) RV
    def: T(101)
    use: RV
    out: T(100) T(101)
   
09 LABEL L51
   
    in:  T(100) T(101)
10 CALL readInt
    def: RV
    out: T(100) T(101) RV
   
    in:  T(100) T(101) RV
11 MOVE T(102) RV
    def: T(102)
    use: RV
    out: T(100) T(101) T(102)
   
    in:  T(100) T(101) T(102)
12 CALL readInt
    def: RV
    out: T(100) T(101) T(102) RV
   
    in: T(100) T(101) T(102) RV
13 MOVE T(103) RV
    def: T(103)
    use: RV
    out: T(100) T(101) T(102) T(103)
   
    in:  T(100) T(101) T(102) T(103)
14 CJUMP LT T(102) T(103) L(L52)
    use: T(102) T(103)
    out: T(100) T(101) T(102) T(103)  
   
15 LABEL L53
   
    in:  T(100) T(101) T(102) T(103)  
16 MOVE T(200) MAXINT	# The maximum integer
    def: T(200)
    out: T(100) T(101) T(102) T(103) T(200)
   
    in:  T(100) T(101) T(102) T(103) T(200)
17 MOVE T(201) MAXINT	# The maximum integer
    def: T(201)
    out: T(100) T(101) T(102) T(103) T(200) T(201)
   
18 LABEL L200
   
    in:  T(100) T(101) T(102) T(103) T(200) T(201)
19 MOVE T(300) T(100)
    def: T(300)
    use: T(100)
    out: T(101) T(102) T(103) T(200) T(201) T(300)
   
    in:  T(101) T(102) T(103) T(200) T(201) T(300)
20 MOVE T(555) L(L201)
    def: T(555)
    use: Nothing, because the label is a constant.
    out: T(101) T(102) T(103) T(200) T(201) T(300) T(555)
   
21 LABEL L300
   
    in:  T(101) T(102) T(103) T(200) T(201) T(300) T(555)
22 CJUMP GE T(300) T(201) L(L302)
    use: T(201) T(300)
    out: T(101) T(102) T(103) T(200) T(201) T(300) T(555)
   
23 LABEL L301
   
    in:  T(101) T(102) T(103) T(200) T(300) T(555)
24 MOVE T(201) T(300)
    def: T(201)
    use: T(300)
    out: T(101) T(102) T(103) T(200) T(201) T(555)
   
    in:  T(101) T(102) T(103) T(200) T(201) T(555)
25 CJUMP GE T(201) T(200) L(L302)
    use: T(200) T(201)
    out: T(101) T(102) T(103) T(200) T(201) T(555)
   
26 LABEL L303
   
    in:  T(101) T(102) T(103) T(200) T(201) T(555)
27 MOVE T(50) T(201)
    def: T(50)
    use: T(201)
    out: T(50) T(101) T(102) T(103) T(200) T(555)
   
    in:  T(50) T(101) T(102) T(103) T(200) T(555)
28 MOVE T(201) T(200)
    def: T(201)
    use: T(200)
    out: T(50) T(101) T(102) T(103) T(201) T(555)
   
    in:  T(50) T(101) T(102) T(103) T(201) T(555)
29 MOVE T(200) T(50)
    def: T(200)
    use: T(50)
    out: T(101) T(102) T(103) T(200) T(201) T(555)
   
30 LABEL L302
   
    in:  T(101) T(102) T(103) T(200) T(201) T(555)
31 JUMP T(555)	# This can only be L201, L202, L203, or L204
    use: T(555)
    out: T(101) T(102) T(103) T(200) T(201) T(555)
      Note that we need to look at the in set for
      L201, L202, L203, and L204.
   
32 LABEL L201
   
    in:  T(101) T(102) T(103) T(200) T(201) T(555)
33 MOVE T(300) T(101)
    def: T(300)
    use: T(101)
    out: T(101) T(102) T(103) T(200) T(201) T(300) T(555)
   
    in:  T(101) T(102) T(103) T(200) T(201) T(300) T(555)
34 MOVE T(555) L(L202)
    def: T(555)
    out: T(101) T(102) T(103) T(200) T(201) T(300) T(555)
   
    in:  T(101) T(102) T(103) T(200) T(201) T(300) T(555)
35 JUMP L(L300)
    out: T(101) T(102) T(103) T(200) T(201) T(300) T(555)
   
36 LABEL L202
   
    in:  T(101) T(102) T(103) T(200) T(201) T(555)
37 MOVE T(300) T(102)
    def: T(300)
    use: T(102)
    out: T(101) T(102) T(103) T(200) T(201) T(300) T(555)
   
    in:  T(101) T(102) T(103) T(200) T(201) T(300) T(555)
38 MOVE T(555) L(L203)
    def: T(555)
    out: T(101) T(102) T(103) T(200) T(201) T(300) T(555)
   
    in:  T(101) T(102) T(103) T(200) T(201) T(300) T(555)
39 JUMP L(L300)
    out: T(101) T(102) T(103) T(200) T(201) T(300) T(555)
   
40 LABEL L203
   
    in:  T(101) T(102) T(103) T(200) T(201) T(555)
41 MOVE T(300) T(103)
    def: T(300)
    use: T(103)
    out: T(101) T(102) T(103) T(200) T(201) T(300) T(555)
   
    in:  T(101) T(102) T(103) T(200) T(201) T(300) T(555)
42 MOVE T(555) L(L204)
    def: T(555)
    out: T(101) T(102) T(103) T(200) T(201) T(300) T(555)
   
    in:  T(101) T(102) T(103) T(200) T(201) T(300) T(555)
43 JUMP L(L300)
    out: T(101) T(102) T(103) T(200) T(201) T(300) T(555)
   
44 LABEL L204
   
45 LABEL L400
   
    in:  T(200) T(201)
46 BINOP PLUS T(200) T(201)
    def: ACC
    use: T(200) T(201)
    out: ACC
   
    in:  ACC
47 MOVE T(500) ACC
    def: T(500)
    use: ACC
    out: T(500)
   
    in:  T(500)
48 BUILTIN printInt T(500)
    def: RV,A0
    use: T(500)
    out: T(500)
   
    in:  T(500)
49 CJUMP LT T(500) 10 L(L500)
    use: T(500)
    out: Nothing
   
50 LABEL L501
   
    in:  Nothing
51 END
    out: Nothing
   
52 LABEL L500	
   
    in:  Nothing
53 JUMP L(L40)
    out: Nothing
   
54 LABEL L52
   
    in:  T(100) T(101) T(102) T(103)
55 JUMP L(L53)
    out: T(100) T(101) T(102) T(103)
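
The in and out sets above come from the usual equations: out is the union of the in sets of the successors, and in is use plus (out minus def). Here is a minimal sketch of the iteration in Python, applied to the three-instruction swap (instructions 27 through 29). The tuple representation is my own, and the fourth entry is a stand-in for everything after the swap: it simply uses the three temporaries that matter here.

```python
# Sketch only: each instruction is (defs, uses, successor indices).
def liveness(instrs):
    """Iterate the dataflow equations to a fixed point."""
    n = len(instrs)
    live_in = [set() for _ in range(n)]
    live_out = [set() for _ in range(n)]
    changed = True
    while changed:
        changed = False
        for i in reversed(range(n)):
            defs, uses, succs = instrs[i]
            out = set().union(*(live_in[s] for s in succs))
            new_in = set(uses) | (out - set(defs))
            if out != live_out[i] or new_in != live_in[i]:
                live_out[i], live_in[i] = out, new_in
                changed = True
    return live_in, live_out


SWAP = [
    ({"T50"}, {"T201"}, [1]),                 # 27 MOVE T(50) T(201)
    ({"T201"}, {"T200"}, [2]),                # 28 MOVE T(201) T(200)
    ({"T200"}, {"T50"}, [3]),                 # 29 MOVE T(200) T(50)
    (set(), {"T200", "T201", "T555"}, []),    # stand-in for the rest
]
```

Leaving aside T(101), T(102), and T(103), which the stand-in does not mention, the sets computed for instruction 27 agree with the listing: T(200), T(201), and T(555) are live in, and T(50), T(200), and T(555) are live out.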

Problem 2.D: Register Allocation

Determine a mapping of temporaries to registers using only four registers. You may need to spill some temporaries to memory, which will require updating the program.

If you spill to memory, use memory locations 31, 32, 33, ....

Note that you can use the standard registers for storing values when they are not used for other purposes. However, they do count towards the four. In this program, the standard registers used are RV (the return value from functions), ACC (the accumulator), and A0 (the argument to builtin printInt).

You need not concern yourself with the instruction counter, frame pointer, stack pointer, heap pointer, and return address. That is, you cannot use them explicitly, but they don't count toward the four registers, either.

An Answer

I observe that T(100), T(101), T(102), and T(103) have very similar lifetimes, that they live for much of the program, and that they're only used a few times. Hence, I will store them in memory locations 31, 32, 33, and 34, respectively, instead of in registers. While the standard is to save and load with a new register, I'm going to assume that that register has already been determined equal to another register.

01 LABEL LX001
   
    in:  Nothing
02 CALL printInt 10
    def: RV A0
    out: Nothing
   
03 LABEL L40
   
    in:  Nothing
04 CALL readInt
    def: RV
    out: RV
   
    in:  RV
05 MOVE M(31) RV
    use: RV
    out: Nothing
   
06 LABEL L50
   
    in:  Nothing
07 CALL readInt
    def: RV
    out: RV
   
    in: RV
08 MOVE M(32) RV
    use: RV
    out: Nothing
   
09 LABEL L51
   
    in:  Nothing
10 CALL readInt
    def: RV
    out: RV
   
    in:  RV
11 MOVE M(33) RV
    use: RV
    out: Nothing
   
    in: Nothing
12 CALL readInt
    def: RV
    out: RV
   
    in:  RV
13 MOVE M(34) RV
    use: RV
    out: Nothing

    in:  Nothing
14.1 MOVE R(1) M(33)
    def: R(1)
    out: R(1)

    in:  R(1)
14.2 MOVE R(2) M(34)
    def: R(2)
    out: R(1) R(2)

    in: R(1) R(2)
14.3 CJUMP LT R(1) R(2) L(L52)
    use: R(1) R(2)
    out: Nothing
   
15 LABEL L53
   
    in: Nothing
16 MOVE T(200) MAXINT	# The maximum integer
    def: T(200)
    out: T(200)
   
    in:  T(200)
17 MOVE T(201) MAXINT	# The maximum integer
    def: T(201)
    out: T(200) T(201)
   
18 LABEL L200
   
    in:  T(200) T(201)
19 MOVE T(300) M(31)
    def: T(300)
    out: T(200) T(201) T(300)
   
    in:  T(200) T(201) T(300)
20 MOVE T(555) L(L201)
    def: T(555)
    use: Nothing, because the label is a constant.
    out: T(200) T(201) T(300) T(555)
   
21 LABEL L300
   
    in:  T(200) T(201) T(300) T(555)
22 CJUMP GE T(300) T(201) L(L302)
    use: T(201) T(300)
    out: T(200) T(201) T(300) T(555)
   
23 LABEL L301
   
    in:  T(200) T(300) T(555)
24 MOVE T(201) T(300)
    def: T(201)
    use: T(300)
    out: T(200) T(201) T(555)
   
    in:  T(200) T(201) T(555)
25 CJUMP GE T(201) T(200) L(L302)
    use: T(200) T(201)
    out: T(200) T(201) T(555)
   
26 LABEL L303
   
    in:  T(200) T(201) T(555)
27 MOVE T(50) T(201)
    def: T(50)
    use: T(201)
    out: T(50) T(200) T(555)
   
    in:  T(50) T(200) T(555)
28 MOVE T(201) T(200)
    def: T(201)
    use: T(200)
    out: T(50) T(201) T(555)
   
    in:  T(50) T(201) T(555)
29 MOVE T(200) T(50)
    def: T(200)
    use: T(50)
    out: T(200) T(201) T(555)
   
30 LABEL L302
   
    in: T(200) T(201) T(555)
31 JUMP T(555)	# This can only be L201, L202, L203, or L204
    use: T(555)
    out: T(200) T(201) T(555)
      Note that we need to look at the in set for
      L201, L202, L203, and L204.
   
32 LABEL L201
   
    in:  T(200) T(201) T(555)
33 MOVE T(300) M(32)
    def: T(300)
    out: T(200) T(201) T(300) T(555)
   
    in:  T(200) T(201) T(300) T(555)
34 MOVE T(555) L(L202)
    def: T(555)
    out: T(200) T(201) T(300) T(555)
   
    in:  T(200) T(201) T(300) T(555)
35 JUMP L(L300)
    out: T(200) T(201) T(300) T(555)
   
36 LABEL L202
   
    in:  T(200) T(201) T(555)
37 MOVE T(300) M(33)
    def: T(300)
    out: T(200) T(201) T(300) T(555)
   
    in:  T(200) T(201) T(300) T(555)
38 MOVE T(555) L(L203)
    def: T(555)
    out: T(200) T(201) T(300) T(555)
   
    in:  T(200) T(201) T(300) T(555)
39 JUMP L(L300)
    out: T(200) T(201) T(300) T(555)
   
40 LABEL L203
   
    in:  T(200) T(201) T(555)
41 MOVE T(300) M(34)
    def: T(300)
    out: T(200) T(201) T(300) T(555)
   
    in:  T(200) T(201) T(300) T(555)
42 MOVE T(555) L(L204)
    def: T(555)
    out: T(200) T(201) T(300) T(555)
   
    in:  T(200) T(201) T(300) T(555)
43 JUMP L(L300)
    out: T(200) T(201) T(300) T(555)
   
44 LABEL L204
   
45 LABEL L400
   
    in:  T(200) T(201)
46 BINOP PLUS T(200) T(201)
    def: ACC
    use: T(200) T(201)
    out: ACC
   
    in:  ACC
47 MOVE T(500) ACC
    def: T(500)
    use: ACC
    out: T(500)
   
    in:  T(500)
48 BUILTIN printInt T(500)
    def: RV,A0
    use: T(500)
    out: T(500)
   
    in:  T(500)
49 CJUMP LT T(500) 10 L(L500)
    use: T(500)
    out: Nothing
   
50 LABEL L501
   
    in:  Nothing
51 END
    out: Nothing
   
52 LABEL L500	
   
    in:  Nothing
53 JUMP L(L40)
    out: Nothing
   
54 LABEL L52
   
    in:  Nothing
55 JUMP L(L53)
    out: Nothing

Note that we now never have more than four temporaries alive at one time. When we have four temporaries alive, they are T(200) T(201) T(300) and T(555).

Okay, now we're ready for a mapping.
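
As a sketch of how such a mapping might be computed (this is an illustration, not necessarily the mapping I had in mind), the Python below builds an interference relation from sets of registers and temporaries that are live at the same time and then assigns registers greedily. The live sets are my reading of the annotated listing above. The register name R4 for the fourth register is hypothetical, and I am assuming that T(500) interferes with RV and A0 because printInt defines them while T(500) is still live.

```python
# Sketch only: live sets read off the annotated listing; R4 is hypothetical.
LIVE_SETS = [
    {"T200", "T201", "T300", "T555"},
    {"T50", "T200", "T555"},
    {"T50", "T201", "T555"},
    {"T500", "RV", "A0"},      # printInt defines RV and A0 while T(500) lives
    {"R1", "R2"},
]
REGISTERS = ["RV", "ACC", "A0", "R4"]


def allocate(live_sets, registers):
    """Greedy assignment; raises if some temporary would have to spill."""
    interferes = {}
    for live in live_sets:
        for a in live:
            interferes.setdefault(a, set()).update(live - {a})
    mapping = {r: r for r in registers}      # machine registers are fixed
    for temp in sorted(t for t in interferes if t not in mapping):
        taken = {mapping[n] for n in interferes[temp] if n in mapping}
        free = [r for r in registers if r not in taken]
        if not free:
            raise RuntimeError("would need to spill " + temp)
        mapping[temp] = free[0]
    return {t: r for t, r in mapping.items() if t not in registers}
```

On these live sets the greedy pass succeeds with four registers. A greedy pass is not guaranteed to succeed in general, which is why the sketch raises rather than silently spilling.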

Problem 2.E: Understanding [2 points extra credit]

This part of the problem is optional!

What does this program do?

An Answer

This program repeatedly reads in four integers and prints out the sum of the two smallest. It stops when that sum is at least ten.

Problem 2.F: Improvement [2 points extra credit]

This part of the problem is optional!

It is possible to change the program in such a way that it does not spill any temporaries to memory, and still uses only four registers. However, this requires a somewhat ad-hoc technique. How might you use only four registers?

An Answer

We might observe that the four values read (originally into temporaries T(100), T(101), T(102), and T(103)) need only be read when needed. We can then delete the initial reads and replace each assignment to T(300) with

BUILTIN readInt
MOVE T(300) RV

Of course, if we've used the mapping above, we don't even need the MOVE.

This does, of course, require that we eliminate the needless test.


Problem 3: Extending Tiger [40 points]

Suppose we decided to extend Tiger to include referencing (address-of) and dereferencing (contents-of) operations. What changes would you have to make to the Tiger language and a typical Tiger compiler to support these new operations?

Your answer should be in a form that an appropriate person (e.g., one of your colleagues in this class) can understand and use as a ``checklist'' in updating the compiler. For each item, include not just what has to be updated (e.g., ``You will need to update the type checker'') but also how one might make the update, in general terms (e.g., ``Update transVar so that when it encounters a reference variable it ...'').

You should assume that the Tiger compiler being modified includes all of the components described in Chapters 1-12 of Appel, including the optional parts, such as an improved commutes operation.

An Answer

The Language

We begin by considering the changes to the language. Since Tiger is a typed language, we need to consider what types to use for references. Neither built-in type (integer and string) fits. Neither constructed type (records and arrays) really fits. Hence, Tiger also needs a Reference type constructor. We'll write this as reference to type-name.

Now, we'll also need a way to initialize references. We'll add a null constant to the language that allows us to initialize references to empty. This is not strictly necessary, but may be useful. As in the case of nil records, null matches many types.

Finally, we need operators to get the address of a variable or the contents of a reference. We'll use prefix address-of and contents-of.

Should we allow other operations on references? I'd prefer not.

We also need to decide whether we permit the programmer to do silly things, like return the address of a local variable or parameter. Since it's hard to check for these situations, we'll permit the programmers to do whatever they desire. That is, they can take the address of anything that has an lvalue.

Syntax

We'll need to update the grammar for those changes. While I expect that almost anyone can figure out the details, here are some of them.

I've chosen to treat contents-of as lvalues, so that you can assign to them. For example, if ir is a reference to an integer variable, then we want to be able to write things like

contents-of ir := 1

meaning ``set whichever variable ir points to to one.''

However, address-of really returns an expression, since we don't want it used as an lvalue. In particular, once we've gotten an address, all we can do with it is assign it to something, or grab its contents. Hence, it could only be used as an lvalue in absurd things like:

contents-of address-of x := 1

which is no different than

x := 1

The corresponding productions are:

ty -> reference to id
lvalue -> contents-of lvalue
exp -> address-of lvalue

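We'll also need to update the abstract syntax tree to handle these updates. We need three new nodes: ReferenceTy (for reference to type-name), ContentsVar (for contents-of), and AddressVar (for address-of).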

Note that this design does not allow us to get the address of constants. This is almost definitely a good idea. Otherwise, folks can write fascinating things like

twoadd = address-of 2
contents-of twoadd := 5

Type Checking

We'll need to update our tables to support reference types, although that requires little more than the ReferenceTy node described above.

To type check a ContentsVar, you must

To type check an AddressVar, you must

You'll also need to make sure that the type equivalence algorithm permits assignment of reference variables only when the base type is equivalent.
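
Here is one plausible shape for those checks, as a sketch in Python rather than in the compiler's own language. I am assuming types are written as "int", "string", or the tuple ("ref", T) for a reference to T, and that AST nodes are nested tuples; neither is Appel's representation. With this encoding, two reference types are equal exactly when their base types are equal.

```python
# Sketch only: types are "int", "string", or ("ref", T); nodes are tuples.
def type_of(node, env):
    """Return the type of a variable-like node, or raise TypeError."""
    kind = node[0]
    if kind == "SimpleVar":
        return env[node[1]]
    if kind == "ContentsVar":
        inner = type_of(node[1], env)
        # contents-of only makes sense on something of reference type.
        if not (isinstance(inner, tuple) and inner[0] == "ref"):
            raise TypeError("contents-of applied to a non-reference")
        return inner[1]
    if kind == "AddressVar":
        # address-of an lvalue of type T has type reference to T.
        return ("ref", type_of(node[1], env))
    raise TypeError("unknown node " + kind)
```

For example, if x is an integer and ir is a reference to an integer, then contents-of ir has type int, address-of x has the same type as ir, and contents-of x is rejected.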

Stack Frames

No changes are needed to stack frames.

However, the escapes function must be updated to indicate that a variable escapes if its address is taken.

Translation to Intermediate Representation Trees

The IRTs themselves will not be any different. (That is, we don't want new IRT nodes.) However, we do need translation guidelines for the two new nodes.
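
A sketch of what those guidelines might look like, in Python with trees as nested tuples. I am assuming that any variable involved escapes and therefore lives in the frame at a known offset from a frame pointer written TEMP FP; the `frame` table of offsets is hypothetical. Under that assumption, a variable translates to a MEM node, contents-of wraps one more MEM around the translation of its argument, and address-of strips the outer MEM from the translation of its argument.

```python
# Sketch only: `frame` maps variable names to hypothetical frame offsets.
def translate(node, frame):
    """Translate a variable-like node into a nested-tuple IR tree."""
    kind = node[0]
    if kind == "SimpleVar":
        offset = frame[node[1]]
        return ("MEM", ("BINOP", "+", ("TEMP", "FP"), ("CONST", offset)))
    if kind == "ContentsVar":
        # The value of the inner lvalue is an address; fetch what is there.
        return ("MEM", translate(node[1], frame))
    if kind == "AddressVar":
        target = translate(node[1], frame)
        if target[0] != "MEM":
            raise ValueError("address-of needs an lvalue stored in memory")
        return target[1]
    raise ValueError("unknown node " + kind)
```

No new IRT nodes appear in the output, which is why the rest of the compiler is unaffected.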

Everything Else

No changes should be required for the rest of the compiler, since we're producing the same intermediate representation trees.


Disclaimer: Often, these pages were created "on the fly" with little, if any, proofreading. Any or all of the information on the pages may be incorrect. Please contact me if you notice errors.

Source text last modified Wed Jan 13 17:07:49 1999.

This page generated on Wed Jan 13 17:09:53 1999 by SiteWeaver.

Contact our webmaster at rebelsky@math.grin.edu