CSC 161 Grinnell College Spring, 2010 Imperative Problem Solving and Data Structures

# Laboratory Exercise on the Representation of Floating-Point Numbers and Its Consequences

## Goals

This lab provides experience viewing the representation of floating-point real numbers on PC/Linux machines, and explores an application for which numerical round-off error has visible consequences.

You should read IEEE floating-point representations of real numbers by John Stone.

## Binary Representation of Floating-Point Numbers:

The first part of this lab asks you to review the bit-level storage of floating point numbers on PC/Linux computers.

1. Write the real numbers ±1, ±2, ±3, ±6, ±9 using the IEEE Standard for 32-bit Floating-Point Numbers.

2. Copy the program ~walker/c/data-rep.c to your account and compile it. Then enter real numbers, and conduct experiments to determine:

• which bit is the sign bit,
• which bits are used for the mantissa,
• which bits are used for the exponent, and
• what bias or excess is used in the storage of exponents.

3. Use your knowledge of the storage of real numbers to determine which real number comes "immediately after" 3.0, and which comes "immediately after" 10.0, on this system.

## Computing Area Under $y = x^2$:

[The following is an edited version of Section 5.5 from Introduction to Computing and Computer Science with Pascal by Henry M. Walker, Little, Brown, and Company, 1986 and is used with permission of the copyright holder.]

Suppose we are given a function y = f(x), and we want to find the area under the graph between x = a and x = b.
(The following figure illustrates the area under the curve between x = 1 and x = 3 when $f(x) = x^2$.)

### Discussion

In our solution, we will not try to compute the desired area exactly. Rather, we will consider a fairly simple approach, called the trapezoidal rule, which can give good approximations to the area. In this approach, we break down a large area into small pieces and approximate each of the small pieces by a trapezoid (as shown below).

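From geometry, we can compute the area of a trapezoid whose parallel sides have lengths $p$ and $q$ and lie a distance $w$ apart:

$$
\text{Area} = \frac{w\,(p + q)}{2}
$$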

Then we can approximate the entire area under the curve by adding up the areas of the trapezoids.

More precisely, we first divide the interval [a, b] into n equal pieces with endpoints $a = x_0, x_1, x_2, \ldots, x_n = b$. Then we use these pieces to divide the overall area into trapezoids. After we compute the area of each trapezoid, we add up these small areas. The final formula is

$$
\text{Approximate Area} = h\left[\frac{f(x_0)}{2} + f(x_1) + f(x_2) + \cdots + f(x_{n-1}) + \frac{f(x_n)}{2}\right]
$$

where $h = (b - a)/n$ and $x_j = a + jh$ for $j = 0, 1, 2, \ldots, n$. This formula is the trapezoidal rule. (The interested reader should consult books on calculus or numerical methods for the details of this and other methods.)
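To see where the factors of 1/2 on the two end terms come from, note that each trapezoid has width $h$ and parallel sides $f(x_{j-1})$ and $f(x_j)$. Every interior value $f(x_1), \ldots, f(x_{n-1})$ belongs to two neighboring trapezoids, while $f(x_0)$ and $f(x_n)$ each belong to only one:

$$
\sum_{j=1}^{n} \frac{h}{2}\bigl[f(x_{j-1}) + f(x_j)\bigr]
= h\left[\frac{f(x_0)}{2} + f(x_1) + f(x_2) + \cdots + f(x_{n-1}) + \frac{f(x_n)}{2}\right]
$$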

To make this formula more concrete, we apply it to $f(x) = x^2$ between x = 1 and x = 3 (as shown in an earlier figure), and we divide the interval [1, 3] into five pieces. This gives n = 5, a = 1, and b = 3. The overall interval [1, 3] has length 2; we divide it into five subintervals of length h = 2/5 = 0.4. The x values are $x_0 = 1$, $x_1 = 1.4$, $x_2 = 1.8$, $x_3 = 2.2$, $x_4 = 2.6$, $x_5 = 3$. The trapezoidal rule gives:

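$$
\begin{aligned}
\text{Approx. Area} &= h\left[\frac{f(x_0)}{2} + f(x_1) + f(x_2) + f(x_3) + f(x_4) + \frac{f(x_5)}{2}\right] \\
&= 0.4\left[\frac{f(1)}{2} + f(1.4) + f(1.8) + f(2.2) + f(2.6) + \frac{f(3)}{2}\right] \\
&= 0.4\left[\frac{1^2}{2} + (1.4)^2 + (1.8)^2 + (2.2)^2 + (2.6)^2 + \frac{3^2}{2}\right] \\
&= 8.72
\end{aligned}
$$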

#### Theoretical Accuracy of the Trapezoidal Rule

While it is hard to predict the accuracy of approximations with the trapezoidal rule, we can make several useful observations.

• The trapezoidal rule relies upon the actual area under the graph being close to the area under the trapezoid.

• If the graph of the function is a straight line, then the trapezoids should give exact results. Otherwise the trapezoidal rule cannot be expected to be exactly correct.

• If we divide the interval [a, b] into a large number of pieces, we can expect the area of each trapezoid to be close to the actual area under that piece of the graph.

• As n gets bigger, the approximation of area using the trapezoidal rule should get better.
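These observations can be made quantitative with a standard result from numerical analysis, stated here without proof and under the assumption that $f$ has a continuous second derivative on [a, b]. With $h = (b - a)/n$, there is some point $\xi$ in [a, b] for which

$$
\text{Approximate Area} - \int_a^b f(x)\,dx = \frac{(b - a)\,h^2}{12}\, f''(\xi)
$$

When the graph is a straight line, $f'' = 0$ and the error vanishes. Otherwise, doubling n halves h and so reduces this theoretical error by roughly a factor of 4. The formula describes exact arithmetic only; it says nothing about round-off.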

#### Practical Implications of Floating Point Error

Since most real numbers cannot be stored exactly as floating-point numbers, work with any individual floating-point number may involve a small amount of error. When many such numbers are combined in a long sequence of arithmetic operations, these small errors can accumulate and significantly affect the result.

#### Programming

This part of the lab asks you to write (or modify) a program that computes area using the trapezoidal rule in various ways. You then will experiment with this program to investigate the effect of numerical errors.

1. Write a simple C program (without functions) that does the following:

• read n, the number of trapezoids to be used in the computation,
• compute the area under $f(x) = x^2$ between x = 1 and x = 3, initializing a summing variable to (f(1) + f(3))/2 and then adding the intermediate values in order of ascending x values,
• compute this area again, adding all terms (including the two end terms) so that the entire sum goes in ascending order of x values,
• compute this area a third time, so that the entire sum goes in descending order of x values, and
• print the above 3 approximate areas in a table, with answers given to at least 6 decimal places.

The program must use float (NOT double) variables for all real numbers (as this will highlight numerical error issues).

2. Run your program for n = 5, 500, 800, 1000, 1200, 1500, 2000, 5000, 10000, and 100000.
(Note: the case n = 5 is computed above, so you can use it to test that your program works.)

3. Compute the exact area under this function (using calculus), and compare the exact answer with the various approximations. What conclusions can you make regarding the accuracy of the trapezoidal rule for various values of n?

Note: The Introduction to C Through Annotated Examples gives several versions of the trapezoidal rule, using functions and other fanciness. While you are welcome to use these examples as a base, your code should be simpler. Also, your code will need to perform the computations in several ways, as noted above.

## Work to turn in:

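• Written answers for steps 1, 2, and 3 of the binary representation part.
• A program listing for step 1 of the programming part.
• Test runs for step 2 of the programming part.
• A computation and commentary for step 3 of the programming part.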

This document is available on the World Wide Web as

```
http://www.cs.grinnell.edu/~walker/courses/161.sp10/lab-floats.shtml
```