# Outline of Class 35: More Binary Representation

Held: Monday, April 6, 1998

• If you haven't done so already, read the handout on IEEE representation of real numbers.
• Start reading chapter 8 of Bailey.
• Any questions on assignment five?
• Today's brown-bag lunch is on The Design of C++. C++ is an object-oriented language with a syntax not unlike Java's, but with some very different design decisions. I encourage you to attend (but I won't be there due to a prior commitment).

## Representing Different Ranges

• One disadvantage of the three standard representations of signed integers (signed magnitude, one's complement, two's complement) is that all three support only a fixed range of values.
• In a biased representation of a range of integers, you select a bias (offset) and then (traditionally) use the standard positive-only representation.
• If the bias is b, you represent n using the positive-only representation of b+n.
• To represent the numbers from -1 to 254 in one byte, you use a bias of 1.
• To represent the numbers from -255 to 0 in one byte, you use a bias of 255.
• Alternately, you can think of a biased representation as taking the series of bits, computing the corresponding positive integer, and then subtracting the bias to determine the actual value represented.
• The most typical bias is 2^(m-1), where m is the number of bits used to represent the number. This is called excess 2^(m-1).
• For one byte, the bias is 128. This means that the smallest number we can represent is -128 and the largest is 127.
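• To make this concrete, here is a minimal sketch in Java (my own illustration, not from Bailey) of the excess-128 representation described above: a value is stored as the ordinary positive-only value of the number plus the bias, and recovered by subtracting the bias again.

```java
// Sketch: excess-128 (bias 128) representation of a signed value in one byte.
public class Excess128 {
    static final int BIAS = 128;

    // Store n as the ordinary positive-only (unsigned) value n + 128.
    static int encode(int n) {
        if (n < -128 || n > 127) {
            throw new IllegalArgumentException("out of range for one byte");
        }
        return n + BIAS;              // a value from 0 to 255
    }

    // Recover the actual value by subtracting the bias.
    static int decode(int stored) {
        return stored - BIAS;
    }

    public static void main(String[] args) {
        System.out.println(encode(-128));   // 0   (all zero bits)
        System.out.println(encode(127));    // 255 (all one bits)
        System.out.println(decode(255));    // 127
    }
}
```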

## Representing Characters in Binary

• In most computers, characters are represented as integers, using a mapping between integers and characters (and back again).
• For example, one might decide that 'A' was 66, 'B' was 67, and so on and so forth.
• In designing such a code, you need to consider how many possible characters you wish to allow. This helps you determine how many bits or bytes to allow per character.
• It turns out that there are fewer than 128 different characters available on the standard US keyboard. So, we might use seven bits (perhaps rounded up to eight bits to fill a full byte) to represent our characters.
• However, as we incorporate other languages or other symbols (such as the copyright or registered trademark signs), we may need more bits and bytes.
• At one point, each manufacturer had its own encoding. This made transmission of data between machines more complicated than it should be. These days, there are standards.
• The standard on most US-based computers is ASCII, the American Standard Code for Information Interchange. It uses seven bits per character (usually stored in an eight-bit byte). You can determine the ASCII encoding by typing `man ascii` on our HPs.
• At one time, IBM promoted EBCDIC (I have no idea what it stands for, perhaps "extended binary coding of diverse characters"; a reference tells me that it's "extended binary-coded decimal interchange code"). One interesting aspect of EBCDIC is that it doesn't code the characters in sequence (that is, it's not guaranteed that if "A" has code n, then "B" has code n+1).
• The big coding standard these days is Unicode. Java supports it, and it's huge. I'm happy if you know it exists (you don't need to know the details). Unicode uses two bytes per character.
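• As a small aside, here is a sketch in Java showing that characters really are just integer codes: Java characters use Unicode, whose first 128 codes agree with ASCII, so casting between `char` and `int` exposes the mapping (and, unlike EBCDIC, consecutive letters have consecutive codes).

```java
// Sketch: a character is just a small integer with an agreed-upon meaning.
public class CharCodes {
    public static void main(String[] args) {
        char letter = 'A';
        int code = (int) letter;         // the ASCII/Unicode code for 'A'
        char next = (char) (code + 1);   // the character with the next code
        System.out.println(code);        // 65
        System.out.println(next);        // B -- the letters are coded in sequence
    }
}
```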

## Representing Real Numbers in Binary

• What if we want to deal with numbers that may have a fractional part (something after the decimal point)?
• We need to think about the meaning of bits after the point. Traditionally, we continue the meaning we use in decimal.
• The first bit after the binary point is 2^-1. The next bit is 2^-2, and so on and so forth.
• For example 0.1 is 1/2, 0.01 is 1/4, and 0.11 is 3/4.
• Let's try some exercises in conversion (and think about our conversion algorithm; a sketch of one such algorithm appears after this list)
• Fraction = decimal = binary
• 7/16 = .4375 = ?
• 1/3 = .333... = ?
• 1/10 = .1 = ?
• Observe that this changes the numbers we can represent with a finite number of digits. For example, our handout suggests that 2/5 cannot be represented in a finite number of binary digits.
• Nonetheless, this seems like the best way to represent numbers with fractional parts.
• Are there others? Yes. One might use sets of four bits to represent decimal digits. This is clearly less efficient.
• However, there are still further design decisions to make. For example, how do we place the decimal point?
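One common conversion algorithm repeatedly multiplies the fractional part by two and peels off the whole part as the next bit. Here is a minimal sketch in Java (my own illustration, not from the handout) of that idea.

```java
// Sketch: convert a fraction 0 <= x < 1 to binary digits by repeated
// multiplication by 2; each whole part that appears is the next bit
// after the binary point.
public class FractionToBinary {
    static String toBinary(double x, int maxBits) {
        StringBuilder bits = new StringBuilder(".");
        for (int i = 0; i < maxBits && x != 0.0; i++) {
            x = x * 2;
            if (x >= 1.0) {
                bits.append('1');
                x = x - 1.0;
            } else {
                bits.append('0');
            }
        }
        return bits.toString();
    }

    public static void main(String[] args) {
        System.out.println(toBinary(7.0 / 16, 12));  // .0111 -- terminates
        System.out.println(toBinary(1.0 / 10, 12));  // .000110011001 -- repeats forever
    }
}
```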

### Fixed Point

• In fixed-precision or fixed-point representation, you pick some number of bits that come after the decimal point, and use those to represent the fractional part.
• This limits your accuracy for small numbers. For example, if you've only allowed three bits after the decimal point, your accuracy is limited to about 1/8. This means that you'd represent both (1/16) and (-3/64) as 0.000.
• This limits the overall size of your numbers. For example, if you've only allocated 13 bits to the whole part, your largest number can't be bigger than 2^13-1, or about 8,000.
• However, computation is relatively cheap. You can simply use standard integer computation and then shift the decimal point.
• On the other hand, this can limit accuracy.
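A minimal sketch in Java (my own illustration), assuming three bits after the binary point as in the example above: a value is stored as the whole part of the value times eight, so addition is plain integer addition and multiplication needs only a final shift.

```java
// Sketch: fixed-point numbers with three bits after the binary point.
// A value x is stored as the whole part of x * 8; the point is implicit.
public class Fixed3 {
    static final int SCALE = 1 << 3;   // 2^3 = 8

    static int encode(double x) { return (int) (x * SCALE); }  // extra bits are dropped
    static double decode(int f) { return (double) f / SCALE; }

    static int add(int a, int b) { return a + b; }          // ordinary integer addition
    static int mul(int a, int b) { return (a * b) >> 3; }   // shift the point back in place

    public static void main(String[] args) {
        System.out.println(decode(encode(1.0 / 16)));                // 0.0 -- too small to capture
        System.out.println(decode(mul(encode(1.5), encode(2.5))));   // 3.75
    }
}
```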

### Floating Point

• To handle the aforementioned problems, you might instead let the decimal point move ("float") and use extra bits to indicate where the decimal point is positioned.
• In floating-point representation, you use something similar to scientific notation (+/- n.nnnn * 10^x), and represent
• the digits,
• the exponent, and
• the sign separately.
• For example, in decimal .125 might be represented as
• `+` for the sign
• `125` for the digits (1.25)
• `-1` for the exponent (1.25 * 10^-1 = .125)
• As in the cases above, some things get a little bit confusing as we move to binary. In particular, our exponents are powers of two, instead of powers of ten.
• So, you would not represent .125 as
• `0` for the sign (positive)
• `01111101` for the 125
• `11111111` for the -1 (in two's complement)
• Instead, you might represent .125 as
• `0` for the sign (positive)
• `00000001` for 1
• `11111101` for exponent (-3 in two's complement)
• Because .125 is 1/8 or .001 in fixed-precision binary.
• It turns out that arithmetic is complicated in floating point. Plauger tells us that it takes as much microcode to implement the basic floating point operations as it does to implement everything else on a typical small computer.
• Designing floating point representations (and computation) is still nontrivial. You must still concern yourselves with a number of issues.
• How many bits will you use for each component?
• How will you represent each component? Signed-magnitude, two's complement, as a biased value? Will you use the same representation for each component, or different ones?
• Will you use a separate bit for the sign (in effect, doing signed-magnitude)?
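To make the .125 example above concrete, here is a minimal sketch in Java of a toy (completely non-standard) format: one sign bit, an eight-bit unsigned mantissa, and an eight-bit two's complement exponent, with the value computed as the mantissa times two to the exponent.

```java
// Sketch of a toy floating-point format (not a real standard):
// a sign bit, an 8-bit unsigned mantissa, and an 8-bit two's complement exponent.
// The value represented is (+/-) mantissa * 2^exponent.
public class ToyFloat {
    static double value(int sign, int mantissa, byte exponent) {
        double magnitude = mantissa * Math.pow(2, exponent);
        return (sign == 0) ? magnitude : -magnitude;
    }

    public static void main(String[] args) {
        // .125 from the notes: mantissa 00000001 (1), exponent 11111101 (-3)
        System.out.println(value(0, 0b00000001, (byte) 0b11111101));  // 0.125
    }
}
```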

### IEEE Standard Representation

• The IEEE (the Institute of Electrical and Electronics Engineers) serves as a standards body for many issues in computing. They issue language, protocol, design, and other standards.
• (The IEEE does a number of other things, but that is the most pertinent to our current concerns.)
• One of their most widely used standards is the IEEE Standard for Binary Floating-Point Arithmetic (IEEE standard 754), which discusses not just representation of floating point numbers, but also computation with those numbers. This standard was released in 1985.
• As suggested earlier, some of the first issues in the design of a floating point representation are how to allocate bits and represent components.
• The IEEE single-precision representation uses 32 bits, with
• one bit for the sign (effectively using signed-magnitude)
• 23 bits for the mantissa
• eight bits for the exponent
• The actual order is `SEEEEEEEEMMMMMMMMMMMMMMMMMMMMMMM` (one sign bit, then eight exponent bits, then twenty-three mantissa bits).
• The mantissa is represented as an unsigned value. The base position of the decimal is right before the mantissa, although the exponent can shift it.
• The exponent uses a 127-bias representation.
• But there are also some tricks ...
• The smallest exponent allowed is -126 (represented as 00000001) and the largest exponent allowed is 127 (represented as 11111110).
• Observe that this leaves 00000000 and 11111111 as undefined exponent strings.
• 00000000 is used for numbers "close to zero" (including zero itself) and changes how the rest of the representation is interpreted
• 11111111 is used for "error" values (infinities and not-a-number results)
• In the standard representations (those not close to zero), a special trick is used to get one more bit of accuracy.
• For nonzero numbers, it's clear that in standard scientific notation, using binary (+/- b.bbb * 2^x), the mantissa will always be between 1 and 2.
• If it's less than one, we should simply shift the bits left and decrease the exponent
• If it's more than two, we should shift the bits right
• So, we can just assume the leftmost bit is 1 and not bother including it in our representation.
• This bit is called the hidden bit.
• In the representations of numbers close to zero, the hidden bit isn't used.
• Exercises
• What is 0 1000 1000 00000000000000000000000?
• What is 1 1000 1000 00000000000000000000000?
• What is 0 0000 1000 00000000000000000000000?
• What is 0 1000 1000 01000000000000000000000?
• How do you represent 0?
• How do you represent 1?
• What is the smallest number you can represent?
• What is the largest number you can represent?
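• One way to check your answers: Java's `Float.floatToIntBits` hands back the IEEE single-precision bit pattern of a `float`, and shifts and masks pull the three fields apart. A minimal sketch (handling only the ordinary case, where the exponent string is neither all zeros nor all ones) follows.

```java
// Sketch: pull apart the IEEE single-precision representation of a float.
// Handles only "normal" numbers (exponent field not 00000000 or 11111111).
public class DecodeFloat {
    static void decode(float f) {
        int bits = Float.floatToIntBits(f);
        int sign     = (bits >>> 31) & 0x1;     // 1 sign bit
        int exponent = (bits >>> 23) & 0xFF;    // 8 exponent bits, excess-127
        int mantissa = bits & 0x7FFFFF;         // 23 mantissa bits (hidden bit not stored)
        // Re-attach the hidden bit when printing the mantissa.
        String mantissaBits = Integer.toBinaryString(mantissa | (1 << 23)).substring(1);
        System.out.println("sign " + sign
            + "  exponent " + (exponent - 127)
            + "  mantissa 1." + mantissaBits);
    }

    public static void main(String[] args) {
        decode(0.125f);   // sign 0  exponent -3  mantissa 1.000...
        decode(-6.5f);    // sign 1  exponent 2   mantissa 1.101000...
    }
}
```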

