Class 34: More Binary Representation
Held Tuesday, April 6
- Real Numbers
- The IEEE Standard
- Reminder: The exam is due on
Wednesday. It is not intended to be overly long, but I know that
some of you seemed to spend endless hours during break on it. This
is your final chance for questions (and yes, I have done the exam).
- Surprise bug in ArrayBasedSimpleOrderedList has been corrected.
- You'll need to make some other changes (or download the related
interfaces) in order to get it to compile. The easiest thing is
to drop the ``implements'' and ``imports''.
- Assignment 8 is ready. It's
due next Tuesday.
- I'll try to have something that simulates the output later this week.
- In most computers, characters are represented as integers, using
a mapping between integers and characters (and back again).
- For example, one might decide that 'A' was 66, 'B' was 67, and so
on and so forth.
- In designing such a code, you need to consider how many possible
characters you wish to allow. This helps you determine how many
bits or bytes to allow per character.
- It turns out that there are fewer than 128 different characters
available on the standard US keyboard. So, we might use seven bits
(or even expand to eight bits for a whole byte) to represent our
characters.
- However, as we incorporate other languages or other symbols (such
as the copyright or registered trademark signs), we may need more bits.
- At one point, each manufacturer had its own encoding. This made
transmission of data between machines more complicated than it should
have been, since there had to be translation as data moved between
machines. These days, there are standards.
- The standard on most US-based computers is ASCII, the American
Standard Code for Information Interchange. It defines seven-bit codes,
which are usually stored one per eight-bit byte. You can determine the
ASCII encoding by typing
man ascii on our HP's.
- At one time, IBM promoted EBCDIC (I have no idea what it stands for,
perhaps "extended binary coding of diverse characters"; a reference
tells me that it's "extended binary-coded decimal interchange code").
One interesting aspect of EBCDIC is that it doesn't code the characters
in sequence (that is, it's not guaranteed that if "A" has code n, then
"B" has code n+1).
- The big coding standard these days is Unicode. Java supports it,
and it's huge. I'm happy if you know it exists (you don't
need to know the details). Java's Unicode characters use two bytes each.
The first 128 Unicode codes (those whose upper bits are all 0) are
the same as ASCII.
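- A minimal Java sketch of the character/integer mapping described above. The class name CharCodes and its two helper methods are made up for illustration; the casts and Character.SIZE are standard Java.

```java
// Illustrative only: CharCodes and its helpers are hypothetical names,
// not part of the course code.
class CharCodes {
    // Return the integer code that Java uses for a character.
    static int codeOf(char c) {
        return (int) c;
    }

    // Return the character that a given integer code maps to.
    static char charOf(int code) {
        return (char) code;
    }

    public static void main(String[] args) {
        int a = codeOf('A');
        System.out.println("'A' has code " + a);
        // In ASCII and Unicode the capital letters are coded in sequence.
        System.out.println("The next code is " + charOf(a + 1));
        // Number of bits in a Java char.
        System.out.println("Bits per char: " + Character.SIZE);
    }
}
```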
- What if we want to deal with numbers that may have a fractional
part (something after the decimal point)?
- We need to think about the meaning of bits after the point.
Traditionally, we continue the meaning we use in decimal.
- The first bit after the binary point is 2^-1. The next bit
is 2^-2, and so on and so forth.
- For example, in binary, 0.1 is 1/2, 0.01 is 1/4, and 0.11 is 3/4.
- Let's try some exercises in conversion (and think about our
- Fraction = decimal = binary
- 1/8 = .125 = ?
- 7/16 = .4375 = ?
- 1/3 = .333... = ?
- 1/10 = .1 = ?
- Observe that this changes the numbers we can represent with
a finite number of digits. For example, our
handout suggests that
2/5 cannot be represented in a finite number of binary digits.
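- One way to do these conversions, sketched in Java: repeatedly double the fraction and peel off the whole part as the next bit. This is just one possible algorithm, not necessarily the one on the handout. The class name BinaryFraction, the method toBinary, and the maxBits cutoff are assumptions made for this example. The fraction is kept as numerator/denominator so that the arithmetic is exact, and it is assumed to be between 0 and 1.

```java
// Illustrative only: BinaryFraction and toBinary are hypothetical names.
class BinaryFraction {
    // Convert numerator/denominator (assumed to be between 0 and 1) to
    // binary, producing at most maxBits bits after the binary point.
    static String toBinary(long numerator, long denominator, int maxBits) {
        StringBuilder bits = new StringBuilder("0.");
        long remainder = numerator;
        for (int i = 0; i < maxBits && remainder != 0; i++) {
            remainder *= 2;                   // doubling shifts the point right
            if (remainder >= denominator) {   // a whole part appeared: bit is 1
                bits.append('1');
                remainder -= denominator;
            } else {                          // no whole part: bit is 0
                bits.append('0');
            }
        }
        if (remainder != 0) {
            bits.append("...");               // more bits remain after maxBits
        }
        return bits.toString();
    }

    public static void main(String[] args) {
        System.out.println(toBinary(1, 2, 16));   // 1/2
        System.out.println(toBinary(3, 4, 16));   // 3/4
        System.out.println(toBinary(2, 5, 16));   // 2/5 never terminates
    }
}
```

- If the remainder ever repeats, the bits repeat from that point on, which is why a fraction like 2/5 has no finite binary representation.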
- Nonetheless, this seems like the best way to represent numbers
with fractional parts.
- Are there others? Yes. One might use sets of four bits to
represent decimal digits. This is clearly less efficient.
- However, there are still further design decisions to make. For
example, how do we place the decimal point?
- In fixed-precision or fixed-point representation,
you pick some number of bits that come
after the decimal point, and use those to represent the fractional
part.
- This limits your accuracy for small numbers. For example, if you've
only allowed three bits after the decimal point, your accuracy is
limited to about 1/8. This means that you'd represent both (1/16) and
(-3/64) as 0.000.
- This limits the overall size of your numbers. For example, if you've
only allocated 13 bits to the whole part, your largest whole part can't
be bigger than 2^13-1, or about 8,000.
- However, computation is relatively cheap. You can simply use standard
integer computation and then shift the decimal point.
- On the other hand, this can limit accuracy.
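- A minimal sketch of fixed-point arithmetic in Java, assuming three bits after the point as in the example above. The class FixedPoint and its method names are made up for illustration. Values are stored as ordinary ints scaled by 2^3, and converting to fixed point truncates toward zero.

```java
// Illustrative only: FixedPoint and its methods are hypothetical names.
class FixedPoint {
    static final int FRACTION_BITS = 3;   // three bits after the point

    // Scale by 2^3 and truncate toward zero to get the stored integer.
    static int toFixed(double value) {
        return (int) (value * (1 << FRACTION_BITS));
    }

    // Undo the scaling to see what value the stored integer stands for.
    static double toDouble(int fixed) {
        return fixed / (double) (1 << FRACTION_BITS);
    }

    // Addition is plain integer addition; the point stays where it is.
    static int add(int a, int b) {
        return a + b;
    }

    // Multiplication is integer multiplication followed by a shift that
    // puts the point back. (For negative products, >> rounds toward
    // negative infinity rather than toward zero.)
    static int multiply(int a, int b) {
        return (a * b) >> FRACTION_BITS;
    }

    public static void main(String[] args) {
        int x = toFixed(1.5);
        int y = toFixed(2.5);
        System.out.println(toDouble(add(x, y)));
        System.out.println(toDouble(multiply(x, y)));
        // Values smaller than 1/8 in magnitude are lost entirely.
        System.out.println(toFixed(1.0 / 16));
    }
}
```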
- To handle the aforementioned problems,
you might instead let the decimal point move (``float'') and use extra
bits to indicate where the decimal point is positioned.
- In floating-point representation, you use something similar
to scientific notation (+/- n.nnnn * 10^x), and represent
- the digits (mantissa),
- the exponent, and
- the sign separately.
- For example, in decimal .125 might be represented as
+ for the sign
1.25 for the digits
-1 for the exponent (10^-1)
- As in the cases above, some things get a little bit confusing as
we move to binary. In particular, our exponents are powers of
two, instead of powers of ten.
- So, you would not represent .125 as
0 for the sign
01111101 for the 125
11111111 for the -1 (in two's complement)
- Instead, you might represent .125 as
0 for the sign
00000001 for 1
11111101 for the exponent (-3 in two's complement)
- Because .125 is 1/8, or .001 in fixed-precision binary, which is
1 * 2^-3.
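- A small Java sketch that checks this decomposition. The class BinaryScientific and the method fromParts are made up for the example; Math.scalb (multiply by a power of two) and Math.getExponent are standard library methods.

```java
// Illustrative only: BinaryScientific and fromParts are hypothetical names.
class BinaryScientific {
    // Rebuild a value from its three parts: sign * mantissa * 2^exponent.
    // The sign is given as +1 or -1.
    static double fromParts(int sign, double mantissa, int exponent) {
        return sign * Math.scalb(mantissa, exponent);
    }

    public static void main(String[] args) {
        // Sign +, mantissa 1, exponent -3 gives 1/8.
        System.out.println(fromParts(+1, 1.0, -3));
        // Going the other way: ask Java for the power-of-two exponent.
        System.out.println(Math.getExponent(0.125));
    }
}
```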
- It turns out that arithmetic is complicated in floating point.
Plauger tells us that it takes as much microcode to implement
the basic floating point operations as it does to implement
everything else on a typical small computer.
- Designing floating point representations (and computation) is still
nontrivial. You must still concern yourselves with a number of issues.
- How many bits will you use for each component?
- How will you represent each component? Signed-magnitude, two's
complement, as a biased value? Will you use the same representation
for each component, or different ones?
- Will you use a separate bit for the sign (in effect, doing
signed-magnitude)?
- The IEEE (Institute of Electrical and Electronics Engineers)
serves as a standards body for many issues in computing.
They issue language, protocol, design, and other standards.
- (The IEEE does a number of other things, but that is the most
pertinent to our current concerns.)
- One of their most widely used standards is the IEEE Standard
for Binary Floating-Point Arithmetic (IEEE standard 754)
which discusses not just representation of floating point numbers,
but also computation with those numbers. This standard was
released in 1985.
- As suggested earlier, some of the first issues
in the design of a floating point representation are how to allocate bits
and represent components.
- The IEEE single-precision representation uses 32 bits, with
- one bit for the sign (effectively using signed-magnitude)
- 23 bits for the mantissa
- eight bits for the exponent
- The actual order is sign, then exponent, then mantissa.
- The mantissa is represented as an unsigned value. The base position
of the decimal is right before the mantissa, although the exponent
can shift it.
- The exponent uses a 127-bias representation.
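- A minimal Java sketch that pulls the three fields out of a float, assuming the layout of one sign bit, then eight exponent bits, then 23 mantissa bits. The class FloatFields and its method names are made up for illustration; Float.floatToIntBits is the standard library call that exposes the 32 bits as an int.

```java
// Illustrative only: FloatFields and its methods are hypothetical names.
class FloatFields {
    // The top bit: 0 for positive, 1 for negative.
    static int sign(float f) {
        return Float.floatToIntBits(f) >>> 31;
    }

    // The next eight bits, still in 127-bias form.
    static int biasedExponent(float f) {
        return (Float.floatToIntBits(f) >>> 23) & 0xFF;
    }

    // The low 23 bits, as an unsigned value.
    static int mantissaBits(float f) {
        return Float.floatToIntBits(f) & 0x7FFFFF;
    }

    public static void main(String[] args) {
        float f = 0.125f;
        System.out.println("sign: " + sign(f));
        System.out.println("biased exponent: " + biasedExponent(f));
        System.out.println("actual exponent: " + (biasedExponent(f) - 127));
        System.out.println("mantissa bits: " + Integer.toBinaryString(mantissaBits(f)));
    }
}
```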
- But there are also some tricks ...
- The smallest exponent allowed is -126 (represented as 00000001) and the
largest exponent allowed is 127 (represented as 11111110).
- Observe that this leaves 00000000 and 11111111 as undefined exponents.
- 00000000 is used for ``close to zero'' and affects the interpretation
of the other bits.
- 11111111 is used for infinity (when the mantissa bits are all 0) and
for ``error'' (when they are not); in the error case, the other bits
can specify what kind of error.
- In the standard representations (those not close to zero), a special
trick is used to get one more bit of accuracy.
- For nonzero numbers, it's clear that in standard scientific notation,
using binary (+/- b.bbb * 2^x), the mantissa will always be
at least 1 and less than 2.
- If it's less than one, we should simply shift the bits left and
decrease the exponent.
- If it's two or more, we should shift the bits right and
increase the exponent.
- So, we can just assume the leftmost bit is 1 and not bother including it
in our representation.
- This bit is called the hidden bit.
- In the representations of numbers close to zero, the hidden bit isn't
used.
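- The rules above can be put together in a hand-written decoder, sketched here in Java. The class DecodeFloat and the method decode are made up for the example. Two details come from the IEEE standard rather than from these notes: numbers close to zero (exponent bits 00000000) use a fixed exponent of -126 with no hidden bit, and exponent bits 11111111 mean infinity when the mantissa is 0 and NaN (``not a number'') otherwise. You can check the decoder against Java's own Float.intBitsToFloat.

```java
// Illustrative only: DecodeFloat and decode are hypothetical names.
class DecodeFloat {
    // Decode a 32-bit single-precision pattern by hand.
    static double decode(int bits) {
        int sign = (bits >>> 31) == 0 ? 1 : -1;
        int exponent = (bits >>> 23) & 0xFF;   // eight bits, 127-bias
        int mantissa = bits & 0x7FFFFF;        // 23 bits after the point
        double fraction = mantissa / (double) (1 << 23);

        if (exponent == 0xFF) {
            // All ones: infinity if the mantissa is 0, otherwise NaN.
            return mantissa == 0 ? sign * Double.POSITIVE_INFINITY : Double.NaN;
        }
        if (exponent == 0) {
            // Close to zero: no hidden bit, exponent fixed at -126.
            return sign * Math.scalb(fraction, -126);
        }
        // Standard case: add the hidden bit and remove the bias.
        return sign * Math.scalb(1.0 + fraction, exponent - 127);
    }

    public static void main(String[] args) {
        int bits = 0x3E000000;   // 0 01111100 00000000000000000000000
        System.out.println(decode(bits));
        System.out.println(Float.intBitsToFloat(bits));
    }
}
```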
- What is 0 1000 1000 00000000000000000000000?
- What is 1 1000 1000 00000000000000000000000?
- What is 0 0000 1000 00000000000000000000000?
- What is 0 1000 1000 01000000000000000000000?
- How do you represent 0?
- How do you represent 1?
- What is the smallest number you can represent?
- What is the largest number you can represent?
- Created Monday, January 11, 1999.
- Added short summary on Friday, January 22, 1999.
- Filled in the details on Tuesday, March 29, 1999. Most were based
on outline 35
of CS152 98S.
- Made a few updates on Tuesday, April 6, 1999.