In fact, there are cases in which converting the data back and forth
actually degrades it. For instance, when Social Security numbers are
converted to and from integer values, the SSNs that begin with one or more
leading zeroes lose those digits, so that (for instance) the number that is
read in as 004-79-4226 becomes 4-79-4226 when it
is written out.
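The loss of leading zeroes is easy to reproduce. The following is a minimal sketch, not part of the original discussion; the function name through_integer is hypothetical, and the code assumes a Turbo/Free Pascal-style string type and a 32-bit longint rather than HP Pascal specifically.

```pascal
program ssn_round_trip;
{ Sketch: sends a digit string through an integer and back again. }
{ Assumes a Turbo/Free Pascal-style string type and a 32-bit longint. }

function through_integer (digits: string): string;
var
  value: longint;
    { the integer form of the digit string }
  position: integer;
    { counts off character positions in the string }
  restored: string;
    { the string rebuilt from the integer }
begin
  { string -> integer: accumulate one decimal digit at a time }
  value := 0;
  for position := 1 to length (digits) do
    value := value * 10 + (ord (digits[position]) - ord ('0'));

  { integer -> string: peel digits off the right-hand end }
  restored := '';
  while value > 0 do begin
    restored := chr (ord ('0') + value mod 10) + restored;
    value := value div 10
  end;
  through_integer := restored
end;

begin
  { 004-79-4226 without the hyphens }
  writeln (through_integer ('004794226'))    { prints 4794226 }
end.
```

The integer 4794226 carries no record of how many digits the original numeral had, so the two leading zeroes cannot be recovered without extra information such as a fixed field width.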
However, the string representation is not very economical in its use of space. Even though there are only ten possible digits, a full byte is required for each one. A nine-digit Social Security number requires nine bytes of memory; if it is converted to an integer, it requires only four bytes. If you have lots of Social Security numbers to keep track of and not all that much available memory, this difference could be significant. There's a trade-off between economical use of time (string representations) and economical use of memory (integer representations).
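One compromise is the binary-coded decimal (BCD) representation, in which each decimal digit is stored in four bits: 0000 for 0, 0001 for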
1, 0010 for 2, and so on up to 1001 for
9. Two of these four-bit representations can be packed, side
by side, into one byte, so you wind up using only half as many bytes as
there are digits in the decimal numeral. A Social Security number occupies
four and a half (or, realistically, five) bytes of storage.
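To make the two-digits-per-byte layout concrete, here is a sketch that does the packing by hand with ordinary arithmetic. All of the names (pack_digits, byte_array, NDIGITS, NBYTES) are illustrative assumptions, and the choice to put the first digit of each pair in the left half-byte is just one possible convention; a compiler's packed arrays may lay the bits out differently.

```pascal
program pack_nibbles;
{ Sketch only: packs decimal digits two to a byte by hand. }
{ Assumes a Turbo/Free Pascal-style string type. }
const
  NDIGITS = 9;
  NBYTES = 5;                 { half of NDIGITS, rounded up }
type
  byte_array = array [1 .. NBYTES] of 0 .. 255;
var
  packed_form: byte_array;
  byte_index: integer;

procedure pack_digits (digits: string; var bytes: byte_array);
var
  position: integer;
    { counts off character positions in the string }
  slot: integer;
    { the byte that holds the current digit }
  digit: integer;
begin
  for slot := 1 to NBYTES do
    bytes[slot] := 0;
  for position := 1 to NDIGITS do begin
    digit := ord (digits[position]) - ord ('0');
    slot := (position + 1) div 2;     { digits 1 and 2 share byte 1, etc. }
    if odd (position) then
      bytes[slot] := bytes[slot] + digit * 16    { left half-byte }
    else
      bytes[slot] := bytes[slot] + digit         { right half-byte }
  end
end;

begin
  pack_digits ('004794226', packed_form);
  for byte_index := 1 to NBYTES do
    write (packed_form[byte_index], ' ');
  writeln
end.
```

For the digits 004794226 the five byte values are 0, 71, 148, 34, and 96 (hexadecimal 00, 47, 94, 22, 60); the right half of the last byte is the unused half-byte that rounds four and a half bytes up to five.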
Since the conversion from a string of digit characters to the BCD
representation simply involves subtracting ord ('0') from each
character and storing the result into the right four bits of the
appropriate byte, it's much faster than a full conversion to integer. For
example, here's how it might be coded in HP Pascal, assuming a
left-justified string of no more than MAXLEN digits:
type
  decimal_digit = 0 .. 9;
  bcd = packed array [1 .. MAXLEN] of decimal_digit;

procedure digit_string_to_bcd (digit_string: string[MAXLEN];
                               var result: bcd);
var
  bcd_position: integer;
    { counts off half-bytes in the BCD representation }
  position: integer;
    { counts off character positions in the string representation }
begin
  { copy the digits from right to left, so that the number ends up
    right-justified in the BCD representation }
  bcd_position := MAXLEN;
  for position := strlen (digit_string) downto 1 do begin
    result[bcd_position] := ord (digit_string[position]) - ord ('0');
    bcd_position := bcd_position - 1
  end;
  { fill any remaining positions on the left with zeroes }
  for bcd_position := bcd_position downto 1 do
    result[bcd_position] := 0
end;
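Going back from BCD to a digit string is the mirror image: add ord ('0') to each digit. The sketch below is an assumption-laden illustration rather than part of the original code: bcd_to_digit_string and digit_string_type are hypothetical names, MAXLEN is fixed at 9, and the string-to-BCD procedure is restated for a Turbo/Free Pascal-style dialect (length instead of strlen, and a named string type for the parameter).

```pascal
program bcd_round_trip;
{ Sketch: converts a digit string to BCD and back again. }
{ Assumptions: MAXLEN = 9 and a Turbo/Free Pascal-style string type. }
const
  MAXLEN = 9;
type
  decimal_digit = 0 .. 9;
  bcd = packed array [1 .. MAXLEN] of decimal_digit;
  digit_string_type = string[MAXLEN];
var
  ssn: bcd;
  text_form: digit_string_type;

procedure digit_string_to_bcd (digit_string: digit_string_type;
                               var bcd_value: bcd);
var
  bcd_position, position: integer;
begin
  { copy digits right to left, then zero-fill on the left }
  bcd_position := MAXLEN;
  for position := length (digit_string) downto 1 do begin
    bcd_value[bcd_position] := ord (digit_string[position]) - ord ('0');
    bcd_position := bcd_position - 1
  end;
  while bcd_position >= 1 do begin
    bcd_value[bcd_position] := 0;
    bcd_position := bcd_position - 1
  end
end;

procedure bcd_to_digit_string (bcd_value: bcd;
                               var digit_string: digit_string_type);
var
  position: integer;
begin
  { every position is written out, so leading zeroes are kept }
  digit_string := '';
  for position := 1 to MAXLEN do
    digit_string := digit_string + chr (bcd_value[position] + ord ('0'))
end;

begin
  digit_string_to_bcd ('004794226', ssn);
  bcd_to_digit_string (ssn, text_form);
  writeln (text_form)              { prints 004794226 }
end.
```

Because the BCD form has a fixed number of digit positions, the round trip preserves leading zeroes, unlike the conversion to and from an integer.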
The use of the packed array type ensures (in HP Pascal) that
the four-bit representations of values in the integer subrange 0
.. 9 will be stored two to a byte. Here's how to add two BCD representations:
procedure add_bcd (augend, addend: bcd; var sum: bcd);
var
  position: integer;
    { counts off digit positions, from right to left }
  carry: integer;
    { a carry from one digit position into the neighboring position to
      the left }
  place_sum: integer;
    { the sum of the digits in one digit position and the carry into that
      position }
begin
  carry := 0;
  for position := MAXLEN downto 1 do begin
    place_sum := augend[position] + addend[position] + carry;
    if place_sum < 10 then begin
      sum[position] := place_sum;
      carry := 0
    end
    else begin
      sum[position] := place_sum - 10;
      carry := 1
    end
  end
end;
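The procedure above silently discards a carry out of the leftmost digit position. One possible way to report it, offered only as a sketch, is an extra var parameter. The name add_bcd_checked and the overflow parameter are hypothetical, MAXLEN is assumed to be 9, and the digit and carry are computed with mod and div instead of the if-statement, which is a stylistic variation rather than the original method.

```pascal
program add_with_overflow;
{ Sketch: BCD addition that reports a carry out of the leftmost digit. }
{ Assumption: MAXLEN = 9. }
const
  MAXLEN = 9;
type
  decimal_digit = 0 .. 9;
  bcd = packed array [1 .. MAXLEN] of decimal_digit;
var
  a, b, total: bcd;
  index: integer;
  overflowed: Boolean;

procedure add_bcd_checked (augend, addend: bcd; var sum: bcd;
                           var overflow: Boolean);
var
  position, carry, place_sum: integer;
begin
  carry := 0;
  for position := MAXLEN downto 1 do begin
    place_sum := augend[position] + addend[position] + carry;
    sum[position] := place_sum mod 10;    { the digit that stays here }
    carry := place_sum div 10             { 0 or 1, passed to the left }
  end;
  overflow := (carry = 1)
end;

begin
  { a = 999999999, b = 000000001 }
  for index := 1 to MAXLEN do begin
    a[index] := 9;
    b[index] := 0
  end;
  b[MAXLEN] := 1;
  add_bcd_checked (a, b, total, overflowed);
  for index := 1 to MAXLEN do
    write (total[index]);
  writeln;                                 { the digits are 000000000 }
  writeln ('overflow: ', overflowed)       { the flag is TRUE }
end.
```

What the caller does with the flag (report an error, widen the representation, and so on) is a design decision that this sketch leaves open.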
An overflow occurs if the value of carry is 1 when the
for-loop is finished; a fully developed implementation would
include code to handle this situation. Pascal doesn't permit direct comparison of two BCD representations, but comparison functions are also easily written and reasonably efficient. As an example, here's a ``less than'' function:
function less_than_bcd (first, second: bcd): Boolean;
var
  finished: Boolean;
    { becomes TRUE once the result of the comparison is known }
  position: integer;
    { counts off digit positions in both bcds }
  first_digit, second_digit: integer;
    { corresponding single-digit values from first and second }
begin
  finished := FALSE;
  position := 1;
  while not finished do begin
    first_digit := first[position];
    second_digit := second[position];
    if first_digit < second_digit then begin
      less_than_bcd := TRUE;
      finished := TRUE
    end
    else if second_digit < first_digit then begin
      less_than_bcd := FALSE;
      finished := TRUE
    end
    else if position = MAXLEN then begin { identical bcds }
      less_than_bcd := FALSE;
      finished := TRUE
    end
    else
      position := position + 1
  end
end;
The reason for recovering the individual digits and storing them in
separate variables is that the subscripting operation on packed arrays is
less efficient than for ordinary arrays; the processor must recover a whole
byte and then extract just the part of the byte in which the array element
is stored. So it makes sense to perform that operation only once and to
save the result.
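Once a ``less than'' function exists, the other comparisons can be derived from it without touching the packed arrays again. The sketch below is illustrative only: greater_than_bcd, equal_bcd, and set_digits are hypothetical names, MAXLEN is assumed to be 9, and less_than_bcd is restated in a slightly condensed form that still reads each packed element once per position.

```pascal
program compare_bcd;
{ Sketch: further comparisons built on a "less than" function. }
{ Assumptions: MAXLEN = 9 and a Turbo/Free Pascal-style string type. }
const
  MAXLEN = 9;
type
  decimal_digit = 0 .. 9;
  bcd = packed array [1 .. MAXLEN] of decimal_digit;
  digit_string_type = string[MAXLEN];
var
  a, b: bcd;

procedure set_digits (digits: digit_string_type; var value: bcd);
{ helper for the demonstration; expects exactly MAXLEN digit characters }
var
  position: integer;
begin
  for position := 1 to MAXLEN do
    value[position] := ord (digits[position]) - ord ('0')
end;

function less_than_bcd (first, second: bcd): Boolean;
var
  finished: Boolean;
  position: integer;
  first_digit, second_digit: integer;
begin
  less_than_bcd := FALSE;
  finished := FALSE;
  position := 1;
  while not finished do begin
    first_digit := first[position];     { each packed element is read once }
    second_digit := second[position];
    if first_digit <> second_digit then begin
      less_than_bcd := first_digit < second_digit;
      finished := TRUE
    end
    else if position = MAXLEN then
      finished := TRUE                  { identical bcds }
    else
      position := position + 1
  end
end;

function greater_than_bcd (first, second: bcd): Boolean;
begin
  greater_than_bcd := less_than_bcd (second, first)
end;

function equal_bcd (first, second: bcd): Boolean;
begin
  equal_bcd := (not less_than_bcd (first, second))
               and (not less_than_bcd (second, first))
end;

begin
  set_digits ('004794226', a);
  set_digits ('004794227', b);
  writeln (less_than_bcd (a, b));       { TRUE }
  writeln (greater_than_bcd (a, b));    { FALSE }
  writeln (equal_bcd (a, a))            { TRUE }
end.
```

Deriving equal_bcd this way costs up to two passes over the digits; a dedicated loop would need only one, at the price of a little more code.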
Pascal's standard procedures pack and unpack offer another approach: use
unpack to create a non-packed version of the array,
operate on the result, and then use pack to assemble any
results or changes into the smaller structures. For instance, one might
revise the add_bcd procedure above so that all the
subscripting is performed on non-packed arrays:
procedure add_bcd (augend, addend: bcd; var sum: bcd);
type
  non_packed_bcd = array [1 .. MAXLEN] of decimal_digit;
var
  np_augend, np_addend, np_sum: non_packed_bcd;
    { non-packed versions of the augend, addend, and sum }
  position: integer;
    { counts off digit positions, from right to left }
  carry: integer;
    { a carry from one digit position into the neighboring position to
      the left }
  place_sum: integer;
    { the sum of the digits in one digit position and the carry into that
      position }
begin
  unpack (augend, np_augend, 1);
  unpack (addend, np_addend, 1);
  carry := 0;
  for position := MAXLEN downto 1 do begin
    place_sum := np_augend[position] + np_addend[position] + carry;
    if place_sum < 10 then begin
      np_sum[position] := place_sum;
      carry := 0
    end
    else begin
      np_sum[position] := place_sum - 10;
      carry := 1
    end
  end;
  pack (np_sum, 1, sum)
end;
Whether this technique actually saves any time can be expected to vary from
one machine architecture to another and from one Pascal compiler to
another.
This document is available on the World Wide Web as
http://www.math.grin.edu/~stone/courses/fundamentals/bcd.html