Sorting a collection of records usually involves inspecting one field of each record, called the key, and arranging the records so that their keys are in ascending order. The keys may be natural numbers -- serial numbers, perhaps -- or they may be short strings of digits or other characters, such as the letters of a person's surname or the digits of a ZIP code.
If a key is a short array of characters or other values, or if, like a serial number, it can easily be converted into such an array, it is often possible to use a sorting method known as radix sorting. In radix sorting, one sets up a queue for each possible value of a component of the key. (For instance, if sorting by ZIP codes, one would set up ten queues, one for each digit from 0 to 9.) Then one distributes the records into these component queues by examining the last or least significant component of each key (so that, for instance, all of the records with ZIP codes ending in 0 would be placed in the 0 queue, all those with ZIP codes ending in 1 in the 1 queue, and so on). Next, one reconstructs the full collection by taking all of the elements in the 0 queue, then all of the elements in the 1 queue, and so on in order; the result is a master queue in which the records are sorted by their last digit.
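A single cycle of distribution and reconstruction can be sketched as follows. This is an illustration in Python rather than HP Pascal, chosen only so that it can be run easily; the function name distribute_and_collect, the sample ZIP codes, and the use of bare strings as records are all assumptions made for the example.

```python
from collections import deque

def distribute_and_collect(master, position, values="0123456789"):
    """Run one pass: distribute records by one key component, then rebuild master."""
    # One component queue for each possible value of a key component.
    queues = {v: deque() for v in values}
    # Distribution: empty the master queue into the component queues.
    while master:
        record = master.popleft()
        queues[record[position]].append(record)
    # Reconstruction: take the 0 queue, then the 1 queue, and so on.
    for v in values:
        while queues[v]:
            master.append(queues[v].popleft())
    return master

# Hypothetical records: each record is just its five-digit ZIP code.
zips = deque(["50112", "10001", "94110", "60601", "02139"])
distribute_and_collect(zips, position=-1)  # -1 selects the last digit
print(list(zips))  # ['94110', '10001', '60601', '50112', '02139']
```

After this one pass the records are ordered by their last digit only; 10001 still precedes 60601 because it entered the 1 queue first.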
The next step is to redistribute the records into the component queues, this time according to the next-to-last component of the key, and to reconstruct the master queue from the component queues in the same way as before. Since the distribution process is stable, in the sense that records placed in the same component queue keep the relative order they had in the master queue, the resulting master queue is correctly sorted by the last two digits of the key.
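To see why stability matters, the following Python sketch (again an illustration, not part of the HP Pascal code) runs the second pass twice: once with first-in-first-out queues and once with buckets that are emptied last-in-first-out. The names stable_pass and unstable_pass and the sample data are assumptions made for the example.

```python
from collections import deque

def stable_pass(records, position):
    """Distribute into ten FIFO queues by one digit, then concatenate them in order."""
    queues = [deque() for _ in range(10)]
    for record in records:
        queues[int(record[position])].append(record)
    return [record for queue in queues for record in queue]

def unstable_pass(records, position):
    """Same distribution, but each bucket is emptied in reverse (LIFO) order."""
    stacks = [[] for _ in range(10)]
    for record in records:
        stacks[int(record[position])].append(record)
    return [record for stack in stacks for record in reversed(stack)]

records = ["50112", "10001", "94110", "60601", "02139", "30312"]
by_last = stable_pass(records, 4)          # sorted by the last digit
stable_two = stable_pass(by_last, 3)       # then by the next-to-last digit
unstable_two = unstable_pass(by_last, 3)   # same step without stability

print([r[3:] for r in stable_two])    # ['01', '01', '10', '12', '12', '39']
print([r[3:] for r in unstable_two])  # ['01', '01', '12', '12', '10', '39']
```

With queues, the order established by the first pass survives inside each bucket, so the last two digits come out sorted; with the reversed buckets, 10 ends up after 12.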
By repeating the distribution and reconstruction steps for each component of the key, from least significant to most significant, one eventually obtains a completely sorted master queue. (If one is sorting by five-digit ZIP codes, for instance, five cycles of distribution and reconstruction are needed.)
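The complete cycle, including the conversion of a serial number into an array of digits, might look like this in Python. It is a minimal sketch under stated assumptions: keys are five decimal digits, records are (key, payload) pairs, and the names KEY_SIZE, to_key, and radix_sort are invented for the illustration.

```python
from collections import deque

KEY_SIZE = 5  # assumed: five-digit keys, as in the ZIP-code example

def to_key(serial):
    """Convert a natural number into a fixed-length tuple of decimal digits."""
    if not 0 <= serial < 10 ** KEY_SIZE:
        raise ValueError("serial number does not fit in KEY_SIZE digits")
    return tuple(int(d) for d in str(serial).zfill(KEY_SIZE))

def radix_sort(master):
    """Sort a deque of (key, payload) records in place, least significant digit first."""
    queues = [deque() for _ in range(10)]
    for position in range(KEY_SIZE - 1, -1, -1):
        # Distribute records from the master queue into the component queues.
        while master:
            record = master.popleft()
            queues[record[0][position]].append(record)
        # Reconstruct the master queue from the component queues, in order.
        for queue in queues:
            while queue:
                master.append(queue.popleft())
    return master

# Hypothetical records; the serial number 907 appears twice.
data = [(50112, "a"), (907, "b"), (31415, "c"), (42, "d"), (907, "e")]
records = deque((to_key(n), payload) for n, payload in data)
radix_sort(records)
print([payload for _, payload in records])  # ['d', 'b', 'e', 'c', 'a']
```

The two records with serial number 907 come out in their original relative order, b before e, which is the stability property at work across all five passes.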
Here is an HP Pascal procedure that implements this algorithm:
const
  KeySize = { the number of components in a key };
  Least = { the least possible value for one component of a key };
  Greatest = { the greatest possible value for one component of a key };

type
  Component = Least .. Greatest;
  KeyType = array [1 .. KeySize] of Component;
  Element = record
    Key: KeyType;
    { presumably other fields as well }
  end;

procedure RadixSort (var Master: Queue);
var
  Val: Component;
    { runs through the possible values of a component of the key }
  SmallQueue: array [Component] of Queue;
    { a queue for each of those possible values }
  Position: 1 .. KeySize;
    { runs through the positions of the components within a key }
  Item: Element;
    { one item at a time from the master queue }
begin

  { Set up the component queues. }
  for Val := Least to Greatest do
    SmallQueue[Val] := CreateQueue;

  { Run through a cycle of distribution and reconstruction for each
    component of the key. }
  for Position := KeySize downto 1 do begin

    { Distribute items from the master queue into the component queues. }
    while not EmptyQueue (Master) do begin
      Item := Dequeue (Master);
      Enqueue (Item, SmallQueue[Item.Key[Position]])
    end;

    { Reconstruct the master queue. }
    for Val := Least to Greatest do
      while not EmptyQueue (SmallQueue[Val]) do
        Enqueue (Dequeue (SmallQueue[Val]), Master)
  end;

  { Recycle the (empty) component queues. }
  for Val := Least to Greatest do
    DeallocateQueue (SmallQueue[Val]);
end;
This implementation presupposes the existence of the five basic queue operations CreateQueue, EmptyQueue, Dequeue, Enqueue, and DeallocateQueue. Here is a module that provides them, implementing them in terms of singly-linked lists with a header containing pointers to the first and last components:
{ This module defines an interface for a queue data type and implements it
  for HP 9000 Series 700 workstations under HP-UX 9.x, using HP Pascal.

  Programmer: John Stone, Grinnell College.
  Original version: April 18, 1996.
  Last revised: August 5, 1996.
}

{ The Dispose procedure does not actually recycle storage unless the
  heap_dispose compiler option is turned on. }

$heap_dispose on$

module Queues;

$search 'queue-element.o'$
import Element;

export

  type
    Queue = ^QueueHeader;

  { The CreateQueue function constructs and returns an empty queue capable
    of holding any number of elements. }
  function CreateQueue: Queue;

  { The EmptyQueue function determines whether a given queue is empty. }
  function EmptyQueue (Q: Queue): Boolean;

  { The Dequeue function extracts the oldest element from a non-empty queue
    and returns it.  It is an error to give an empty queue as the argument
    to Dequeue. }
  function Dequeue (var Q: Queue): Element;

  { The Enqueue procedure adds an element at the end of an existing
    queue. }
  procedure Enqueue (Item: Element; var Q: Queue);

  { The DeallocateQueue procedure recycles all the storage associated with
    a given queue, leaving its argument undefined. }
  procedure DeallocateQueue (var Q: Queue);

implement

  import
    StdErr;

  const
    { The following constants are more or less arbitrary integers
      signifying various kinds of exceptions that can occur within this
      module. }
    FirstExceptionCode = 1;
    UninitializedQueueException = 1;
    DequeueException = 2;
    ExceptionException = 3;
    LastExceptionCode = 3;

  type
    Link = ^QueueComponent;
    QueueComponent = record
      Datum: Element;
      Next: Link;
    end;
    QueueHeader = record
      Front, Rear: Link
    end;

  { The QueueExceptionHandler procedure, which is not exported, is invoked
    whenever one of the preconditions for the successful execution of a
    procedure is found to be false.  It prints out an appropriate
    explanation of the exception just before the program is halted. }
  procedure QueueExceptionHandler (ExceptionCode: integer);
  begin
    if (ExceptionCode < FirstExceptionCode) or
       (LastExceptionCode < ExceptionCode) then
      ExceptionCode := ExceptionException;
    write (StdErr, 'Exception #', ExceptionCode : 1, ' in module Queues: ');
    case ExceptionCode of
      UninitializedQueueException:
        writeln (StdErr, 'An operation was applied to an uninitialized ',
                 'queue.');
      DequeueException:
        writeln (StdErr, 'An empty queue was passed as argument to the ',
                 'Dequeue function.');
      ExceptionException:
        writeln (StdErr, 'The QueueExceptionHandler procedure received ',
                 'an unknown exception code.');
    end
  end;

  function CreateQueue: Queue;
  var
    Result: Queue;
      { the queue that is constructed }
  begin
    New (Result);
    Result^.Front := Nil;
    Result^.Rear := Nil;
    CreateQueue := Result
  end;

  function EmptyQueue (Q: Queue): Boolean;
  begin
    Assert (Q <> Nil, UninitializedQueueException, QueueExceptionHandler);
    EmptyQueue := (Q^.Front = Nil)
  end;

  function Dequeue (var Q: Queue): Element;
  var
    OldLink: Link;
      { a pointer to the component to be removed from the queue }
  begin
    Assert (Q <> Nil, UninitializedQueueException, QueueExceptionHandler);
    Assert (Q^.Front <> Nil, DequeueException, QueueExceptionHandler);
    Dequeue := Q^.Front^.Datum;
    OldLink := Q^.Front;
    Q^.Front := OldLink^.Next;
    if Q^.Front = Nil then
      Q^.Rear := Nil;
    Dispose (OldLink)
  end;

  procedure Enqueue (Item: Element; var Q: Queue);
  var
    NewLink: Link;
      { a pointer to the component to be added to the queue }
  begin
    Assert (Q <> Nil, UninitializedQueueException, QueueExceptionHandler);
    New (NewLink);
    NewLink^.Datum := Item;
    NewLink^.Next := Nil;
    if Q^.Rear = Nil then
      Q^.Front := NewLink
    else
      Q^.Rear^.Next := NewLink;
    Q^.Rear := NewLink
  end;

  procedure DeallocateQueue (var Q: Queue);
  var
    Traverser: Link;
      { a pointer to successive components of the underlying linked list }
    Trailer: Link;
      { a similar pointer, lagging one component behind Traverser }
  begin
    Assert (Q <> Nil, UninitializedQueueException, QueueExceptionHandler);
    Traverser := Q^.Front;
    while Traverser <> Nil do begin
      Trailer := Traverser;
      Traverser := Traverser^.Next;
      Dispose (Trailer)
    end;
    Dispose (Q);
    Q := Nil
  end;

end.
A much faster implementation of the radix sort can be obtained by manipulating the Link pointers directly; for instance, instead of using Dequeue and Enqueue to transfer records one at a time from the component queues into the master queue, one could rebuild the master queue by linking the last item in each component queue to the first item in the next. However, the handling of the special cases that arise when some of the component queues are empty obscures the working of the radix-sorting algorithm, so the slower but simpler version is presented here.
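The pointer-linking idea can be sketched as follows. This is an illustration in Python rather than HP Pascal and is not taken from the module above: the classes Node and LinkedQueue and the functions splice_all and to_list are hypothetical names, with front and rear playing the roles of the Front and Rear fields of QueueHeader.

```python
class Node:
    """One component of the linked list: a datum and a link to the next node."""
    def __init__(self, datum):
        self.datum = datum
        self.next = None

class LinkedQueue:
    """Singly linked queue with pointers to its first and last components."""
    def __init__(self):
        self.front = None
        self.rear = None

    def enqueue(self, datum):
        node = Node(datum)
        if self.rear is None:
            self.front = node
        else:
            self.rear.next = node
        self.rear = node

def splice_all(component_queues, master):
    """Append each component queue to master by relinking, with no copying."""
    for queue in component_queues:
        if queue.front is None:
            continue  # special case: an empty queue contributes nothing
        if master.rear is None:
            master.front = queue.front       # master was empty so far
        else:
            master.rear.next = queue.front   # link last item to next first item
        master.rear = queue.rear
        queue.front = queue.rear = None      # the component queue is now empty

def to_list(queue):
    """Collect the data of a queue, front to rear, for inspection."""
    items, node = [], queue.front
    while node is not None:
        items.append(node.datum)
        node = node.next
    return items

# Three hypothetical component queues; the middle one is empty.
parts = [LinkedQueue(), LinkedQueue(), LinkedQueue()]
parts[0].enqueue(1)
parts[0].enqueue(2)
parts[2].enqueue(3)
master = LinkedQueue()
splice_all(parts, master)
print(to_list(master))  # [1, 2, 3]
```

Each non-empty component queue is attached in a constant number of pointer assignments, whatever its length, whereas the Dequeue and Enqueue approach disposes of and allocates one list component per record.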
This document is available on the World Wide Web as
http://www.math.grin.edu/~stone/courses/fundamentals/radix-sorting.html