Assume that
- the keys of the items that we wish to sort lie in a small fixed range and
- that there is only one item with each value of the key.
- Set up an array of "bins" - one for each value of the key - in order,
- Examine each item and use the value of the key to place it in the appropriate bin.
Constraints on bin sort
To understand these restrictions, let's be a little more precise about the specification of the problem and assume that there are m values of the key. To recover our sorted collection, we need to examine each bin. This adds a third step to the algorithm above,- Examine each bin to see whether there's an item in it.
For example, if we wish to sort 104 32-bit integers, then m = 232 and we need 232 operations (and a rather large memory!).
For n = 104:
For n = 104:
An implementation of bin sort might look like:
#define EMPTY -1 /* Some convenient flag */ void bin_sort( int *a, int *bin, int n ) { int i; /* Pre-condition: for 0<=i<n : 0 <= a[i] < M */ /* Mark all the bins empty */ for(i=0;i<M;i++) bin[i] = EMPTY; for(i=0;i<n;i++) bin[ a[i] ] = a[i]; } main() { int a[N], bin[M]; /* for all i: 0 <= a[i] < M */ .... /* Place data in a */ bin_sort( a, bin, N );
If there are duplicates, then each bin can be replaced by a linked list. The third step then becomes:
- Link all the lists into one list.
In contrast to the other sorts, which sort in place and don't require additional memory, bin sort requires additional memory for the bins and is a good example of trading space for performance.
so that we would normally use memory rather profligately to obtain performance, memory consumes power and in some circumstances, eg computers in space craft, power might be a higher constraint than performance. |
Having highlighted this constraint, there is a version of bin sort which can sort in place:
#define EMPTY -1 /* Some convenient flag */ void bin_sort( int *a, int n ) { int i; /* Pre-condition: for 0<=i<n : 0 <= a[i] < n */ for(i=0;i<n;i++) if ( a[i] != i ) SWAP( a[i], a[a[i]] ); }However, this assumes that there are n distinct keys in the range 0 .. n-1. In addition to this restriction, the SWAP operation is relatively expensive, so that this version trades space for time.
The bin sorting strategy may appear rather limited, but it can be generalised into a strategy known as Radix sorting.