Get new post automatically.

Enter your email address:


4.2 Binary Search

However, if we place our items in an array and sort them in either ascending or descending order on the key first, then we can obtain much better performance with an algorithm called binary search.
In binary search, we first compare the key with the item in the middle position of the array. If there's a match, we can return immediately. If the key is less than the middle key, then the item sought must lie in the lower half of the array; if it's greater then the item sought must lie in the upper half of the array. So we repeat the procedure on the lower (or upper) half of the array.
Our FindInCollection function can now be implemented:

static void *bin_search( collection c, int low, int high, void *key ) {
 int mid;
 /* Termination check */
 if (low > high) return NULL;
 mid = (high+low)/2;
 switch (memcmp(ItemKey(c->items[mid]),key,c->size)) {
  /* Match, return item found */
  case 0: return c->items[mid];
  /* key is less than mid, search lower half */
  case -1: return bin_search( c, low, mid-1, key);
  /* key is greater than mid, search upper half */
  case 1: return bin_search( c, mid+1, high, key );
  default : return NULL;
  }
 }

void *FindInCollection( collection c, void *key ) {
/* Find an item in a collection
   Pre-condition: 
 c is a collection created by ConsCollection
 c is sorted in ascending order of the key
 key != NULL
   Post-condition: returns an item identified by key if
   one exists, otherwise returns NULL
*/
 int low, high;
 low = 0; high = c->item_cnt-1;
 return bin_search( c, low, high, key );
}
Points to note:
  1. bin_search is recursive: it determines whether the search key lies in the lower or upper half of the array, then calls itself on the appropriate half.
  2. There is a termination condition (two of them in fact!)
    1. If low > high then the partition to be searched has no elements in it and
    2. If there is a match with the element in the middle of the current partition, then we can return immediately.
  3. AddToCollection will need to be modified to ensure that each item added is placed in its correct place in the array. The procedure is simple:
    1. Search the array until the correct spot to insert the new item is found,
    2. Move all the following items up one position and
    3. Insert the new item into the empty position thus created.
  4. bin_search is declared static. It is a local function and is not used outside this class: if it were not declared static, it would be exported and be available to all parts of the program. The static declaration also allows other classes to use the same name internally.
    static reduces the visibility of a function an should be used wherever possible to control access to functions!

Analysis

Each step of the algorithm divides the block of items being searched in half. We can divide a set of n items in half at most log2 n times. Thus the running time of a binary search is proportional to log n and we say this is a O(log n) algorithm.
Binary search requires a more complex program than our original search and thus for small n it may run slower than the simple linear search. However, for large n,






Thus at large n, log n is much smaller than n, consequently an O(log n) algorithm is much faster than an O(n) one.







Plot of n and log n vs n .
We will examine this behaviour more formally in a later section. First, let's see what we can do about the insertion (AddToCollection) operation.
In the worst case, insertion may require n operations to insert into a sorted list.

  1. We can find the place in the list where the new item belongs using binary search in O(log n) operations.
  2. However, we have to shuffle all the following items up one place to make way for the new one. In the worst case, the new item is the first in the list, requiring n move operations for the shuffle! A similar analysis will show that deletion is also an O(n) operation.
    If our collection is static, ie it doesn't change very often - if at all - then we may not be concerned with the time required to change its contents: we may be prepared for the initial build of the collection and the occasional insertion and deletion to take some time. In return, we will be able to use a simple data structure (an array) which has little memory overhead.
    However, if our collection is large and dynamic, ie items are being added and deleted continually, then we can obtain considerably better performance using a data structure called a tree.

    Key terms

    Big Oh
    A notation formally describing the set of all functions which are bounded above by a nominated function.
    Binary Search
    A technique for searching an ordered list in which we first check the middle item and - based on that comparison - "discard" half the data. The same procedure is then applied to the remaining half until a match is found or there are no more items left.



========================================== 

Binary Search is an incredibly powerful technique for searching an ordered list. It is familiar to everyone who uses a telephone book!
The basic algorithm is to find the middle element of the list, compare it against the key, decide which half of the list must contain the key, and repeat with that half.
Two requirements to support binary search:
  • Random access of the list elements, so we need arrays instead of linked lists.
  • The array must contain elements in sorted order by the search key.
Why Do Twenty Questions Suffice?
With one question, I can distinguish between two words: A and B; ``Is the key tex2html_wrap_inline265 ?''
With two questions, I can distinguish between four words: A,B,C,D; ``Is the tex2html_wrap_inline269 ?''
Each question I ask em doubles the number of words I can search in my dictionary.
tex2html_wrap_inline271 , which is much larger than any portable dictionary!
Thus I could waste my first two questions because tex2html_wrap_inline273 .
Exponents and Logs
Recall the definitions of exponent and logarithm from high school:
displaymath248
Thus exponentiation and logarithms are inverse functions, since tex2html_wrap_inline275 .
Note that the logarithm of a big number is a much smaller number.
Thus the number of questions we must ask is the base two logarithm of the size of the dictionary.
displaymath249
Implementing Binary Search
Although the algorithm is simple to describe informally, it is tricky to produce a working binary search function. The first published binary search algorithm appeared in 1946, but the first correct published program appeared in 1962!
The difficulty is maintaining the following two invariants with each iteration:
  • The key must always remain between the low and high indices.
  • The low or high indice must advance each iteration.
The boundary cases are very tricky: zero elements left, one elements left, two elements left, and an even or odd number of elements!
Versions of Binary Search
There are at least two different versions of binary search, depending upon whether we want to test for equality at each query or only at the end.
For the later, suppose we want to search for ``k'':
iteration  bottom top      mid
---------------------------------------
1          2      14       (1+14)/2=7
2          1      7        (1+7)/2=4
3          5      7        (5+7)/2=6
4          6      7        (7+7)/2=7
Since tex2html_wrap_inline277 , 7 is the right spot. However, we must now test if entry[7]='k'. If not, the item isn't in the array.
Alternately, we can test for equality at each comparison. Suppose we search for ``c'':
iteration  bottom  top  mid
------------------------------------
1          1       14   (1+14)/2 = 7
2          1       6    (1+6)/2 = 3
3          1       2    (1+2)/2 = 1
4          2       2    (2+2)/2 = 2
Now it will be found!
Recursive Binary Search Implementation
PROCEDURE Search( READONLY array: ARRAY [0 .. MaxInd - 1] OF INTEGER;
                  left, right: [0 .. MaxInd - 1];
                  argument: INTEGER): [0..MaxInd] =
  (*Implements binary search in an array*)
  VAR 
    middle := left + (right - left) DIV 2;
    
  BEGIN (* binary search *)
    IF argument = array[middle] THEN      (*found*)
      RETURN middle                 
    ELSIF argument < array[middle] THEN   (*search in left half*)
      IF left < middle THEN
        RETURN Search(array, left, middle - 1, argument)
      ELSE                                (*left boundary reaches middle: not found*)
        RETURN MaxInd              
      END (*IF left < middle*)
    ELSE                                  (*search in right half*)
      IF middle < right THEN
        RETURN Search(array, middle + 1, right, argument)
      ELSE                                (*middle reaches right boundary: not found*)
        RETURN MaxInd 
      END (*IF middle < right*)
    END (*IF argument = array[middle]*)
  END Search;

========================================

Binary search is an algorithm for finding the location of an element in a sorted list. First, it checks the middle element of the sort list; if that element is equal to the sought value then the location has been found; Othewise, the lower half or upper half is chosen for next searching based on the sought value less than or greater than the middle element. By doing so, this binary search reduces the number of elements to be checked by a factor of two each time searching and find the element in logarithmic time.
Binary Search Flowchart

Binary Search implemented in C.

int binary_search(int sorted_list[], int low, int high, int element) {
02    while (low <= high) {
03        int middle = low + (high - low)/2;
04        if (element > sorted_list[middle])
05            low = middle + 1;
06        else if (element < sorted_list[middle])
07            high = middle - 1;
08        else
09            return middle;
10    }
11    return -1;
12}

Binary Search in using recursion technique.

01int binary_search(int sorted_list[], int low, int high, int element) {
02    if (high < low)
03        return -1;
04    int middle = low + (high - low)/2;
05    if (element < sorted_list[middle])
06        return binary_search(sorted_list, low, middle-1, element);
07    else if (element > sorted_list[middle])
08        return binary_search(sorted_list, middle+1, high, element);
09    else
10        return middle;
11}

Example: In C Programming

#include<stdio.h>
#include<conio.h>
void main()
{
int num,i,ar[20],m,low,high,mid,pos=1;
clrscr();
printf("Enter any number of element you want to insert=\t");
scanf("%d",&num);
printf("enter the list in non incresing order \n");
for(i=0;i<num;i++)
{
scanf("%d",&ar[i]);
}
printf("please enter the number you want to search\t");
scanf("%d",&m);
low=0;
high=num-1;
do
{
mid=(low+high)/2;
if(m<ar[mid])
high=mid-1;
else if(m>ar[mid])
low=mid+1;
}while(m!=ar[mid]&&low<=high);
if(m==ar[mid])
{
printf("\nsearch sucessful");
}
else
{
printf("\nsearch unsuccessful");
}
getch();
}

Answer: