CSCE 156

Handout 9:  Search

 

March 20, 2007

 

Definitions

 

Using search, you can:

·        determine whether a particular item is in the list

·        if the data is specially organized (for example, sorted), find the location in the list where a new item can be inserted

·        find the location of an item to be deleted

Because these operations are performed so often, a search algorithm’s performance is crucial.

 

Each item in a data set has a special member that uniquely identifies it within the data set.  This member is called the key of the item.  A search compares the key of the search item with the keys of the items in the list.

 

Linear Search

 

Algorithm.  The algorithm for linear search is straightforward:

 

 

Start algo (given “target” and given the array of elements, “array”)

   Set found to false

   Set index to 0

   While (found is false AND index is less than length of array)

      If (array element at index is the same as target)

         Set found to true

      Else

         Increment index

      Endif

   Endwhile

   Return found

End algo
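The pseudocode above can be sketched in Python as follows. This is a minimal illustration, not part of the original handout; the function name `linear_search` is an assumption chosen for the example.

```python
def linear_search(array, target):
    """Return True if target occurs in array, scanning from the front."""
    found = False
    index = 0
    # Stop as soon as the target is found or the array is exhausted.
    while not found and index < len(array):
        if array[index] == target:
            found = True
        else:
            index += 1
    return found
```

For example, `linear_search([4, 8, 15], 15)` examines all three elements before reporting that the target is present.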

 

 

Analysis.  The statements before and after the loop are executed only once, and hence require very little time.  The statements in the while loop are the ones that are repeated several times.

·        The loop terminates as soon as the search item is found in the list.

·        Therefore, the execution of the other statements in the loop is directly related to the outcome of the key comparison.

·        Therefore, when analyzing a search algorithm, we count the number of key comparisons because this number gives us the most useful information. 

 

If the search item, called the target, is the first element in the list, one comparison is required.  If it is the second element, two comparisons are required; if it is the kth element, k comparisons are required.

 

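If there are n elements in the list, and the target is equally likely to be at any position, then the average number of comparisons for a successful search is:

$$\frac{1 + 2 + \cdots + n}{n}$$

It is known that $1 + 2 + \cdots + n = \frac{n(n+1)}{2}$.  Therefore, we have $\frac{1 + 2 + \cdots + n}{n} = \frac{n+1}{2}$.  Therefore, on average, a successful linear search examines about half the list.  It follows that if the list size is 1,000,000, the search makes about 500,000 comparisons on average.  Thus, a linear search algorithm is not efficient for large lists.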
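The average can also be checked empirically. The sketch below counts key comparisons for a successful search at every position and averages them; the helper names `comparisons_to_find` and `average_comparisons` are hypothetical, and the code assumes every target is present and equally likely to be at each position.

```python
def comparisons_to_find(array, target):
    """Count the key comparisons a linear search makes before it stops."""
    comparisons = 0
    for item in array:
        comparisons += 1          # one key comparison per element examined
        if item == target:
            break
    return comparisons


def average_comparisons(n):
    """Average over one successful search for each of the n positions."""
    array = list(range(n))
    total = sum(comparisons_to_find(array, target) for target in array)
    return total / n
```

With n = 1000 the total is 1 + 2 + ... + 1000 = 500,500 comparisons, so the average is 500.5, which is (n + 1)/2.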

 

Binary Search

 

Algorithm.  Binary search works only on ordered lists.  It uses the “divide and conquer” technique to search the list.  First, the search item is compared with the middle element of the list.  If the two are equal, the search ends.  If the search item is less than the middle element, we restrict the search to the first half of the list; otherwise, we search the second half of the list.  This step is repeated until the search item is found or no elements remain.

 

 

Start algo (given “target” and given the array of elements, “array”)

   Set first to 0

   Set last to (length of array) - 1

   Set found to false

   While (found is false AND first is less than or equal to last)

      Set mid to (first+last)/2, rounded down

      If (array element at mid is the same as target)

         Set found to true

      Else

         If (array element at mid is greater than target)

            Set last to mid-1

         Else

            Set first to mid+1

         Endif

      Endif

   Endwhile

   If (found is true)

      Return mid

   Else

      Return -1

   Endif

End algo
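A minimal Python sketch of the pseudocode above follows. It is an illustration rather than the handout's own code: the name `binary_search` is an assumption, and the sketch returns as soon as the target is found instead of keeping a separate `found` flag. The array is assumed to be sorted in ascending order.

```python
def binary_search(array, target):
    """Return the index of target in the sorted array, or -1 if absent."""
    first = 0
    last = len(array) - 1
    while first <= last:
        mid = (first + last) // 2      # integer division rounds down
        if array[mid] == target:
            return mid
        elif array[mid] > target:
            last = mid - 1             # continue in the first half
        else:
            first = mid + 1            # continue in the second half
    return -1
```

For example, searching `[2, 5, 8, 12, 16]` for 12 first examines index 2 (the value 8), then index 3, and returns 3.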

 

 

Analysis.  Every iteration of the while loop cuts the size of the search list in half, so the loop runs at most $\lfloor \log_2 n \rfloor + 1$ times.  Each iteration makes at most 2 key comparisons.  In general, if L is a sorted list of size n, then to determine whether or not an element is in L, a binary search makes at most $2\log_2 n + 2$ key comparisons.
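The comparison count can be checked with a small experiment. The sketch below counts two key comparisons per unsuccessful iteration (one equality test, one greater-than test) and reports the worst case over every present key and every gap between keys. The names `binary_search_comparisons` and `worst_case_comparisons` are hypothetical, and the test data (even integers) is an assumption made so that odd targets are guaranteed to be absent.

```python
import math


def binary_search_comparisons(array, target):
    """Return the number of key comparisons a binary search makes."""
    first, last = 0, len(array) - 1
    comparisons = 0
    while first <= last:
        mid = (first + last) // 2
        comparisons += 1               # equality comparison
        if array[mid] == target:
            return comparisons
        comparisons += 1               # greater-than comparison
        if array[mid] > target:
            last = mid - 1
        else:
            first = mid + 1
    return comparisons


def worst_case_comparisons(n):
    """Largest comparison count over present and absent targets."""
    array = list(range(0, 2 * n, 2))   # n even keys: 0, 2, ..., 2n-2
    targets = range(-1, 2 * n + 1)     # every key and every gap around them
    return max(binary_search_comparisons(array, t) for t in targets)


if __name__ == "__main__":
    for n in (1, 8, 1000):
        print(n, worst_case_comparisons(n), 2 * math.log2(n) + 2)
```

For each n tried, the observed worst case should not exceed the value of 2 log2(n) + 2 printed beside it.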

 

Lower Bound on Comparison-Based Search Algorithms

 

Theorem:  Let L be a list of size n > 1.  Suppose that the elements of L are sorted.  If SRH(n) denotes the minimum number of comparisons needed, in the worst case, by using a comparison-based algorithm to recognize whether an element x is in L, then $SRH(n) \ge \log_2(n + 1)$.

 

Corollary:  The binary search algorithm is the optimal worst-case algorithm for solving search problems by the comparison method.
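
One common way to see where a logarithmic lower bound of this kind comes from is a decision-tree argument.  The sketch below is an illustration rather than the handout's own proof; it assumes that each comparison of x with an element of L has three outcomes (less than, equal, greater than) and that the "equal" outcome ends the search, so an algorithm that uses at most k comparisons in the worst case corresponds to a binary tree with at most k levels, one node per comparison.

```latex
% Decision-tree sketch (assumption: three-way comparisons, "equal" stops the search).
% k = worst-case number of comparisons, n = number of elements in L.
\begin{align*}
  \text{number of nodes in a binary tree with at most } k \text{ levels} &\le 2^{k} - 1, \\
  \text{number of nodes needed to recognize each of the } n \text{ elements} &\ge n, \\
  2^{k} - 1 \ge n \;&\Longrightarrow\; k \ge \log_2 (n + 1).
\end{align*}
```

Binary search needs a number of comparisons proportional to $\log_2 n$ in the worst case, which matches this growth rate.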

 

·        Based on (Malik 2003)