Home CPSC 340

Searching Algorithms


 

Overview

Searching and sorting are two of the most common sorts of algorithms there are. Most decent-sized programs will need to search and/or sort some place in them. They are also a key idea in many algorithms. For instance, data structures like hash tables and binary search trees are designed with search in mind. And heaps maintain a sorted order internally.


 

Linear Search

The simplest search algorithm is called "linear search". It works by starting at the beginning of the things we are searching through and examining them, one by one, until we find the one we are looking for. The algorithm is given below:

  1. Set index to 0.
  2. While index is less than the length of the array:
    1. If the item at index is the one we're searching for, we're done!
    2. Set index to index + 1.
  3. If we got through the whole array without finding it, it wasn't there.

This algorithm is of course linear (hence the name), so it's $O(n)$.

A Java implementation is available in LinearSearch.java.


public static boolean linearSearch(int[] array, int target) {
    for (int i = 0; i < array.length; i++) {
        System.out.println("Searching at " + i);

        if (array[i] == target) {
            System.out.println("Found " + target + " in slot " + i + "\n");
            return true;
        }
    }

    System.out.println(target + " was not found.\n");
    return false;
}

 

Binary Search

If our array of items is sorted in order, we actually can do better than linear search.

To do this, we can start the search in the middle of the array. After that, we know that the item is either there (in which case we're done), or it's to the left, or it's to the right. Then we pick the middle of the left half, or the right half, and repeat the process.

We can use recursion in our binary search, coming up with an algorithm like this:

  1. If the array is empty, the item wasn't here.
  2. Find the middle item from the array.
  3. If it's the thing, we're looking for, we're done.
  4. If it's less, recursively search the left half.
  5. If it's more, recursively search the right half.

A Java implementation is available in BinarySearch.java.


public static boolean binarySearch(int[] array, int target, int left, int right) {
    if (left > right) {
        System.out.println(target + " was not found.\n");
        return false;
    }

    int middle = (left + right) / 2;
    System.out.println("Searching at " + middle);

    if (array[middle] == target) {
        System.out.println("Found " + target + " in slot " + middle + "\n");
        return true;
    } else if (array[middle] < target) {
        return binarySearch(array, target, middle + 1, right);
    } else {
        return binarySearch(array, target, left, middle - 1);
    } 
}

public static boolean binarySearch(int[] array, int target) {
    return binarySearch(array, target, 0, array.length - 1);
}

Like other recursive methods, this includes a helper method which sets the initial search parameters.


 

Binary Search Analysis

With binary search, every time we make a guess, we cut the possibilities in half. If we are guessing a number from 1 to 100, how many guesses will we make in the worst case scenario?

The most number of guesses we can make is the number of times we can divide 100 in half before we hit 0. When we divide 100 by two (rounding down) repeatedly we get:


100 50 25 12 6 3 1 0

So we can guess at most 7 times before we run out of possibilities. What if we want to guess a number from 1 to 1000?


1000 500 250 125 62 31 15 7 3 1 0

So we can guess at most 10 times. In general we have:

$searches = \lceil log_2(n) \rceil$

Highest NumberMax Linear Search GuessesMax Binary Search Guesses
10104
1001007
1,0001,00010
10,00010,00014
100,000100,00017
1,000,0001,000,00020
1,000,000,0001,000,000,00030

Because of this, binary search is $O(log_2 n)$.

Remember that we can only use binary search when the array is sorted. Also, we can only use binary search on arrays, not linked lists.

Copyright © 2020 Ian Finlayson | Licensed under a Creative Commons Attribution 4.0 International License.