Why I want to write this article?

The 3 variants of binary search algorithm confuses me frequently? Why do I need to set right = size or right = size -1? Why is the termination condition is left < right or left ≤ right?

So in this article I want to explore a little bit about the reason behind 3 main variant of binary search algorithm.

- the classic binary search algorithm

In this algorithm, we want to find exactly the index of the target in the sorted saerch array.

The (right + 1) return value means what is the index of the target that should be inserted into the array. Or we can return -1 if we didn’t find the target.

The termination condition of this binary search variant is (left > right).

int binary_search(int arr[], int len, int target){

left = 0;

right = size - 1;while(left <= right) {

int mid = left + (right - left) / 2;

if(arr[mid] == target) return mid;

else if(arr[mid] > target) {

right = mid - 1;

} else {

left = mid + 1;

} }

return right + 1; }

2. The second one is quite useful.

we cannot just return the mid index once we find the index that satisfy our target or that is equal to our target.

What happen in this method is that there may be multiple elements that satisfy our condition, and we want to find the element that is closest to our target.

So we set our termination condition to (left < right). And we may not just immediately return the mid index once we get the index that satifies our condition.

the element the mid points to is still under consideration. Here is the code.

int binary_search(int arr[], int len, int target){int left = 0;int right = len;while(left < right) {int mid = left + (right - left) / 2;if(arr[mid] == target) return mid;else if(arr[mid] > target) left = mid + 1;else right = mid;}return left;}

For example.

Bad version. Suppose we have a project and every time we have an project update we will have to do a test on the project. Given a project test status array, we want to find out the earliest version at which the project didn’t pass the test.

For example, given test status array [1, 0, 0, 0, 0, 0, 0].

Starting from index 0. the first version that fails the test is index 4.

So begin with left = 0, right = 6. The first mid index is (6 + 0) / 2 = 3. And a[mid] == 0. But we couldn’t just return this index since it is not the earliest version that fails the test. We still need too move left until stop at the point where (left index == right index).

Here is the code

int find_first_bad_version(int arr[], int len) {int left = 0;int right = len - 1;while(left < right) {int mid = (left + right) / 2;if(arr[mid] == 0) right = mid;else left = mid + 1;}return left;}

Why this algorithm is correct?

The second example is that given multiple files where each file contains sorted keys and the files are arranged in sorted order and they have non-overlapping key ranges.

For example, given 3 files in sorted order. a1 = [1 , 3, 5, 11], a2 = [13, 15, 27, 30], a3 = [35, 38, 39, 59]

if we want to find out

multiple files that may contain the search key.