How does binary search work, anyway?

Unrelated photo by Matthew Weibe to make this post look nice in social media shares.

Or, yet another blog post idea I stole from that article about programming interview concepts. You can find the rest of those posts under the how does it work? tag.

Binary search is an extremely simple idea that’s useful for much more than finding an element in a sorted array. The way binary search works is you compare the item you want to find with the item in the middle of the array, then whichever “side” of the midpoint of the array your element falls on, you compare it to the midpoint of that half until you find the item you want.

According to wikipedia a binary search makes at worst O(log₂ n) comparisons, which is pretty great when you have a large array. Because the search halves the search space each iteration, the maximum number of iterations you’ll need is the number of times you can divide your array in half. For an array of 100 items, you should only need 7 searches, and for an array of 200 items, you only need 8, and for 1000 items, only 10 searches. See how slowly the number of searches grows as the array gets a lot larger? That’s just cool!

Also cool: you can use that concept for more than just finding stuff in arrays. Back in college they taught us to narrow down where a bug in your code is using a binary search. Basically, comment out half of your code and see if the bug still happens. If it doesn’t, uncomment half of the commented half. If it does, comment out another half of the uncommented half. It always felt weird to do that, like I should’ve just been able to see the problem by reading the code, but it worked. If you’re new or just totally stumped, give it a try.

Of course, binary search isn’t the only way to find things. Hash maps can be even faster, but all they can tell you is whether your target item exists or not. If you want to return the next largest or smallest item in the event that you don’t find an exact match, hash maps are no help at all. There are binary search trees too (which I’ll go into more detail on in a future post), but ironically even though they’re named binary search trees, binary searching an array is usually faster. The problem with binary trees is it’s hard to keep them perfectly balanced, so you might have more “halves” on one side than the other which messes with your search efficiency. What binary trees are good for is quick updates – it can be a real hassle to add or remove an item from an array, trees are much easier to work with. There’s also plain old linear search, where you start at one end of your array and look at every item until you find the one you want. If you’re not going to search your array enough times for it to be worth sorting it, linear search is good enough.

That kind of tradeoff can actually make algorithm questions interesting. I still don’t care even a little bit whether you looked up how to reverse a binary tree before the interview, but I care a lot if you think to ask if you’re going to search that array enough times to recoup the cost of sorting it. Programmers, myself included, can be terrible about overengineering to solve problems that don’t actually exist. But more on that in another post!

melreams.com

How does binary search work, anyway?

Leave a Reply Cancel reply