13 February 2010

Given an array find any pair that sums up into a number

This problem is quite largely discussed and is very famous for its various ways to attack. This can be constrained in time and in space. The most primitive way of attacking this problem would yield an solution that runs in O(n2) time.
Let me define the problem clearly. Given an array of size n and a number X, you are supposed to find a pair of numbers that sums to X. Only one pair would be enough.
Lets see how we can find the solution for this in O(n) using a HashSet. Yes, HashSet is a costly data structure to use, but lets consider this primitive just due to the linear order it provides.

We shall improvise on this problem in space domain using fancy sorts and those we shall see in the coming posts.



Arun said...

Considering that the members are not ordered, the best bet would be to use a hash based datastructure. I am not sure why you generalize HashSet as expensive. List or an array would do horribly when doing a linear search with this data (as you know, binary search is impossible here). I cant think of any other datastructure you could fit in. Also, the performance of HashSet depends on the efficient implementation of the hash. So, there could be a cost you would pay for larger datasets but i am sure that would be better than linear search any day.

BragBoy said...

@Arun: I completely accept your argument. The reason i want to solve the problem without using Hash is because that is what interviewers ask :) these days. They tell you not to solve using Hash. And they may say it should not be O(n*n) either.

ifanchu said...

You can easily tackle this question in O(n lg n) time by first sorting the array using in-place comparison sort like quick sort. After sorting, you can have two pointers pointed to the head and the tail, then find the sum.

vijay said...

it might be a good idea to use BST here. create a BST of n numbers (nlogn). then, for each number in an array (of size n), subtract it from TargetSum and then look for the other number in binary tree (logn) which should take (nlogn) total time ... the overall time would (nlogn + nlogn) which is nlogn... please correct me if I am wrong.

BragBoy said...

@vijay : yes that is a very good idea to solve the problem!!