You are given
n
pairs of numbers. In every pair, the first number is always smaller than the second number. A pair(c,d)
can follow(a,b)
if and only ifb
is less thanc
. Chains of pairs can be formed in this manner. Find the longest chain of pairs formed.
I got this question in an interview with Amazon but couldn't figure out the answer, just that it is related to the LIS problem.
It seems to me that one characteristic of the longest chain is that it minimizes the total span of any two adjacent tuples (to maximize the number of possible tuples for any range).
In that case, as Brian Stephens suggested, we need only sort by the second number, ascending (to guarantee the smallest span when searching for the next tuple), and build the chain starting with the first tuple in the sorted list.
We can prove that this solution is optimal by positing that if S(0) is the chain starting on the first tuple, there exists an S(i), where i is not in S(0), that is longer than S(0); and therefore also a chain S(i+1) that is longer than S(1). We know that the start of i must be before the end of i-1 or it would have been included in S(0). Now if the end of i-1 were equal to the end of i, S(i+1) would be part of S(0), which contradicts our premise. Alternatively, if the end of i were greater than the end of i-1, again S(i+1) would be included in S(0). The end of i could not be less than the end of i-1 since the list is sorted by this property.
First, this problem can be viewed as plotting points on a two dimensional grid where x co-ordinate of each point is less than y co-ordinate of that point. Now you're asked to find the longest chain such that every i+1 th point is both above and to the right of the ith point (and the y co-ordinate of the ith point is less than x co-ordinate of i+1th point).
now let's sort the points first based on their x co-ordinate. Then we start visiting the sorted points from lowest x co-ordinate, if there's a tie for x co-ordinate then we'll visit points with higher y co-ordinate first. now for an example if we're working on the ith point, we know that all points already visited have lower or similar x co-ordinate of the ith one. So the longest chain end at point i will be the longest chain that we've already got with y co-ordinate<=x co-ordinate of the current point.
We can do it efficiently with an efficient data structure like segment tree or Binary indexed tree.
Runtime of this solution is : O(NlgN) where N is the number of points (pairs in the given problem).
We can start with Brian's approach, of sorting all the pairs based on y value. It will ensure the required condition of prev pair's X < next pair's y. Now all we need is longest increasing subsequence of x among the new list of pairs. Which we can get by sorting all x and finding longest common subsequence between sorted x and new list previously created
Because the two values of each (X,Y) pair must be in order (X < Y), and the Y value of one pair must be less than the X value of the next pair, you are essentially building one continuous chain of values along one dimension. You just need to sort by the Y values.
Given this sample data:
Sort by Y to get:
Then loop through, adding a pair to the chain if its X is greater than the last Y in the chain, and ignoring it otherwise.
Notice that in this example,
(6,8)
happened to be sorted before(5,8)
. It doesn't matter which pair is chosen for the chain when their Y values are equal, as long as the X value satisfies the condition of being greater than the previous Y.Here's a diagram showing the sorted pairs as bars above a number line. Starting at the first pair,
(2,4)
, each one that is added to the chain at the bottom is colored yellow. Visually, the gray ones are skipped because there's a yellow one "in the way" of it being dropped into the chain (overlapping values).Proof: The only way to include more pairs in the chain is to replace a pair with one with a smaller Y value, to allow for the next pair to have a smaller X value, potentially fitting another pair where it couldn't fit before. If you replace a pair with one with the same Y value, you gain nothing. If the replacement has a larger Y value, all you've done is potentially block some pairs that would've fit before.
Because the pairs have been sorted by Y values, you will never find a replacement with a smaller Y. Looking "forward" in the sorted pairs, they will all have the same or greater Y value. Looking "backward", any that were eliminated from the chain initially were because the X value wasn't high enough, which will still be the case.
A basic way to solving this problem would be to create a tree, where each node is a pair, and construct directed edges from one node to another when it is legal (ie, "A pair (c,d) can follow (a,b) if and only if b is less than c"). You can do a modified breadth first traversal from each node that keeps track of the length of the longest path from that node, and when you are finished with that can just look over all the nodes to find the longest path.