-->

Determine which items in an array are part of long

2019-06-03 23:24发布

问题:

This is a variant of the longest increasing subsequence problem. Suppose that, instead of wanting to find a sequence or counting how many sequences there are, you wanted to identify items that can be part of some longest increasing sequence.

For example, in the list 1,2,4,3,0,5 all items except the zero can be part of a longest increasing subsequence.

What is a strategy to find these items? How efficiently can it be done?

回答1:

One method is to use the same dynamic algorithm as in the longest increasing subsequence problem, keeping track of the best item to precede the i'th item assuming it was the last item, but tweak it to keep track of ties. Then, after the best preceding items are known for every item, determine which items are reachable via that graph when starting from the top scoring items.

In the case of the example, 1,2,4,3,0,5, it would work like this:

  • 1 is at index 0, so give it a lis score of 1 with no previous index
  • 2 can be preceded by 1, so 2 gets a lis score of 1+1=2 and a previous index of 0
  • 4 can be preceded by 2 and 1, but 2 has a better lis score so 4 gets a lis score of 2+1=3 and a previous index of 1
  • 3 can be also preceded by 2 and 1, and again 2 has a better lis score so 3 gets a lis score of 2+1=3 and a previous index of 1
  • 0 can't be preceded by anything, so it gets a lis score of 1 with no previous index
  • 5 can be preceded by any of the other items, but 3 and 4 have the best lis scores (=3) so 5 gets a lis score of 3+1=4 with previous indexes of 2 or 3.

Now we take the top scoring items, just 5 in this case, and iteratively search backwards for items that can precede it. This will mark 5, 3, 4, 2 and 1 (but not 0) as can-be-in-longest-sequence.

This definitely runs in O(n^2) time, and I think by being clever it can be made to work in O(n lg n) time.

The main challenge for going from quadratic time is not making an explicit graph with 'can optimally precede' edges. A list like 1,1,1,1,1,1,...,2,2,2,2,2,2,... has a quadratic number of edges, so you need to avoid storing them all or exploring them all.



标签: algorithm lis