find minimum-length subarray that has all numbers

2020-06-04 04:17发布

问题:

File input.txt consists of two lines: first has integer number N space then integer number K (1 ≤ N,K ≤ 250000). Second has N space-delimeted integers, where each integer is less than or equal to K. It is guaranteed that each integer from 1 to K is in the array. The task is to find subarray of minimum length, that contains all integers. And print its start and end. Note, that indexing starts from 1.

Examples:

Input         Output
5 3           2 4
1 2 1 3 2

6 4           2 6
2 4 2 3 3 1  

I had this task in a recent programming competition. It is over, and I am not cheating. I've implemented it using python 3:

with open('input.txt') as file:
    N, K = [int(x) for x in file.readline().split()]
    alley = [int(x) for x in file.readline().split()]

trees = {}
min_idx = (1, N)
min_length = N
for i in range(N):
    trees[alley[i]] = i
    if len(trees) == K:
        idx = (min(trees.values())+1, max(trees.values())+1)
        length = idx[1] - idx[0] + 1
        if length < min_length:
            min_idx = idx
            min_length = length
        if min_length == K:
            break


print (str(min_idx[0]) + " " + str(min_idx[1]))

The idea is to save last position of i-th tree into a dictionary and if dictionary contains all items, check if this subarray is minimum.

16th test showed that my algorithm exceeded time limit, which was 1 second. I think, that my algorithm is O(N), because it finishes in one run across array, and map access costs O(1).

How can one speed up this algorithm? Can be complexity reduced or is it my misunderstanding of some Python which takes much time?

回答1:

Your algorithm is good but ignoring the time that len(trees) < K, it's O(NK) because every call to min and max is O(K). There's no need to call max because max(trees.values()) == i. Dealing with min is trickier, but if you keep track of which key corresponds to the minimum index then you can recalculate it only when that key is updated.

A minor point is that your last if doesn't always need to be checked.

Overall:

trees = {}
min_idx = (1, N)
min_length = N
first_index = -1
for i in range(N):
    trees[alley[i]] = i
    if len(trees) == K:
        if first_index == -1 or alley[first_index] == alley[i]:
            first_index = min(trees.values())
        idx = (first_index+1, i+1)
        length = idx[1] - idx[0] + 1
        if length < min_length:
            min_idx = idx
            min_length = length
            if min_length == K:
                break


回答2:

Make integer array Counts[K], fill with zeros.
Keep some variables - left index L, right index R (like your idx[0] and idx[1]), zero count Z.
Set L and R to 1, increment Counts[A[1]], set Z to K-1

Move R, incrementing Counts[A[1]], and decrement Z if zero entry is updated, until Z becomes 0
At this moment subarray [L..R] contains all values from to K

Now move L, decrementing Counts entry for values leaving the window. Increment Z if some entry becomes 0. When Z becomes non-zero, stop moving L and move R again.

When R reaches N and L stops, process is over. Minimum length is minimum from valid (R-L+1) pairs

Example run for your [1 2 1 3 2]

Move R
1 0 0  Z=2
1 1 0  Z=1
2 1 0  Z=1
2 1 1  Z=0
Move L
1 1 1  Z=0
1 0 1  Z=1 Stop moving L, check previous L,R pair 2,4
Move R
1 1 1  Z=0
move L
9 1 1  Z=1 Stop moving L, check previous L,R pair 3,5