Binary search to find last element in sorted list

Posted 2019-09-01 07:09

Question:

I am searching through a dictionary of N messages, each carrying a unixtime, and I want to find the maximum number of messages (I call this the frequency) that fall inside an arbitrary 24-hour (86400-second) time slot. For example, if five messages have a unixtime within 24 hours of a given message, I want 5.

I want to accomplish this with binary search, but I am a bit lost on how best to implement it, and on whether I can use an existing binary search library.

This is how I do it with a search grid of 10 elements:

cur.execute('SELECT unixtime FROM MessageType1 WHERE userID ='+str(userID[index])+' ORDER BY unixtime asc')
AISmessages = cur.fetchall()
AISmessages = {index:x[0] for index,x in enumerate(AISmessages)}
for nextMessageIndex in range(messageIndex+1, len(AISmessages),10):
    if  AISmessages[nextMessageIndex] < message+(86400):
        # Count the number of occurrences
        frequency += 10
    elif AISmessages[nextMessageIndex-5] < message+(86400):
        if AISmessages[nextMessageIndex-2] < message+(86400):
            if AISmessages[nextMessageIndex-1] < message+(86400):
                frequency += 9
            else:
                frequency += 8
        elif AISmessages[nextMessageIndex-3] < message+(86400):
            frequency += 7
        elif AISmessages[nextMessageIndex-4] < message+(86400):
            frequency += 6
        else:
            frequency += 5
    elif AISmessages[nextMessageIndex-7] < message+(86400):
        if AISmessages[nextMessageIndex-6] < message+(86400):
            frequency += 4
        else:
            frequency += 3
    elif AISmessages[nextMessageIndex-9] < message+(86400):
        if AISmessages[nextMessageIndex-8]< message+(86400):
            frequency += 2
        else:
            frequency += 1
    else:
        break

I think I've screwed this one up as well, but I cannot figure out how. I know it does not work when the length of AISmessages isn't divisible by 10, for example.

How would I standardize this into a binary search that gives me the frequency of the messages inside a 24-hour time slot, for a dictionary with any number of elements?

Answer 1:

You can use bisect from the standard library. I'm not sure if I understood your problem correctly, but a solution may look something like this:

from bisect import bisect
frequency = bisect(AISmessages[messageIndex:], message+86400)
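Note that the one-liner above only works if AISmessages is a sorted list; the question builds a dict, which cannot be sliced. A minimal sketch, assuming the timestamps are first collected into a plain list (for instance with something like [x[0] for x in cur.fetchall()]), could pass bisect's lo argument instead of slicing, which avoids copying the tail of the list on every call. The function name count_in_window and the sample timestamps are made up for illustration.

```python
from bisect import bisect_right


def count_in_window(times, start_index, window=86400):
    """Count entries of the sorted list `times`, from `start_index` on,
    whose value is <= times[start_index] + window.

    The entry at `start_index` itself is included in the count.
    """
    upper = times[start_index] + window
    # lo=start_index restricts the search to times[start_index:]
    # without building a new list.
    end = bisect_right(times, upper, lo=start_index)
    return end - start_index


# Hypothetical unixtimes, already sorted ascending.
times = [1000, 2000, 50000, 87400, 90000, 200000]
print(count_in_window(times, 0))  # counts 1000, 2000, 50000, 87400
```

bisect_right (the same function as bisect) treats a message exactly 86400 seconds later as inside the window; bisect_left would exclude it, which matches the strict < used in the question.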

Example: the call below counts the items of the list a whose values lie within 30 of the entry at index 2, that entry included (assuming a is sorted):

>>> from bisect import bisect
>>> a = [4, 17, 31, 39, 41, 80, 82, 85, 86, 96]
>>> i = 2
>>> m = a[i] # 31
>>> bisect(a[i:], m+30)
3 # correct: 31, 39, 41
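To get the maximum frequency the question asks for, the same lookup can be repeated for every message as the start of a window, keeping the largest count. The following is only a sketch under a few assumptions: the timestamps are in a sorted list rather than a dict, the window is half-open (a message exactly 86400 seconds after the start is not counted, matching the < in the question), and the starting message counts toward its own window. The name max_frequency and the sample data are hypothetical.

```python
from bisect import bisect_left


def max_frequency(times, window=86400):
    """Return the largest number of timestamps that fall in any window
    [t, t + window) that starts at one of the timestamps in `times`.

    `times` must be sorted ascending.
    """
    best = 0
    for i, start in enumerate(times):
        # First index at or after i whose value is >= start + window;
        # every entry in times[i:end] lies strictly inside the window.
        end = bisect_left(times, start + window, lo=i)
        best = max(best, end - i)
    return best


# Hypothetical unixtimes, already sorted ascending.
sample = [0, 3600, 50000, 86399, 86400, 90000, 172800]
print(max_frequency(sample))  # 4
```

Each lookup is O(log N), so the whole scan is O(N log N) and works for any list length, with no special handling for lengths that are not a multiple of 10.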