I am writing a program that takes in a list of start and end times for farmers milking cows and determines both the longest time where >=1 cow is being milked and the longest time where no cows are being milk.
In it, I've tried using this function. It's an exercise on complete search, but this isn't fast enough when there's a lot of data (I think because there are n^2 iterations).
timesIS
is simply a list of the times in increasing start order, and timesDE
is a list of the same times by decreasing end. timeIndex
is the position from which to start. For the longest interval of milking, my program later does this for every index and returns the longest interval.
While still keeping to a complete search, how can I make this more efficient (switch to something closer to n passes, perhaps)?
def nextCease(TimesIS, timesDE, timeIndex):
latestTime = TimesIS[timeIndex][1]
for j in range (0, len(timesDE)):
for i in range (0, len(timesDE)):
if timesDE[i][0]<=latestTime and timesDE[i][1]>=latestTime:
latestTime = timesDE[i][1]
if latestTime == timesDE[0][1]:
return latestTime
break
return latestTime
Here's a small piece of data input (first line is just the number of farmers):
6
100 200
200 400
400 800
800 1600
50 100
1700 3200
I think this is a minimal, complete, and verifiable example:
from operator import itemgetter
times = [[100,200], [200,400], [400,800], [800,1600], [50,100], [1700,3200]
def nextCease(TimesIS, timesDE, timeIndex):
latestTime = TimesIS[timeIndex][1]
for j in range (0, len(timesDE)):
for i in range (0, len(timesDE)):
if timesDE[i][0]<=latestTime and timesDE[i][1]>=latestTime:
latestTime = timesDE[i][1]
if latestTime == timesDE[0][1]:
return latestTime
break
return latestTime
timesIS = sorted(times[:], key=itemgetter(0)) #increasing starttimes
timesDE = sorted(times[:], key=itemgetter(1), reverse=True) #decreasing endtimes
longestIntervalMilk = 0
for i in range (0, len(times)):
interval = nextCease(timesIS, timesDE, i) - timesIS[i][0]
if interval > longestIntervalMilk:
longestIntervalMilk = interval
longestIntervalNoMilk = 0
latestFinish = 0
for i in range (0, len(times)):
latestFinish = nextCease(timesIS, timesDE, i)
timesIS2 = timesIS[:]
while(timesIS2[0][0] < latestFinish):
nextStartExists = True
del timesIS2[0]
if timesIS2 == []:
nextStartExists = False
break
if nextStartExists == True:
nextStart = timesIS2[0][0]
longestIntervalNoMilk = nextStart - latestFinish
print(str(longestIntervalMilk) + " " + str(longestIntervalNoMilk) + "\n"))
EDIT: In the meantime, I wrote up this. It gives the wrong output for a very long list (it's 1001 lines so I won't reprint it here, but you can find it at http://train.usaco.org/usacodatashow?a=iA4oZAAX7KZ) and I'm confused as to why:
times = sorted(times[:], key=itemgetter(0))
def longestMilkInterval(times):
earliestTime = times[0]
latestTime = times[0][1]
interval = 0
for i in range (1, len(times)):
if times[i][1] > latestTime and times[i][0] <= latestTime:
if times[i][1] - earliestTime[0] > interval:
interval = times[i][1] - earliestTime[0]
latestTime = times[i][1]
else:
earliestTime = times[i]
latestTime = times[i][1]
print(earliestTime)
return interval
def longestNoMilkInterval(times):
earliestTime = times[0][1]
interval = 0
for i in range (0, len(times)):
if times[i][0] >= earliestTime:
if times[i][0] - earliestTime > interval:
interval = times[i][0] - earliestTime
break
else:
earliestTime = times[i][1]
return interval
Output should be 912 184
(>=1 cow, 0 cow).
Here is a very straightforward approach which does it in one pass, including a sort, so the complexity would be
O(n*logn)
.items
: It puts the start/end itervals into a list of tuples(start_time, typ)
so we can traverse the list and if we see as
a new cow is being milked ande
then a cow is stopped being milked. This way we can have a counternof_cows_milked
at any time which is the basis for getting the "max time 0 cows milked" and "max time 1+ cows milked"last_intersection_time
and compares the duration to the max duration ofmax_0_cows
andmax_1_plus_cows
. Again, this part is not very pretty, maybe there are more elegant ways to solve that.Your algorithm basically just checks for the longest interval of a single tuple, but doesn't check for overlapping or adjacent tuples.
Take these intervals as an visual example:
Your code just finds the interval G-H, whereas you need to find C-F. You somewhere need to keep track of how many cows are milked in parallel, so you need at least the
nof_of_cows_milked
counter as in my code example.