I was recently asked this question in an interview. Although I was able to come up with an O(n²) solution, the interviewer was obsessed with an O(n) solution. I also checked a few O(n log n) solutions, which I understood, but I still can't work out the O(n) solution, which assumes the appointments are sorted by start time.
Can anyone explain this?
Problem Statement: You are given n appointments. Each appointment contains a start time and an end time. You have to return all conflicting appointments efficiently.
Person: 1, 2, 3, 4, 5
App Start: 2, 4, 29, 10, 22
App End: 5, 7, 34, 11, 36
Answer: 2x1, 5x3 (person 2 conflicts with person 1, and person 5 with person 3)
O(n log n) algorithm: separate the start and end points like this:
2s, 4s, 29s, 10s, 22s, 5e, 7e, 34e, 11e, 36e
then sort all of these points (for simplicity, let's assume each point is unique):
2s, 4s, 5e, 7e, 10s, 11e, 22s, 29s, 34e, 36e
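If two starts occur in a row with no end between them, the appointments overlap:
2s and 4s are adjacent, so there is an overlap.
We keep a count of open appointments: each time we encounter an "s" we add 1, and each time we encounter an "e" we subtract 1. If the count is already above zero when an "s" arrives, that appointment overlaps an earlier one.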
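The start/end sweep described above can be sketched in Python roughly as follows. This is only an illustration: the function name and the (person, start, end) tuple layout are assumptions, and instead of a bare counter it keeps the set of currently open appointments so that the conflicting pairs can be reported. It also assumes every appointment ends strictly after it starts, and that an appointment ending exactly when another starts does not conflict with it.

```python
def find_conflicts_sweep(appointments):
    """appointments: list of (person, start, end) tuples (hypothetical layout).

    Returns the set of conflicting (person_a, person_b) pairs, person_a < person_b.
    Assumes end > start for every appointment.
    """
    events = []
    for person, start, end in appointments:
        events.append((start, 1, person))  # 1 marks a start point ("s")
        events.append((end, 0, person))    # 0 marks an end point ("e")
    # Sorting dominates the cost: O(n log n). At equal times an end (0)
    # sorts before a start (1), so touching appointments do not conflict.
    events.sort()

    active = set()     # appointments that have started but not yet ended
    conflicts = set()
    for _, kind, person in events:
        if kind == 1:
            # Any appointment still open overlaps the one starting now.
            for other in active:
                conflicts.add((min(person, other), max(person, other)))
            active.add(person)
        else:
            active.discard(person)
    return conflicts


appointments = [(1, 2, 5), (2, 4, 7), (3, 29, 34), (4, 10, 11), (5, 22, 36)]
print(sorted(find_conflicts_sweep(appointments)))  # [(1, 2), (3, 5)]
```

Reporting every pair costs time proportional to the number of pairs on top of the sort, so in the worst case (everything overlaps everything) the output alone is quadratic.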
The general problem cannot be solved in O(n).
At a minimum you need to sort by appointment start time, which requires O(n log n).
There is an O(n) solution if the list is already sorted. The algorithm basically involves checking whether the next appointment is overlapped by any previous ones. There is a bit of subtlety to this one as you actually need two pointers into the list as you run through it:
- The current appointment being checked
- The appointment with the latest end time encountered so far (which might not be the previous appointment)
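A minimal Python sketch of the two-pointer scan, assuming the input is already sorted by start time. The function name and tuple layout are made up for illustration, and an appointment that starts exactly when another ends is assumed not to conflict. It returns the set of appointments involved in at least one conflict, which can be done in a single pass; listing every conflicting pair could take longer than O(n), simply because there may be more than O(n) pairs.

```python
def conflicts_sorted(appointments):
    """appointments: list of (person, start, end), already sorted by start.

    Returns the set of persons whose appointment overlaps some other one.
    One pass over the list, so O(n).
    """
    conflicting = set()
    latest = None  # appointment with the latest end time encountered so far
    for current in appointments:
        person, start, end = current
        # If anything earlier is still running, the latest-ending one is.
        if latest is not None and start < latest[2]:
            conflicting.add(person)
            conflicting.add(latest[0])
        # The current appointment may now be the one that ends last.
        if latest is None or end > latest[2]:
            latest = current
    return conflicting


# The question's data, sorted by start time
sorted_appts = [(1, 2, 5), (2, 4, 7), (4, 10, 11), (5, 22, 36), (3, 29, 34)]
print(sorted(conflicts_sorted(sorted_appts)))  # [1, 2, 3, 5]
```

Tracking the latest end time rather than the previous appointment matters for cases like (0, 10), (1, 2), (3, 4): the third appointment does not overlap the second, but it does overlap the first.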
O(n) solutions for the unsorted case could only exist if you have other constraints, e.g. a fixed number of appointment timeslots. If this were the case, you could use HashSets to determine which appointment(s) cover each timeslot. The algorithm is roughly as follows:
- Create a HashSet for each timeslot - O(1) since timeslot number is a fixed constant
- For each appointment, store its ID number in HashSets of slot(s) that it covers - O(n) since updating a constant number of timeslots is O(1) for each appointment
- Run through the slots, checking for overlaps - O(1) (or O(n) if you want to iterate over the overlapping appointments to return them as results)
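The three steps above might look like this in Python, using the built-in set in place of a HashSet. The slot count of 48 (half-hour slots in a day), the function name and the tuple layout are assumptions chosen for the example; each appointment is taken to cover the slots from first_slot up to, but not including, end_slot.

```python
NUM_SLOTS = 48  # assumption: one day split into 48 half-hour slots


def conflicts_by_slot(appointments, num_slots=NUM_SLOTS):
    """appointments: list of (person, first_slot, end_slot), unsorted.

    Returns the set of persons who share at least one slot with someone else.
    """
    # Step 1: one set per timeslot; constant work, since num_slots is fixed.
    slots = [set() for _ in range(num_slots)]

    # Step 2: record each appointment in every slot it covers.
    # At most num_slots updates per appointment, so O(n) overall.
    for person, first_slot, end_slot in appointments:
        for slot in range(first_slot, end_slot):
            slots[slot].add(person)

    # Step 3: any slot holding more than one appointment is an overlap.
    conflicting = set()
    for occupants in slots:
        if len(occupants) > 1:
            conflicting |= occupants
    return conflicting


# The question's data, with the times read as slot numbers
appointments = [(1, 2, 5), (2, 4, 7), (3, 29, 34), (4, 10, 11), (5, 22, 36)]
print(sorted(conflicts_by_slot(appointments)))  # [1, 2, 3, 5]
```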
Assuming you have some constraint on the start and end times, and on the resolution at which you do the scheduling, it seems like it would be fairly easy to turn each appointment into a bitmap of times it does/doesn't use, then do a counting sort (aka bucket sort) on the in-use slots. Since both of those are linear, the result should be linear (though if I'm thinking correctly, it should be linear on the number of time slots rather than the number of appointments).
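One way to read the bounded-resolution idea, sketched in Python: bucket the appointments by start slot (a counting/bucket sort, linear in the number of appointments plus the number of slots), then make a single pass in start order. The function name, the tuple layout and the slot count in the example are assumptions, and this is only one possible interpretation of the suggestion rather than the exact bitmap scheme.

```python
def conflicts_bounded(appointments, num_slots):
    """appointments: list of (person, start_slot, end_slot), unsorted,
    with 0 <= start_slot < num_slots.

    Returns the set of persons involved in at least one overlap.
    Runs in O(n + num_slots).
    """
    # Bucket sort by start slot: no comparisons, just indexing.
    buckets = [[] for _ in range(num_slots)]
    for appt in appointments:
        buckets[appt[1]].append(appt)

    conflicting = set()
    latest = None  # (person, end) of the latest-ending appointment so far
    for bucket in buckets:              # buckets are visited in start order
        for person, start, end in bucket:
            if latest is not None and start < latest[1]:
                conflicting.update((person, latest[0]))
            if latest is None or end > latest[1]:
                latest = (person, end)
    return conflicting


appointments = [(1, 2, 5), (2, 4, 7), (3, 29, 34), (4, 10, 11), (5, 22, 36)]
print(sorted(conflicts_bounded(appointments, num_slots=48)))  # [1, 2, 3, 5]
```

As the answer notes, the cost depends on the number of slots as well as the number of appointments, so this only pays off when the time range and resolution are modest.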
At least if I asked this as an interview question, the main thing I'd be hoping for is the candidate to ask about those constraints (i.e., whether those constraints are allowed). Given the degree to which it's unrealistic to schedule appointments for 1000 years from now, or schedule to a precision of even a minute (not to mention something like a nanosecond), they strike me as reasonable constraints, but you should ask before assuming them.
A naive approach might be to build two parallel trees, one ordered by the beginning point, and one ordered by the ending point of each interval. This allows discarding half of each tree in O(log n) time, but the results must be merged, requiring O(n) time. This gives us queries in O(n + log n) = O(n).
This is the best I can think of, in horrible pseudocode. I attempted to reduce the problem as much as possible. This is only less than O(n²) (I think).
Note that a given appointment's output line will not list every appointment it conflicts with, only the later ones, but every conflict is displayed on some line.
Also note that I renamed the appointments numerically in order of starting time.
Output would be something like the following:
Appointment 1 conflicts with 2
Appointment 2 conflicts with
Appointment 3 conflicts with
Appointment 4 conflicts with 5
Appointment 5 conflicts with
       appt{1}  appt{2}  appt{3}  appt{4}  appt{5}
start: 2        4        10       22       29
end:   5        7        11       36       34
pseudocode
list = (1,2,3,4,5)
for (i=1; i<=n; i++)
{ list.shift()                           ** removes first element
  appt{i}.conflictswith() = copy of list ** every later appointment is a candidate
}
for (i=1; i<=n; i++)
{ number = n
  done = false
  while (done == false)
  { if (number > i)
    { if (appt{i}.endtime() < appt{number}.starttime())
      { appt{i}.conflictswith().pop() }  ** appt{number} starts after appt{i} ends
      else
      { done = true }                    ** everything earlier in the list overlaps too
      number--
    }
    else
    { done = true }
  }
}
for (i=1; i<=n; i++)
  print "Appointment ", i, " conflicts with: ", appt{i}.conflictswith()
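For anyone who wants to run it, here is a rough Python rendering of the pseudocode above. The function name and the (start, end) tuple layout are made up for the example. It follows the pseudocode's comparison, so an appointment that starts exactly when another ends is kept as a conflict. Building the candidate lists alone takes quadratic time in this form.

```python
def conflicts_with_later(appts):
    """appts: list of (start, end) tuples sorted by start time.

    Returns {i: [numbers of later appointments that overlap appointment i]},
    with appointments numbered from 1 as in the pseudocode.
    """
    n = len(appts)
    result = {}
    for i in range(1, n + 1):
        # Start with every later appointment as a candidate.
        later = list(range(i + 1, n + 1))
        end_i = appts[i - 1][1]
        # Drop candidates from the back while they start after i has ended.
        # Because the list is sorted by start, whatever remains overlaps i.
        while later and end_i < appts[later[-1] - 1][0]:
            later.pop()
        result[i] = later
    return result


appts = [(2, 5), (4, 7), (10, 11), (22, 36), (29, 34)]
for i, others in conflicts_with_later(appts).items():
    print("Appointment", i, "conflicts with:", *others)
```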
I came across a data structure called an interval tree. Once the tree is built, it can find the intervals that overlap a given one in less than O(n log n) time, depending on the data provided.