I was recently asked this question in an interview. Although I was able to come up with an O(n²) solution, the interviewer insisted on an O(n) solution. I also looked at a few O(n log n) solutions, which I understood, but I still can't follow the O(n) solution, which assumes the appointments are sorted by start time.
Can anyone explain this?
Problem Statement: You are given n appointments. Each appointment has a start time and an end time. You have to return all conflicting appointments efficiently.
Person:    1, 2, 3, 4, 5
App Start: 2, 4, 29, 10, 22
App End:   5, 7, 34, 11, 36
Answer: 2x1 5x3
O(n log n) algorithm: separate the start and end points like this:
2s, 4s, 29s, 10s, 22s, 5e, 7e, 34e, 11e, 36e
then sort all of these points (for simplicity, let's assume each point is unique):
2s, 4s, 5e, 7e, 10s, 11e, 22s, 29s, 34e, 36e
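If two starts appear consecutively with no end between them, the appointments overlap: 2s and 4s are adjacent, so there is an overlap.
We keep a count of open appointments: each time we encounter an "s" we increase the count by 1, and each time we encounter an "e" we decrease it by 1. Whenever the count goes above 1, at least two appointments are open at the same time.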
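A minimal Python sketch of this sweep, for illustration only. The function name `find_conflicts` and the dict-based input are my own assumptions, not part of the question. Instead of a bare counter it keeps the set of currently open appointments (its size is the count described above), so the conflicting pairs can be reported. Ends are sorted before starts at the same time, which treats back-to-back appointments as non-conflicting; that is also an assumption.

```python
def find_conflicts(appointments):
    """appointments: dict mapping person -> (start, end).

    Returns a list of (earlier_person, later_person) pairs that overlap.
    """
    events = []
    for person, (start, end) in appointments.items():
        events.append((start, 1, person))  # 1 = start event
        events.append((end, 0, person))    # 0 = end event, sorts first on ties
    events.sort()                          # the O(n log n) step

    active = set()   # appointments that have started but not yet ended
    conflicts = []
    for _, kind, person in events:
        if kind == 1:
            # Everything still open overlaps the appointment starting now.
            conflicts.extend((other, person) for other in sorted(active))
            active.add(person)
        else:
            active.discard(person)
    return conflicts


example = {1: (2, 5), 2: (4, 7), 3: (29, 34), 4: (10, 11), 5: (22, 36)}
print(find_conflicts(example))  # [(1, 2), (5, 3)]
```

On the example from the question this reports the same two conflicts as the expected answer (2x1 and 5x3). Note that listing every pair costs extra time proportional to the number of pairs found.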
Assuming you have some constraint on the start and end times, and on the resolution at which you do the scheduling, it seems like it would be fairly easy to turn each appointment into a bitmap of times it does/doesn't use, then do a counting sort (aka bucket sort) on the in-use slots. Since both of those are linear, the result should be linear (though if I'm thinking correctly, it should be linear on the number of time slots rather than the number of appointments).
At least if I asked this as an interview question, the main thing I'd be hoping for is the candidate to ask about those constraints (i.e., whether those constraints are allowed). Given the degree to which it's unrealistic to schedule appointments for 1000 years from now, or schedule to a precision of even a minute (not to mention something like a nanosecond), they strike me as reasonable constraints, but you should ask before assuming them.
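One possible reading of the bounded-timeslot idea, sketched in Python. This is not necessarily what the answer above had in mind: instead of a literal bitmap per appointment it uses a difference array over the slots, which is a substitution on my part. The names `conflicting_people` and `num_slots` are hypothetical, and appointments are assumed to be half-open ranges `[start, end)` of integer slots with `end <= num_slots`.

```python
def conflicting_people(appointments, num_slots):
    """appointments: dict mapping person -> (start, end), integer slots.

    Returns the people whose appointment shares at least one slot with
    another appointment. Runs in O(n + num_slots) time.
    """
    # diff[t] is +1 for every appointment starting at t, -1 for every one ending.
    diff = [0] * (num_slots + 1)
    for start, end in appointments.values():
        diff[start] += 1
        diff[end] -= 1

    # overbooked_before[t] = number of slots in [0, t) used by 2+ appointments.
    overbooked_before = [0] * (num_slots + 1)
    occupancy = 0
    for t in range(num_slots):
        occupancy += diff[t]
        overbooked_before[t + 1] = overbooked_before[t] + (1 if occupancy > 1 else 0)

    # An appointment conflicts if any slot it covers is overbooked.
    return [person for person, (start, end) in appointments.items()
            if overbooked_before[end] > overbooked_before[start]]


example = {1: (2, 5), 2: (4, 7), 3: (29, 34), 4: (10, 11), 5: (22, 36)}
print(conflicting_people(example, num_slots=40))  # [1, 2, 3, 5]
```

As the answer notes, the cost depends on the number of time slots as well as the number of appointments, so this only pays off when the slot range is small.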
The general solution to this problem is not possible in O(n).
At a minimum you need to sort by appointment start time, which requires O(n log n).
There is an O(n) solution if the list is already sorted. The algorithm basically involves checking whether the next appointment is overlapped by any previous ones. There is a bit of subtlety to this one as you actually need two pointers into the list as you run through it:
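A hedged sketch of how the sorted case could look in Python. The function name `conflicts_sorted` and the tuple layout are assumptions of mine. The two "pointers" here are the current index and the index of the earlier appointment with the latest end time. It reports which appointments are involved in a conflict rather than every pair, since listing all pairs can by itself take O(n²) when many appointments overlap.

```python
def conflicts_sorted(appointments):
    """appointments: list of (person, start, end), already sorted by start.

    Returns the set of people involved in at least one conflict, in O(n).
    """
    conflicting = set()
    latest = 0  # index of the earlier appointment that ends last
    for i in range(1, len(appointments)):
        person, start, end = appointments[i]
        latest_person, _, latest_end = appointments[latest]
        if start < latest_end:
            # The current appointment begins before an earlier one has ended.
            conflicting.add(person)
            conflicting.add(latest_person)
        if end > latest_end:
            latest = i
    return conflicting


sorted_example = [(1, 2, 5), (2, 4, 7), (4, 10, 11), (5, 22, 36), (3, 29, 34)]
print(sorted(conflicts_sorted(sorted_example)))  # [1, 2, 3, 5]
```

Tracking only the latest-ending earlier appointment is enough: if the current appointment overlaps any earlier one, it also overlaps the one that ends last.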
O(n) solutions for the unsorted case could only exist if you have other constraints, e.g. a fixed number of appointment timeslots. If this was the case, then you can use HashSets to determine which appointment(s) cover each timeslot, algorithm roughly as follows:
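A rough Python sketch of the hash-set idea, under the assumption that times are integer slot numbers and each appointment covers the half-open range `[start, end)`. The name `conflicts_by_slot` is made up for illustration, and Python's `set` stands in for a HashSet.

```python
from collections import defaultdict
from itertools import combinations


def conflicts_by_slot(appointments):
    """appointments: dict mapping person -> (start, end), integer slots.

    Returns the set of (a, b) pairs, a < b, that share at least one slot.
    """
    # Map each timeslot to the set of people whose appointment covers it.
    slot_to_people = defaultdict(set)
    for person, (start, end) in appointments.items():
        for slot in range(start, end):
            slot_to_people[slot].add(person)

    # Any slot covered by more than one person yields conflicting pairs.
    conflicts = set()
    for people in slot_to_people.values():
        if len(people) > 1:
            conflicts.update(combinations(sorted(people), 2))
    return conflicts


example = {1: (2, 5), 2: (4, 7), 3: (29, 34), 4: (10, 11), 5: (22, 36)}
print(sorted(conflicts_by_slot(example)))  # [(1, 2), (3, 5)]
```

The work is proportional to the total number of slots covered, so it is linear in n only when the number of slots per appointment is bounded by a constant.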
This is the best I can think of, in horrible pseudocode. I attempted to reduce the problem as much as possible. It only comes in under O(n²) (I think).
Note that the output at the end will not show every appointment that a given appointment will conflict with on that appointment's specific output line...but at some point every conflict is displayed.
Also note that I renamed the appointments numerically in order of starting time.
pseudocode
A naive approach might be to build two parallel trees, one ordered by the beginning point, and one ordered by the ending point of each interval. This allows discarding half of each tree in O(log n) time, but the results must be merged, requiring O(n) time. This gives us queries in O(n + log n) = O(n).
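To make the naive two-structure query concrete, here is a small Python sketch. It swaps the two trees for two sorted lists searched with `bisect`, which gives the same O(log n) cut followed by an O(n) merge; the class name `NaiveIntervalIndex` and its methods are hypothetical. A stored interval is taken to overlap the query when it starts before the query ends and ends after the query starts.

```python
from bisect import bisect_left, bisect_right


class NaiveIntervalIndex:
    def __init__(self, intervals):
        """intervals: dict mapping name -> (start, end)."""
        # One ordering by start point, one by end point.
        self.by_start = sorted((s, name) for name, (s, e) in intervals.items())
        self.by_end = sorted((e, name) for name, (s, e) in intervals.items())
        self.starts = [s for s, _ in self.by_start]
        self.ends = [e for e, _ in self.by_end]

    def query(self, q_start, q_end):
        """Return the names of stored intervals overlapping (q_start, q_end)."""
        # O(log n): discard intervals that start at or after the query end.
        hi = bisect_left(self.starts, q_end)
        begins_early = {name for _, name in self.by_start[:hi]}
        # O(log n): discard intervals that end at or before the query start.
        lo = bisect_right(self.ends, q_start)
        ends_late = {name for _, name in self.by_end[lo:]}
        # O(n): merge the two candidate sets.
        return begins_early & ends_late


index = NaiveIntervalIndex({1: (2, 5), 2: (4, 7), 3: (29, 34), 4: (10, 11), 5: (22, 36)})
print(sorted(index.query(30, 31)))  # [3, 5]
```

The set intersection is the merge step that keeps each query at O(n) despite the logarithmic searches.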
I came across a data structure called an interval tree, with which we can find overlapping intervals in less than O(n log n) time, depending on the data provided.