This is an interview question. Design a class, which stores integers and provides two operations:
void insert(int k)
int getMedian()
I guess I can use BST so that insert
takes O(logN) and getMedian
takes O(logN) (for getMedian
I should add the number of of left/right children for each node).
Now I wonder if this is the most efficient solution and there is no better one.
You can use 2 heaps, that we will call Left
and Right
.
Left
is a Max-Heap
.
Right
is a Min-Heap
.
Insertion is done like this:
- If the new element
x
is smaller than the root of Left
then we insert x
to Left
.
- Else we insert
x
to Right
.
- If after insertion
Left
has count of elements that is greater than 1 from the count of elements of Right
, then we call Extract-Max on Left
and insert it to Right
.
- Else if after insertion
Right
has count of elements that is greater than the count of elements of Left
, then we call Extract-Min on Right
and insert it to Left
.
The median is always the root of Left
.
So insertion is done in O(lg n)
time and getting the median is done in O(1)
time.
See this Stack Overflow question for a solution that involves two heaps.
Would it beat an array of integers witch performs a sort at insertion time with a sort algorithm dedicated for integer (http://en.wikipedia.org/wiki/Sorting_algorithm) if you choose your candidate amongst O < O(log(n)) and using an array, then getMedian would be taking the index at half of the size would be O(1), no? It seems possible to me to do better than log(n) + log(n).
Plus by being a little more flexible you can improve your performance by changing your sort algorithm according to the properties of your input (are the input almost sorted or not ...).
I am pretty much autodidact in computer science, but that is the way I would do it: simpler is better.
You could consider a self-balancing tree, too. If the tree is fully balanced, then the root node is your median. Say, the tree is one-level deeper on one end. Then, you just need to know how many nodes are there in the deeper-side to pick the correct median.