Write a program to find 100 largest numbers out of

2019-01-08 02:31发布

I recently attended an interview where I was asked "write a program to find 100 largest numbers out of an array of 1 billion numbers."

I was only able to give a brute force solution which was to sort the array in O(nlogn) time complexity and take the last 100 numbers.

Arrays.sort(array);

The interviewer was looking for a better time complexity, I tried a couple of other solutions but failed to answer him. Is there a better time complexity solution?

30条回答
孤傲高冷的网名
2楼-- · 2019-01-08 03:00

This question would be answered with N log(100) complexity (instead of N log N) with just one line of C++ code.

 std::vector<int> myvector = ...; // Define your 1 billion numbers. 
                                 // Assumed integer just for concreteness 
 std::partial_sort (myvector.begin(), myvector.begin()+100, myvector.end());

The final answer would be a vector where the first 100 elements are guaranteed to be the 100 biggest numbers of you array while the remaining elements are unordered

C++ STL (standard library) is quite handy for this kind of problems.

Note: I am not saying that this is the optimal solution, but it would have saved your interview.

查看更多
Bombasti
3楼-- · 2019-01-08 03:01

I would find out who had the time to put a billion numbers into an array and fire him. Must work for government. At least if you had a linked list you could insert a number into the middle without moving half a billion to make room. Even better a Btree allows for a binary search. Each comparison eliminates half of your total. A hash algorithm would allow you to populate the data structure like a checkerboard but not so good for sparse data. As it is your best bet is to have a solution array of 100 integers and keep track of the lowest number in your solution array so you can replace it when you come across a higher number in the original array. You would have to look at every element in the original array assuming it is not sorted to begin with.

查看更多
不美不萌又怎样
4楼-- · 2019-01-08 03:02

Another O(n) algorithm -

The algorithm finds the largest 100 by elimination

consider all the million numbers in their binary representation. Start from the most significant bit. Finding if the MSB is 1 can be a done by a boolean operation multiplication with an appropriate number. If there are more than 100 1's in these million eliminate the other numbers with zeros. Now of the remaining numbers proceed with the next most significant bit. keep a count of the number of remaining numbers after elimination and proceed as long as this number is greater than 100.

The major boolean operation can be an parallely done on GPUs

查看更多
啃猪蹄的小仙女
5楼-- · 2019-01-08 03:05

An very easy solution would be to iterate through the array 100 times. Which is O(n).

Each time you pull out the largest number (and change its value to the minimum value, so that you don't see it in the next iteration, or keep track of indexes of previous answers (by keeping track of indexes the original array can have multiple of the same number)). After 100 iterations, you have the 100 largest numbers.

查看更多
对你真心纯属浪费
6楼-- · 2019-01-08 03:05
  1. Use nth-element to get the 100'th element O(n)
  2. Iterate the second time but only once and output every element that is greater than this specific element.

Please note esp. the second step might be easy to compute in parallel! And it will also be efficiently when you need a million biggest elements.

查看更多
姐就是有狂的资本
7楼-- · 2019-01-08 03:06

You can use Quick select algorithm to find the number at the(by order) index [billion-101] and then iterate over the numbers and to find the numbers that biger from that number.

array={...the billion numbers...} 
result[100];

pivot=QuickSelect(array,billion-101);//O(N)

for(i=0;i<billion;i++)//O(N)
   if(array[i]>=pivot)
      result.add(array[i]);

This algorithm Time is: 2 X O(N) = O(N) (Average case performance)

The second option like Thomas Jungblut suggest is:

Use Heap building the MAX heap will take O(N),then the top 100 max numbers will be in the top of the Heap, all you need is to get them out from the heap(100 X O(Log(N)).

This algorithm Time is:O(N) + 100 X O(Log(N)) = O(N)

查看更多
登录 后发表回答