Simple “maximum value in array” and complexity cal

2019-03-27 18:18发布

问题:

I'm pretty new to this stuff and I need your help.

I should build an efficient simple algorithm which returns the maximum value in an array with size of n which contains the numbers 1,2,...n with repetitions.

Then I have to determine the best running time, average running time and worst running time.

So I have two questions:

  1. First of all I'm trying to understand what's the idea of requiring an efficient solution for this simple algorithm. As far as I understand I should only have a simple loop from 1 to n and look for the maximum value. Is the "efficient" algorithm points out that If I find the value n in the array I can stop searching more values because this is the maximum value in the array?

  2. How do I determine the best running time and average running time using the fact, while calculating the average running time, that It's a uniform distribution. That is, each cell in the array has a chance of 1/n to be the maximum value.

Thanks a lot in advance!

回答1:

Best case - finding the max element as the first (O(1)), worst case - it is the last element checked (O(n)).

The tricky part is the average case.
To find the average case - we need the expected number of iterations!

Since you can stop after you find the maximum, we can split the problem into two parts:

  1. [0,n-1): Since on average (assuming uniform independent distribution for each element) - the number n has probability 1/n to be in each place, then the expected number of iterations for this part is 1/n + 2*((n-1)/n)/n + 3 * ((n-1)/n)^2/n + ... + (n-1) * ((n-1)/n)^(n-2)/n 1
    The above formula yields an ugly formula which is O(n)
  2. The last element is going to be checked if the first n-1 elements did not contain the value n: so you need to add to the above n* ((n-1)/n)^(n-1), which is O(n) as well (lim to infinity is 1/e * n).

This totals in O(n) average time solution.


(1): The formula for each element is j*((n-1)/n)^(j-1) * (1/n) because:

  • j - for the number of elements checked (number of iterations)
  • ((n-1)/n)^(j-1) - Probability that there is no n in the previous elements
  • (1/n) - Probability this number is n.


回答2:

The algorithm works like this, first you select a number (in this case i select the first number of the array and pretend it is the max, then i compare it to the next number and if it is larger I take that as the new max, until i finish searching in the array), the next code is in C:

#include <stdio.h>
#define SIZE 100

typedef struct{
    int val;
    int loc;
} find;

/* Functions declaration (Prototype) */
find maxFinder( int * const a );

int main( void )
{
    int response[ SIZE ]=
       { 1, 3, 5, 7,  8, 9, 0, 10, 65, 100,
         1, 3, 5, 7,  8, 9, 0, 10, 65, 100,
         1, 3, 5, 7,  8, 9, 0, 10, 65, 100,
         1, 3, 5, 7,  8, 9, 0, 10, 65, 100,
         1, 3, 5, 7,  8, 9, 0, 10, 65, 100,
         1, 3, 5, 7,  8, 9, 0, 10, 65, 100,
         1, 3, 5, 7,  8, 9, 0, 10, 65, 100,
         1, 3, 5, 7,  8, 9, 0, 10, 65, 100,
         1, 3, 5, 7,  8, 9, 0, 10, 65, 100,
         1, 3, 5, 7,  8, 9, 0, 10, 65, 100 };

    printf( "The max number is %d located in the index %d.\n", maxFinder( response ).val, maxFinder( response ).loc );
    return 0;
}

 find maxFinder( int * const a )
 {
    /* Local Variables initialization & declaration */
    int i;
    find result;

    result.loc = 0;
    result.val = *( a + 0 );

    for( i = 1; i < SIZE; i++ ){
        if ( result.val < *( a + i ) ){
            result.loc = i;
            result.val = *( a + i );
        }
    }
    return result;
 }


回答3:

If there is no prior information about the array (e.g. it's sorted), then there is no worst case or best case, and You have to scan all the elements to find out the Max, and it takes O(n) times.

Also, knowing the probability distribution of getting max value for each cell is useless in general (unless it reduces your search space. e.g., If you know that only constant number of cells have non-zero probability of getting the max value, then you just need to search those cells and it takes constant time). Thus, in general

Best-case running time = Worst-case running time = average running time = O(n)



回答4:

The worst and best cases are simple. The average case is more interesting. Look at the Wikipedia page for Geometric Distribution.