可以将文章内容翻译成中文,广告屏蔽插件可能会导致该功能失效(如失效,请关闭广告屏蔽插件后再试):
问题:
Someone asked me a brainteaser, and I don't know; my knowledge slows down after amortized analysis, and in this case, this is O(n).
public int findMax(array) {
int count = 0;
int max = array[0];
for (int i=0; i<array.length; i++) {
if (array[i] > max) {
count++;
max = array[i];
}
}
return count;
}
What's the expected value of count
for an array of size n?
Numbers are randomly picked from a uniform distribution.
回答1:
Let f(n) be the average number of assignments.
Then if the last element is not the largest, f(n) = f(n-1).
If the last element is the largest, then f(n) = f(n-1) + 1.
Since the last number is largest with probability 1/n
, and not the largest with probability (n-1)/n
, we have:
f(n) = (n-1)/n*f(n-1) + 1/n*(f(n-1) + 1)
Expand and collect terms to get:
f(n) = f(n-1) + 1/n
And f(1) = 0. So:
f(1) = 0
f(2) = 0 + 1/2
f(3) = 0 + 1/2 + 1/3
f(4) = 0 + 1/2 + 1/3 + 1/4
That is, f(n) is the n_th "Harmonic number", which you can get in closed form only approximately. (Well, one less than the n_th Harmonic number. The problem would be prettier if you initialized max
to INT_MIN
and just let the loop run, so that f(1) = 1.)
The above is not a rigorous proof, since I was sloppy about expected values versus actual values. But I believe the answer is right anyway :-).
回答2:
I would like to comment on Nemo's answer, but I don't have the reputation to comment. His correct answer can be simplified:
The chance that the second number is larger than the first is 1/2. Regardless of that, the chance that the 3rd number is larger than two before, is 1/3. These are all independent chances and the total expectation is therefore
1/2 + 1/3 + 1/4 + .. + 1/n
回答3:
You can actually take this analysis a step further when the value of each item comes from a finite set. Let E(N, M) be the expected number of assignments when finding the max of N elements that come uniformly from an alphabet of size M. Then we can say...
E(0, M) = E(N, 0) = 0
E(N, M) = 1 + SUM[SUM[E(j, i) * (N - 1 Choose j) * ((M - i) / M)^(N-j-1) * (i / M) ^ j : j from 0 to N - 1] : i from 0 to M - 1]
This is a bit hard to come up with a closed form for but we can be sure that E(N, M) is in O(log(min(N, M))). This is because E(N, INF) is in THETA(log(N)) as the harmonic series sum grows proportional to the log function and E(N, M) < E(N, M + 1). Likewise when M < N we have E(N, M) < E(M, INF) as there is at M unique values.
And here's some code to compute E(N, M) yourself. I wonder if anyone can get this to a closed form?
#define N 100
#define M 100
double NCR[N + 1][M + 1];
double E[N + 1][M + 1];
int main() {
NCR[0][0] = 1;
for(int i = 1; i <= N; i++) {
NCR[i][0] = NCR[i][i] = 1;
for(int j = 1; j < i; j++) {
NCR[i][j] = NCR[i - 1][j - 1] + NCR[i - 1][j];
}
}
for(int n = 1; n <= N; n++) {
for(int m = 1; m <= M; m++) {
E[n][m] = 1;
for(int i = 1; i < m; i++) {
for(int j = 1; j < n; j++) {
E[n][m] += NCR[n - 1][j] *
pow(1.0 * (m - i) / m, n - j - 1) *
pow(1.0 * i / m, j) * E[j][i] / m;
}
}
}
}
cout << E[N][M] << endl;
}
回答4:
I am assuming all elements are distinct and counting the initial assignment to max
outside the for loop.
If the array is sorted in increasing order, the variable max
gets assigned to exactly n times (each time it gets a greater value).
If the array is sorted in decreasing order, the variable max
gets assigned to exactly once (it gets assigned the first time and all subsequent values are smaller).
Edit:
My formulation for a randomly permuted array was actually wrong, as pointed out in the comments. I think @Nemo posts the correct answer to this.
I think just counting the number of assignments is not really a true measure of the cost of this function. whether or not we actually update the value of max
, we are actually comparing it exactly n times. So, fewer assignments does not really imply less work done.
Also observe that there are actually no swaps being done. Only assignments and comparisons.