Optimization of Algorithm [closed]

2019-04-12 21:04发布

问题:

Here is the link to the problem.

The problem asks the number of solutions to the Diophantine equation of the form 1/x + 1/y = 1/z (where z = n!). Rearranging the given equation clearly tells that the answer is the number of factors of z2.

So problem boils to find the number of factors of n!2.

My algorithm is as follows

  1. Make a boolean look up table for all primes <= n using Sieve of Eratosthenes algorithm.
  2. Iterate over all primes P <= n and find its exponent in n!. I did this using step function formula. Let the exponent be K, then the exponent of P in n!2 will be 2K.
  3. Using step 2 calculate number of factors with the standard formula.

For the worst case input of n = 106, my c code gives answer in 0.056s. I guess the complexity is no way greater than O(n lg n). But when I submitted the code on the site, I could pass only 3/15 test cases with the verdict on the rest as time limit exceeded.

I need some hint for optimizing this algorithm.

Code so far:

#include <stdio.h>
#include <string.h>

#define ULL unsigned long long int
#define MAXN 1000010
#define MOD 1000007

int isPrime[MAXN];

ULL solve(int N) {
    int i, j, f;
    ULL res = 1;
    memset(isPrime, 1, MAXN * sizeof(int));
    isPrime[0] = isPrime[1] = 0;
    for (i = 2; i * i <= N; ++i)
        if (isPrime[i])
            for (j = i * i; j <= N; j += i)
                isPrime[j] = 0;
    for (i = 2; i <= N; ++i) {
        if (isPrime[i]) {
            for (j = i, f = 0; j <= N; j *= i)
                f += N / j;
            f = ((f << 1) + 1) % MOD;
            res = (res * f) % MOD;
        }
    }
    return res % MOD;
}

int main() {
    int N;
    scanf("%d", &N);
    printf("%llu\n", solve(N));
    return 0;
}

回答1:

You have an int overflow here:

for (j = i, f = 0; j <= N; j *= i)

If 46340 < i < 65536 and for many larger i, in the second iteration j will be negative if overflow wraps around (it is undefined behaviour, so anything could happen).

Replace it with e.g.

for(j = N / i, f = 0; j > 0; j /= i) {
    f += j;
}

It is, however, unlikely that the extra iterations due to the overflow would cause a "time limit exceeded", that will likely only cause wrong results.

So the generic advice is

  • Sieve only odd numbers, perhaps also eliminate multiples of 3 from the sieve. Eliminating the odd numbers from the sieve roughly halves the time needed to sieve.
  • Don't use an entire int for the flags, use bits or at least chars. That gives better cache locality and a further speedup.