Find the sum of the digits of all the numbers from

2019-06-24 20:05发布

问题:

This question already has an answer here:

  • Sum of Digits till a number which is given as input 2 answers

Problem: Find the sum of the digits of all the numbers from 1 to N (both ends included)

Time Complexity should be O(logN)

For N = 10 the sum is 1+2+3+4+5+6+7+8+9+(1+0) = 46

For N = 11 the sum is 1+2+3+4+5+6+7+8+9+(1+0)+(1+1) = 48

For N = 12 the sum is 1+2+3+4+5+6+7+8+9+(1+0)+(1+1) +(1+2)= 51

This recursive solution works like a charm, but I'd like to understand the rationale for reaching such a solution. I believe it's based on finite induction, but can someone show exactly how to solve this problem?

I've pasted (with minor modifications) the aforementioned solution:

static long Solution(long n)
{
    if (n <= 0)
        return 0;
    if (n < 10)
        return (n * (n + 1)) / 2; // sum of arithmetic progression
    long x = long.Parse(n.ToString().Substring(0, 1)); // first digit
    long y = long.Parse(n.ToString().Substring(1)); // remaining digits
    int power = (int)Math.Pow(10, n.ToString().Length - 1);

    // how to reach this recursive solution?
    return (power * Solution(x - 1))
    + (x * (y + 1)) 
    + (x * Solution(power - 1)) 
    + Solution(y);
}

Unit test (which is NOT O(logN)):

    long count = 0;
    for (int i=1; i<=N; i++)
    {
        foreach (var c in i.ToString().ToCharArray())
            count += int.Parse(c.ToString());
    }

Or:

Enumerable.Range(1, N).SelectMany(
    n => n.ToString().ToCharArray().Select(
        c => int.Parse(c.ToString())
    )
).Sum();

回答1:

This is actually a O(n^log10(2))-time solution (log10(2) is approximately 0.3). Not sure if that matters. We have n = xy, where I use concatenation to denote concatenation, not multiplication. Here are the four key lines with commentary underneath.

return (power * Solution(x - 1))

This counts the contribution of the x place for the numbers from 1 inclusive to x*power exclusive. This recursive call doesn't contribute to the complexity because it returns in constant time.

+ (x * (y + 1))

This counts the contribution of the x place for the numbers from x*power inclusive to n inclusive.

+ (x * Solution(power - 1))

This counts the contribution of the lower-order places for the numbers from 1 inclusive to x*power exclusive. This call is on a number one digit shorter than n.

+ Solution(y);

This counts the contribution of the lower-order places for the numbers from x*power inclusive to n inclusive. This call is on a number one digit shorter than n.

We get the time bound from applying Case 1 of the Master Theorem. To get the running time down to O(log n), we can compute Solution(power - 1) analytically. I don't remember offhand what the closed form is.



回答2:

After thinking for a while (and finding similar answers), I guess I could achieve the rationale that gave me another solution.

Definitions
Let S(n) be the sum of the digits of all numbers 0 <= k < n.
Let D(k) be the plain digits sum of k only.
(I'll omit parentheses for >clarity, so consider Dx = D(x)

If n>=10, let's decompose n by spliting the last digit and the tens (n = 10*k + r) (k, r being integers)

We need to sum S(n) = S(10*k + r) = S(10*k) + D(10*k+1) + ... + D(10*k+r)

The first part, S(10*k), follows a pattern:
S(10*1)=D1+D2+D3+...+D9 =(1+2+3+...+9) *1 + D10
S(10*2)=D1+D2+D3+...+D19 =(1+2+3+...+9) *2 +1*9 +D10 + D20
S(10*3)=D1+D2+D3+...+D29 =(1+2+3+...+9) *3 +1*9+2*9 +D10+...+D20 + D30

So S(10*k) = (1+2+3+...+9)*k + 9*S(k-1) + S(k-1) + D(10*k) = 45*k + 10*S(k-1) + D(10*k)

Regarding the last part, we know that D(10*k+x) = D(10*k)+D(x) = D(k)+x, so this last part can be simplified:

D(10*k+1) + ... + D(10*k+r) = D(k)+1 + D(k)+2 + ... D(k)+r = rD(k) + (1+2+...+r) = rD(k) + r*(1+r)/2

So, adding both parts of the equation (and grouping D(k)) we have:
S(n) = 45*k + 10*S(k-1) + (1+r)D(k) + r*(1+r)/2

And replacing k and r we have:
S(n) = 45*k + 10*S((n/10)-1) + (1+n%10)D(n/10) + n%10(1+n%10)/2

Pseudocode:

S(n):
  if n=0, sum=0
  if n<10, n*(1+n)/2
  r=n%10 # let's decompose n = 10*k + r (being k, r integers). 
  k=n/10
  return 45*k + 10*S((n/10)-1) + (1+n%10)*D(n/10) + n%10*(1+n%10)/2
D(n):
  just sum digits

First algorithm (the one from the original question) in C#

static BigInteger Solution(BigInteger n)
{
    if (n <= 0)
        return 0;
    if (n < 10)
        return (n * (n + 1)) / 2; // sum of arithmetic progression
    long x = long.Parse(n.ToString().Substring(0, 1)); // first digit
    long y = long.Parse(n.ToString().Substring(1)); // remaining digits
    BigInteger power = BigInteger.Pow(10, n.ToString().Length - 1);
    var log = Math.Round(BigInteger.Log10(power)); // BigInteger.Log10 can give rounding errors like 2.99999

    return (power * Solution(x - 1)) //This counts the contribution of the x place for the numbers from 1 inclusive to x*power exclusive. This recursive call doesn't contribute to the complexity because it returns in constant time.
    + (x * (y + 1)) //This counts the contribution of the x place for the numbers from x*power inclusive to n inclusive.
    //+ (x * Solution(power - 1)) // This counts the contribution of the lower-order places for the numbers from 1 inclusive to x*power exclusive. This call is on a number one digit shorter than n.
    + (x * 45*new BigInteger(log)* BigInteger.Pow(10,(int)log-1)) // 
    + Solution(y);
}

Second algorithm (deduced from formula above) in C#

static BigInteger Solution2(BigInteger n)
{
    if (n <= 0)
        return 0;
    if (n < 10)
        return (n * (n + 1)) / 2; // sum of arithmetic progression
    BigInteger r = BigInteger.ModPow(n, 1, 10); // decompose n = 10*k + r
    BigInteger k = BigInteger.Divide(n, 10);
    return 45 * k
         + 10*Solution2(k-1)                      // 10*S((n/10)-1)
         + (1+r) * (k.ToString().ToCharArray().Select(x => int.Parse(x.ToString())).Sum()) //  (1+n%10)*D(n/10)
         + (r * (r + 1)) / 2;                     //n%10*(1+n%10)/2
}

EDIT: According to my tests, it's running faster than both the original version (which was using recursion twice), and the version modified to calculate Solution(power - 1) in a single step.

PS: I'm not sure, but I guess that if I had splitted the first digit of the number instead of the last, maybe I could achieve a solution like the original algorithm.