This question already has an answer here:
Problem: Find the sum of the digits of all the numbers from 1 to N (both ends included)
Time Complexity should be O(logN)
For N = 10 the sum is 1+2+3+4+5+6+7+8+9+(1+0) = 46
For N = 11 the sum is 1+2+3+4+5+6+7+8+9+(1+0)+(1+1) = 48
For N = 12 the sum is 1+2+3+4+5+6+7+8+9+(1+0)+(1+1) +(1+2)= 51
This recursive solution works like a charm, but I'd like to understand the rationale for reaching such a solution. I believe it's based on finite induction, but can someone show exactly how to solve this problem?
I've pasted (with minor modifications) the aforementioned solution:
static long Solution(long n)
{
if (n <= 0)
return 0;
if (n < 10)
return (n * (n + 1)) / 2; // sum of arithmetic progression
long x = long.Parse(n.ToString().Substring(0, 1)); // first digit
long y = long.Parse(n.ToString().Substring(1)); // remaining digits
int power = (int)Math.Pow(10, n.ToString().Length - 1);
// how to reach this recursive solution?
return (power * Solution(x - 1))
+ (x * (y + 1))
+ (x * Solution(power - 1))
+ Solution(y);
}
Unit test (which is NOT O(logN)):
long count = 0;
for (int i=1; i<=N; i++)
{
foreach (var c in i.ToString().ToCharArray())
count += int.Parse(c.ToString());
}
Or:
Enumerable.Range(1, N).SelectMany(
n => n.ToString().ToCharArray().Select(
c => int.Parse(c.ToString())
)
).Sum();
After thinking for a while (and finding similar answers), I guess I could achieve the rationale that gave me another solution.
Definitions
Let S(n) be the sum of the digits of all numbers 0 <= k < n.
Let D(k) be the plain digits sum of k only.
(I'll omit parentheses for >clarity, so consider Dx = D(x)
If n>=10, let's decompose n by spliting the last digit and the tens (n = 10*k + r) (k, r being integers)
We need to sum S(n) = S(10*k + r) = S(10*k) + D(10*k+1) + ... + D(10*k+r)
The first part, S(10*k), follows a pattern:
S(10*1)=D1+D2+D3+...+D9 =(1+2+3+...+9) *1 + D10
S(10*2)=D1+D2+D3+...+D19 =(1+2+3+...+9) *2 +1*9 +D10 + D20
S(10*3)=D1+D2+D3+...+D29 =(1+2+3+...+9) *3 +1*9+2*9 +D10+...+D20 + D30
So S(10*k) = (1+2+3+...+9)*k + 9*S(k-1) + S(k-1) + D(10*k) = 45*k + 10*S(k-1) + D(10*k)
Regarding the last part, we know that D(10*k+x) = D(10*k)+D(x) = D(k)+x, so this last part can be simplified:
D(10*k+1) + ... + D(10*k+r) = D(k)+1 + D(k)+2 + ... D(k)+r = rD(k) + (1+2+...+r) = rD(k) + r*(1+r)/2
So, adding both parts of the equation (and grouping D(k)) we have:
S(n) = 45*k + 10*S(k-1) + (1+r)D(k) + r*(1+r)/2
And replacing k and r we have:
S(n) = 45*k + 10*S((n/10)-1) + (1+n%10)D(n/10) + n%10(1+n%10)/2
Pseudocode:
First algorithm (the one from the original question) in C#
Second algorithm (deduced from formula above) in C#
EDIT: According to my tests, it's running faster than both the original version (which was using recursion twice), and the version modified to calculate Solution(power - 1) in a single step.
PS: I'm not sure, but I guess that if I had splitted the first digit of the number instead of the last, maybe I could achieve a solution like the original algorithm.
This is actually a
O(n^log10(2))
-time solution (log10(2)
is approximately0.3
). Not sure if that matters. We haven = xy
, where I use concatenation to denote concatenation, not multiplication. Here are the four key lines with commentary underneath.This counts the contribution of the
x
place for the numbers from1
inclusive tox*power
exclusive. This recursive call doesn't contribute to the complexity because it returns in constant time.This counts the contribution of the
x
place for the numbers fromx*power
inclusive ton
inclusive.This counts the contribution of the lower-order places for the numbers from
1
inclusive tox*power
exclusive. This call is on a number one digit shorter thann
.This counts the contribution of the lower-order places for the numbers from
x*power
inclusive ton
inclusive. This call is on a number one digit shorter thann
.We get the time bound from applying Case 1 of the Master Theorem. To get the running time down to
O(log n)
, we can computeSolution(power - 1)
analytically. I don't remember offhand what the closed form is.