This question already has an answer here:
-
Sum of Digits till a number which is given as input
2 answers
Problem:
Find the sum of the digits of all the numbers from 1 to N (both ends included)
Time Complexity should be O(logN)
For N = 10 the sum is 1+2+3+4+5+6+7+8+9+(1+0) = 46
For N = 11 the sum is 1+2+3+4+5+6+7+8+9+(1+0)+(1+1) = 48
For N = 12 the sum is 1+2+3+4+5+6+7+8+9+(1+0)+(1+1) +(1+2)= 51
This recursive solution works like a charm, but I'd like to understand the rationale for reaching such a solution. I believe it's based on finite induction, but can someone show exactly how to solve this problem?
I've pasted (with minor modifications) the aforementioned solution:
static long Solution(long n)
{
if (n <= 0)
return 0;
if (n < 10)
return (n * (n + 1)) / 2; // sum of arithmetic progression
long x = long.Parse(n.ToString().Substring(0, 1)); // first digit
long y = long.Parse(n.ToString().Substring(1)); // remaining digits
int power = (int)Math.Pow(10, n.ToString().Length - 1);
// how to reach this recursive solution?
return (power * Solution(x - 1))
+ (x * (y + 1))
+ (x * Solution(power - 1))
+ Solution(y);
}
Unit test (which is NOT O(logN)):
long count = 0;
for (int i=1; i<=N; i++)
{
foreach (var c in i.ToString().ToCharArray())
count += int.Parse(c.ToString());
}
Or:
Enumerable.Range(1, N).SelectMany(
n => n.ToString().ToCharArray().Select(
c => int.Parse(c.ToString())
)
).Sum();
This is actually a O(n^log10(2))
-time solution (log10(2)
is approximately 0.3
). Not sure if that matters. We have n = xy
, where I use concatenation to denote concatenation, not multiplication. Here are the four key lines with commentary underneath.
return (power * Solution(x - 1))
This counts the contribution of the x
place for the numbers from 1
inclusive to x*power
exclusive. This recursive call doesn't contribute to the complexity because it returns in constant time.
+ (x * (y + 1))
This counts the contribution of the x
place for the numbers from x*power
inclusive to n
inclusive.
+ (x * Solution(power - 1))
This counts the contribution of the lower-order places for the numbers from 1
inclusive to x*power
exclusive. This call is on a number one digit shorter than n
.
+ Solution(y);
This counts the contribution of the lower-order places for the numbers from x*power
inclusive to n
inclusive. This call is on a number one digit shorter than n
.
We get the time bound from applying Case 1 of the Master Theorem. To get the running time down to O(log n)
, we can compute Solution(power - 1)
analytically. I don't remember offhand what the closed form is.
After thinking for a while (and finding similar answers), I guess I could achieve the rationale that gave me another solution.
Definitions
Let S(n) be the sum of the digits of all numbers 0 <= k < n.
Let D(k) be the plain digits sum of k only.
(I'll omit parentheses for >clarity, so consider Dx = D(x)
If n>=10, let's decompose n by spliting the last digit and the tens (n = 10*k + r) (k, r being integers)
We need to sum S(n) = S(10*k + r) = S(10*k) + D(10*k+1) + ... + D(10*k+r)
The first part, S(10*k), follows a pattern:
S(10*1)=D1+D2+D3+...+D9 =(1+2+3+...+9) *1 + D10
S(10*2)=D1+D2+D3+...+D19 =(1+2+3+...+9) *2 +1*9 +D10 + D20
S(10*3)=D1+D2+D3+...+D29 =(1+2+3+...+9) *3 +1*9+2*9 +D10+...+D20 + D30
So S(10*k) = (1+2+3+...+9)*k + 9*S(k-1) + S(k-1) + D(10*k) = 45*k + 10*S(k-1) + D(10*k)
Regarding the last part, we know that D(10*k+x) = D(10*k)+D(x) = D(k)+x, so this last part can be simplified:
D(10*k+1) + ... + D(10*k+r) = D(k)+1 + D(k)+2 + ... D(k)+r = rD(k) + (1+2+...+r) = rD(k) + r*(1+r)/2
So, adding both parts of the equation (and grouping D(k)) we have:
S(n) = 45*k + 10*S(k-1) + (1+r)D(k) + r*(1+r)/2
And replacing k and r we have:
S(n) = 45*k + 10*S((n/10)-1) + (1+n%10)D(n/10) + n%10(1+n%10)/2
Pseudocode:
S(n):
if n=0, sum=0
if n<10, n*(1+n)/2
r=n%10 # let's decompose n = 10*k + r (being k, r integers).
k=n/10
return 45*k + 10*S((n/10)-1) + (1+n%10)*D(n/10) + n%10*(1+n%10)/2
D(n):
just sum digits
First algorithm (the one from the original question) in C#
static BigInteger Solution(BigInteger n)
{
if (n <= 0)
return 0;
if (n < 10)
return (n * (n + 1)) / 2; // sum of arithmetic progression
long x = long.Parse(n.ToString().Substring(0, 1)); // first digit
long y = long.Parse(n.ToString().Substring(1)); // remaining digits
BigInteger power = BigInteger.Pow(10, n.ToString().Length - 1);
var log = Math.Round(BigInteger.Log10(power)); // BigInteger.Log10 can give rounding errors like 2.99999
return (power * Solution(x - 1)) //This counts the contribution of the x place for the numbers from 1 inclusive to x*power exclusive. This recursive call doesn't contribute to the complexity because it returns in constant time.
+ (x * (y + 1)) //This counts the contribution of the x place for the numbers from x*power inclusive to n inclusive.
//+ (x * Solution(power - 1)) // This counts the contribution of the lower-order places for the numbers from 1 inclusive to x*power exclusive. This call is on a number one digit shorter than n.
+ (x * 45*new BigInteger(log)* BigInteger.Pow(10,(int)log-1)) //
+ Solution(y);
}
Second algorithm (deduced from formula above) in C#
static BigInteger Solution2(BigInteger n)
{
if (n <= 0)
return 0;
if (n < 10)
return (n * (n + 1)) / 2; // sum of arithmetic progression
BigInteger r = BigInteger.ModPow(n, 1, 10); // decompose n = 10*k + r
BigInteger k = BigInteger.Divide(n, 10);
return 45 * k
+ 10*Solution2(k-1) // 10*S((n/10)-1)
+ (1+r) * (k.ToString().ToCharArray().Select(x => int.Parse(x.ToString())).Sum()) // (1+n%10)*D(n/10)
+ (r * (r + 1)) / 2; //n%10*(1+n%10)/2
}
EDIT: According to my tests, it's running faster than both the original version (which was using recursion twice), and the version modified to calculate Solution(power - 1) in a single step.
PS: I'm not sure, but I guess that if I had splitted the first digit of the number instead of the last, maybe I could achieve a solution like the original algorithm.