Let's define eleven-non-free
numbers:
If we consider a number as a string, then if any substring inside is a (non-zero) power of 11
, then this number is an eleven-non-free
number.
For example, 1123
is an eleven-non-free
number as 11
inside is 11^1
. Also 12154
is one as 121
is 11^2
. But 12345
is not, because we can't find any non-zero power of 11
inside.
So given a k, find the kth eleven-non-free
number. For example the 13th such number is 211
.
I don't know how to efficiently do it. the brute-force way is to increase i from 1 and check every number and count until the kth.
I guess we should consider strings with different length (1, 2, 3, 4, ...). then for each length, we try to fill in 11, 11^2, 11^3, etc and try to get all the combinations.
But it seems quite complicated as well.
Anyone?
It's interesting how easily people get scared by the term "brute force" without even trying to quantify it :)
Brute force solution is actually O(R * log R)
where R is the result (
k
-th eleven-non-free number). You just test all numbers 1..R and for each of them you perform string searches for 11, 121, 1331, 14641 etc. using prepared automaton (for given number of digits).O(R * log R)
doesn't look that bad, if you look at it this way, does it? :-)Generation solution?
Clever idea is to try to generate the number more directly:
k
-th number;k
eleven-non-free numbers;Solution 1 would have to be very clever, if not close to impossible. Solution 2 seems better than brute force solution because it skips those numbers which are not eleven-non-free. Solution 3 would be an easily implemented alternative, which brings us to important question:
How many eleven-non-free numbers < 10n exist?
Easy combinatorics (works exactly for n <= 10; for higher n it is close approximation, neglecting that 112 is substring of 1113 and 11 is substring of 1111 and 1119):
so basicly for the limit 10n you get more than 10n-2 solutions! This means that number of solutions is a constant proportion of the limit, which means that
O(R) = O(k)
which implies that
The brute force solution is actually O(k * log k), as well as the generate-all solution!
The dreaded brute force solution kooks much much better now, doesn't it? Yet it's still the same :)
Can we get better than this?
O(k)
but you would have to be very careful to achieve this. There will be many complications that will make it uneasy to select next smallest eleven-non-free-number. E.g. when generating all 11xxx, 121xx, 1331x, eg. 13121 falls in between, making any automatic generation of ordered numbers difficult, let alone duplicities caused by pattern appearing in xxx digits etc. You will probably end up with some complicated data structure that will forceO(log k)
in each step, and thusO(k log k)
in total.O(k log k)
, which is the same as to findk
-th number.This sounds like a divide and conquer algorithm to me.
This problem feels very similar to: http://www.geeksforgeeks.org/divide-and-conquer-maximum-sum-subarray/
and this: http://en.wikipedia.org/wiki/Maximum_subarray_problem
When you divide you split the original String s into 2 parts that are roughly 1/2 as big. At the bottom of your recursion you have easy problems with small 1 char Strings that are all eleven-free (because they don't contain enough chars to have any powers of 11 inside).
The "trick" comes in when you "step up" in the recursion. In this step you need to take advantage of the fact that any "power" (i.e. 121) you are looking for MUST bridge the "middle" of the input string.
For example, when you "step up" from Strings of length 1 to Strings of length 2 you'll know that some of the power you're looking for is in the first half, and some is in the second half of the input string.
given a k, find the kth eleven-non-free number.
... is a big question.Since this is time taking question, I am writing pseudo code.
var eleven_powers = []; for (var i=0; i<1000; i++) eleven_powers.push(Math.pow(11,i));
non-eleven-free
numbers to an extent you can. This will have to deal with big integers.for (var i=11; i<Integer.MAX_VALUE; i++){ if(string(i).substring(for each of eleven_powers) == true) store it sequentially results_array }
Take k, return results_array[k]
Edit: This approach misses the eleven-non-free number 1011 for example. I know what's wrong and will fix it, but can't do it now.
Let's take a principled approach that builds up to the solution rather than trying to handle it all at once.
Let S be the sequence of powers of 11:
11, 121, 1331, 14641, 161051, ...
Let E(n) be the set of numbers that contain the string representation of n as a substring. The set you are looking for is the union of E(n) over n in S.
How do we generate the elements of E(n) for some n? (Let's take n = 121 for example.)
Make a directed acyclic graph where n is the root node. Make n point to all 19 of the following numbers:
Append 0-9: n0 n1 n2 n3 n4 n5 n6 n7 n8 n9
Prepend 1-9: 1n 2n 3n 4n 5n 6n 7n 8n 9n
Repeat the process for each of these new nodes. Notice that certain nodes may end up with more than one parent, so you'll want to maintain a hashmap from number to node pointer. (You can get to 71217 by 121 -> 1217 -> 71217 or 121 -> 7121 -> 71217.)
This graph has the property that, for any edge from u to v, we have u < v. This is important because it means we can generate the elements of E(n) in increasing order by doing a breadth-first generation of the graph, always expanding the unexpanded node with smallest number.
Now merge these graphs for all n in S.
You have one large directed acyclic graph (that is infinite) that you can generate in a breadth-first manner, always expanding the unexpanded node with smallest number, emitting the next (kth) eleven-non-free number.
Pseudocode
It works, I tested it. Here's a C# gist: https://gist.github.com/timothy-shields/8026924
To get the kth eleven-non-free number, simply do something like generate_eleven_non_free().element_at(k), where this is supposed to pick out the kth yielded number.
OK, this took several days to figure out. The first important thing to understand is that the only concrete requirement of the euler puzzle 442 that this question stems from is to solve for E(10^18). The unrelated examples of what an 11-free number means are just distractions.
The brute force option is out of the question, for 2 reasons.
It requires literally a quintilian string comparisons and would take a week to resolve on most home computers.
Leonhard Euler was a mathematician so the spirit of the question has to be interpreted in that context.
After a while, I started focusing on pattern matching for powers of 10, instead of continually including powers of 11 in my algorithms. After you calculate what the powers of 11 are, you're done with anything related to that. They only represent strings. They aren't even included in the final algorithm, they are just used in the detective work.
I started trying to get a detailed understanding of how new 11-containing numbers are introduced between powers of 10 and if there was a predictable pattern. If there was, then it would just be a matter of using that pattern to deduce all of the 11-containing numbers introduced prior to any 10^ stopping point.
Understanding the Main Concepts
For each new 10^ reached, the number of previous 11^ string matches are identical to the previous 10^ range. It's easier to explain with a chart.
Here is a list of 10^2 - 10^6 that shows the count of new 11-containing numbers introduced so far and grouped by the 11^ number that matched the string.
Fig 1.) Number of 11^ matches in 10^ ranges grouped by 11^ string match
There are two requirements for making this list.
When multiple 11-containing numbers are match sequentially (like 11000 - 11999) all of those items must be attributed to the 11^ group of the first match. Meaning that even though there will be matches for 121 and 1331 in that example, all matches in that range are attributed to 11, because it is first.
Matches are made based on the shortest possibility first. i.e.) 11, 121, 1331, etc... This is because some numbers match multiple 11^ strings while others appear inside of a contiguous range. And, matching them from high to low doesn't produce a predictable pattern. Functionally, it's all the same. This just brings a whole lot of clarity into the picture.
Notice that in each new 10^ range, only 1 new 11^ match is introduced. And, the number of matches for each previous 11^ number is identical, regardless of the previous power. The difficulty here is that the number of *11 string matches can't be directly inferred from the previous power.
Fig 2.) Differentiating other 11^ patterns from the *11 match
All of the "pattern" matches are highly predictable within 10^ powers. Only the *11 numbers are not obvious in how they accumulate. It's not really that there isn't a pattern, the problem is that you would have to presume numbers that were scanned right to left and, as a result, none if the other patterns would be predictable anymore.
As it turns out, a separate accumulator has to be tracked concurrently that allows us to use the previous *11 matches to calculate the number of new *11 matches. For any of you math geniuses out there, there may be a more elegant solution. But, here is mine.
You have to always be separately keeping track of the number of *11 matches from the previous 2 powers. Using these as operands, you can calculate the number of *11 matches for the current power.
Example:
The number of *11 matches in the 10^3 range is 8. There are also 10 matches on "110" but they don't matter here, we are just tracking numbers that end with "11". So, we are starting with the 2 previous *11 match counts being 8 and 0 from the 2 previous 10^ ranges.
Important: For any number less than 10^2, 0 is the previous match count for this calculation. That is why 0 is the second look-back operand when working with 10^3.
Fig 3.) How to calculate *11 match counts per 10^ range using back-tracking
This number however is only useful by looking at it, again, from a different perspective.
Fig 4.) Illustration of how match groups scale by powers of 10
In these examples, only the *11 number has to be calculated separately. Seeing these numbers stacked like this sort of makes it feel more complicated than it is. It's only important to understand the concept. In practice, we abstract out everything except the *11 calculation through cumulative multiplication.
Now, before we get into a working algorithm, the last thing to understand conceptually is how to use these patterns to calculate the Nth 11-free number ending at each ^10.
This is a two step process so I'll demonstrate using an example. Lets look at the 1000 - 10000 range from Fig 1. 10000 is 10^4 so that is the range we are solving for.
The following 11^ groups are all within that range. The 18 and the 1 can be derived from the previous powers (see Fig 1).
Calculate the total 11-containing matches in the current 10^4 range. To do this, we first have to calculate the value for the "11" match category. The 260 value above can be calculated using numbers from both the previous ranges and the separate *11 tracking operands like so:
Calculating *11 count for 10^4
Combine the result of 10^4 with all the other results back to 10^2. Note: The total matches for 10^2 is 1 because there is only one 11-containing number from 0 through 100.
A Working Algorithm
Here is an algorithm written in C# that solves for each of 10^1 through 10^18. It returns almost instantly. Out of respect for project euler I didn't post the answer to their puzzle directly. The output is accurate though. Right now the #442 question only shows 163 people have solved the problem compared to 336260 people solving problem #1. If that number starts growing for no reason, I will delete this post.
Usage:
Here are the first 15.
As good starting point is to consider the related problem of counting how many eleven-non-free (ENF) integers there are of N digits. This is not exactly the same thing as finding the n'th ENF integer, but the latter problem can be reduced to the former pretty easily (see the end of this post for java code that does this).
The idea is to do dynamic programming on prefixes. For any string of digits
s
, and any integerk
, letN(k,s)
denote the number of ENF strings of lengthk
which start withs
. And, for any strings
, lets_0
be the longest suffix ofs
that also occurs as a prefix of any power of eleven (POE) of lengthk
or less. Then, assuming thats
itself does not contain any POE substrings, we have the equation:The reasoning behind this equation is as follows. Since
s_0
is a suffix ofs
, let us writes
as a concatenation of some other stringq
ands_0
:s=[q,s_0]
. Now, letr
be any digit string which starts withs
and again let us write this as a concatenation, asr=[s,r_0] = [q,s_0,r_0]
.If
r
contains a POE as a substring, then either this POE is contained inr_0
, or else it crosses over the boundary betweens
andr_0
. In this case,s
must end in a POE prefix, and, since we know that the longest POE prefix that occurs as a suffix ofs
iss_0
, it follows that ifr
contains a POE substring, then this substring must be contained in[s_0,r_0]
. This gives us a one-to-one correspondence between ENFs that begin withs=[q,s_0]
and ENFs that begin withs_0
.This above equation gives rise to a recursive algorithm to count the number of k-digit ENFs with a given prefix. The base cases end up being instances where
length(s_0)=length(s)
which means thats
itself is a prefix of a POE of length k or less. Since there aren't very many such POE prefixes (at most k choose 2 which is O(k^2)), this formula leads to an efficient algorithm. Here is pseudocode:Using dynamic programming, the running time of this algorithm will be polynomial in k, the number of digits. The algorithm only recurses on POE prefixes, and since there are O(k^2) of these, the running time will be O(k^2) times the running time of each iteration. Using a naive O(k^2) algorithm to find the longest suffix that matches a POE prefix would thus lead to an O(k^4) algorithm, while using a radix tree to find matches in linear time would result in O(k^3). The java code given below uses the naive algorithm, and experimentally, for values up to around k=100 seems to be Θ(k^4.5), though this discrepancy could well be due to implementation details or constant factors affecting runtimes for small input sizes. Here is the code:
As mentioned above, this algorithm does not solve the exact problem in the OP. But, modifying this code to actually find the n'th ENF number is pretty simple. Making repeated callse to
countENF
, we first figure out how many digits the n'th ENF number has, and then, one digit at a time, we figure out the digits of the n'th ENF from left to right.Here is some sample output, giving the N'th ENF as computed using dynamic programming, as well as using brute force, for powers of 10.