I've searched a lot for this, but was unable to find something I could grasp my head around. What I'm searching for is how to make this algorithm parallel. It doesn't matter the way its parallel, such as multithreaded or multiprocessing or even distributed computing, but how exactly would I split up the work between nodes or cores?
def bruteforce(charset, maxlength):
return (''.join(candidate)
for candidate in itertools.chain.from_iterable(itertools.product(charset, repeat=i)
for i in range(1, maxlength + 1)))
First thing you have to do is get disenchanted with 1-liner disease ;-) That is, parallel processing adds many complexities of its own, so you want to keep your code as simple and transparent as possible. Here's a way that generalizes @BasSwinckels's suggestion. It's not short! But it's very effective: it will nail your CPU meter to the wall no matter how many cores you have.
There are multiple pieces to this because what you want is "lumpy": strings of a variety of sizes. Parallel gimmicks are usually easier the more utterly uniform the structure of the problem. But lumpiness can be handled - it just takes more code.
Do note the
assert
statements, and the "useless" computations ofnum_done
. Parallel code adds brand new dimensions of complexity, so do yourself a favor and code defensively from the start. You're going to try many things that simply won't work - that happens to everyone.Note too that breaking free of 1-liner disease also gives a more efficient approach, even without multiple cores: computing and joining the
prefix
just once for the longer strings will save countless billions of redundant joins over the course of longer runs.Have fun :-)
If I understand your generator expression correctly, you are trying to generate all the words with a length from 1 ..
maxlength
given a certain alphabet.I don't know exactly how you want to split your problem (do you have
N
workers?), but one obvious way would be to split the list of words just by the first letter, hand those out to the various parallel workers, which then all have to append all possible combinations of words with 0 ..maxlength - 1
letters from the alphabet.