Let's take a 5x5 tic-tac-toe as an example. Let's say it's my AI's turn. Then,
- I make 25 moves (at each cell basically, of course, if it's a legal move),
- create a thread for each move (25 threads total (at most)),
- call a minimax function on each made move,
- then when all results come from each thread,
- compare the scores and choose the move with the best score.
Here are my questions:
Is it efficient to use 25 threads? What does using 25 threads mean?
Is it 25 times faster (most likely not)? What it depends on? On the computer, of course, but how can I know how many threads are okay to use based on the computer's resources?
- What happens if I use too many threads (nothing I guess...)?
Is my idea good? Thanks.
Consider the symmetry of your problem. There are actually only a very limited number of "unique" initial moves - the rest are the same but for reflection or rotation (therefore of identical strategic value). The unique moves for a 5x5 board are:
Or just 6 initial moves. Bam - you just reduced the complexity by >4x with no threads.
As others said, more threads than you have cores usually doesn't help in speeding up unless individual threads spend time "waiting" - for inputs, memory access, other results. It might be that six threads would be a good place to start.
Just to convince you about the symmetry, I am marking equivalent positions with the same number- see if you agree
This is the same when you rotate by any multiple of 90 degrees, or reflect about diagonal or left-right, up-down.
PS I realize this is not really answering the question you asked, but I believe it very directly addresses your underlying question - "how do I efficiently solve 5x5 tic-tac-toe exhaustively". As such I won't be upset if you select a different answer but I do hope you will take my advice to heart.
For a typical compute-bound application, a good rule of thumb is to use as many threads as you have hardware cores (or hyperthreads). Using more threads than cores won't make your application go faster. Instead, it will cause your application to use more memory than is necessary. Each thread typically has a 0.5 to 1Mbyte stack ... depending on your hardware and the Java version. If you create too many threads, the extra memory usage will result in a significant performance hit; i.e. more threads => slower program!
Another thing to consider is that Java threads are expensive to create on a typical JVM. So unless a thread does enough work (in its lifetime) there is a risk that you spend more time creating threads than you gain by using multiple cores in the computation.
Finally, you may find that the work does not spread evenly over all threads, depending on your minmax algorithm ... and the game state.
If I was trying to implement this, I'd start by implementing it as a single threaded application, and then:
If and only if it needs to go faster, I would then examine the code and (if necessary) add some monitoring to see how to break the computation down into large enough chunks to be executed in parallel.
Finally, I'd use those results to design and implement a multi-threaded version.
I'd also look at alternatives ... like using Java 7 fork/join instead of threads.
To answer your direct questions:
Probably not. It would only be efficient if you had that many cores (unlikely!). And even then you are only going to get a good speedup from using lots of threads if you gain more by running things in parallel than you lose due to thread-related overheads. (In other words, it depends how effectively you use those threads.)
I assume that you mean that you have created and started 25 Threads, either explicitly or using some existing thread pool implementation.
But the bottom line is that if you have (say) 4 cores, then at most 4 of those 25 threads can be executing at one time. The other threads will be waiting ...
The primary factor that limits performance is the number of cores. See above.
Too many threads means you use more memory and that makes your application run slower because of memory bandwidth competition, competition for physical memory pages, extra garbage collection. These factors are application and platform dependent, and hard to quantify; i.e. predict or measure.
Depending on the nature of your application (i.e. precisely how you implement the algorithms) too many threads could result in extra lock contention and thread context switching. That will also make your application slower.
It is a impossible to predict what would happen without seeing your actual code. But the number of cores gives you a theoretical upper bound on how much speedup is possible. If you have 4 cores, then you cannot get more than a 4-fold speedup with multi-threading.
You should only use a number of threads equal to the number of cores the machine has. Scheduling tasks onto those threads is a different thing.
So, the threading answers given are ok, but it seemed to me they overlooked the alpha-beta pruning feature of minimax search.
If you launch a thread for each "next move" from your current position, then having those thread talk to each other is slow and painful to write correctly. But, if they cant talk to each other, then you don't get the depth boosting that comes from alpha-beta pruning, until one level further down.
This will act against the efficiency of the result.
For general cases of improve computation time, the best case tends to be 1 thread per core, with either a simple assignment of tasks to thread if they are all similar time (eg matrix multiplication), or having a "set" of tasks, with each thread grabbing the next un-started one whenever it finishes its current task. (this has some locking tasks, but if they are small compared to resolution cost it is very effective).
So, for a 4 core system, and ~25 natural tasks you can hope for a speed up in the range of 3.5-4x. (you would do 4 in parallel ~5 times, then finish messily). But, in the minimax case you have lost the alpha-beta pruning aspect, which I understand is estimated to reduce "effective breadth" from N to about sqrt(N). For ~25 case, that means a effective branching factor of 5. this means using 4 cores and skipping pruning for the first level might actually hurt you.
So, where does the leave us?
As all my friends said that use as many threads as your machine has capacity.
but by adding them to you should go with improving algorithm as well.
for example in 5x5 tic tac toe both will get 12 or 13 moves. so number of posible moves are as nCr(combination equation) base 25C12 = 5,200,300. so now you have decrease number of thread now going to best selection you have best way to find best solution are only 12 (wining position) & 12 to lose worst condition all other are draw condition. so now what you can do is simply check those 12 condition from threads & leave extra combination with calculation that you need to create 12! * 12 no of threads which is very low compare to 25!.
hence your number of thread is going to decrease you can further think on it to decrease your number of thread.
when as your moves goes more & more you can go with alpha-beta pruning so that you can improve your algorithm as well.
If you are using Threads then to prevent memory wastage just use them for first calls of mini-max and then combine the result of the thread to get the output. It is a wastage if you use 25 threads or something number so big because the available cores are way less than that so what you can do is schedule only no of threads equivalent to available cores at a time on different states and combine all the results at the end.
Here is pseudo code:-