I have got this problem:
- Find the first element in a list, for which a given condition holds.
Unfortunately, the list is quite long (100.000 elements), and evaluation the condition for each element takes in total about 30 seconds using one single Thread.
Is there a way to cleanly parallelize this problem? I have looked through all the tbb patterns, but could not find any fitting.
UPDATE: for performance reason, I want to stop as early as possible when an item is found and stop processing the rest of the list. That's why I believe I cannot use parallel_while
or parallel_do
.
I'm not too familiar with libraries for this, but just thinking aloud, could you not have a group of threads iterating at different at the same stride from different staring points?
Say you decide to have
n
threads (= number of cores or whatever), each thread should be given a specific starting point up ton
, so the first thread starts onbegin()
, the next item it compares isbegin() + n
, etc. etc. second thread starts onbegin()+1
and then it's next comparison is inn
too etc.This way you can have a group of threads iterating in parallel through the list, the iteration itself is presumably not expensive - just the comparison. No node will be compared more than once and you can have some condition which is set when a match is made by any of the threads and all should check this condition before iterating/comparing..
I think it's pretty straightforward to implement(?)
If it's a linked list, A parallel search isn't going to add much speed. However, linked lists tend to perform poorly with caches. You may get a tiny performance increase if you have two threads: one does the find_first_element, and one simply iterates through the list, making sure not to get more than X (100?) ahead of the first thread. The second thread doesn't do any comparisons, but will assure that the items are cached as well as possible for the first thread. This may help your time, or it might make little difference, or it might hinder. Test everything.
If you are using GCC, GNU OpenMP provides parallel std functions link
you can take a look at http://gcc.gnu.org/onlinedocs/libstdc++/manual/parallel_mode.html for parallel algorithms implementations. And in particular you need find_if algorithm http://www.cplusplus.com/reference/algorithm/find_if/
Can't you transform the list to a balanced tree or similar? Such data structures are easier to process in parallel - usually you get back the overhead you may have paid in making it balanced in the first time... For example, if you write functional-style code, check this paper: Balanced trees inhabiting functional parallel programming
I see two opportunities for parallelism here: evaluating one element on multiple threads, or evaluating multiple elements at once on different threads.
There isn't enough information to determine the difficulty nor the effectiveness of evaluating one element on multiple threads. If this is easy, the 30 second per element time could be reduced.
I do not see a clean fit into TBB for this problem. There are issues with lists not having random access iterators, determining when to stop, and guaranteeing the first element is found. There may be some games you can play with the ranges to get it to work though.
You could use some lower level thread constructs to implement this yourself as well, but there are a number of places for incorrect results to be returned. To prevent such errors, I would recommend using an existing algorithm. You could convert the list to an array (or some other structure with random access iterators) and use the experimental libstdc++ Parellel Mode find_if algorithm user383522 referenced.