Can I have two multithreaded functions running at the same time?

Published 2019-08-09 12:38

I'm very new to multi-threading. I have 2 functions in my Python script. One function, enqueue_tasks, iterates through a large list of small items and performs a task on each item, which involves appending an item to a list (let's call it master_list). I have already multi-threaded this part using concurrent.futures:

import concurrent.futures

executor = concurrent.futures.ThreadPoolExecutor(15)  # 15 worker threads, chosen arbitrarily
futures = [executor.submit(enqueue_tasks, group) for group in grouper(key_list, 50)]  # one task per chunk of 50 keys
concurrent.futures.wait(futures)  # block until every chunk has been processed
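
For context, here is a minimal sketch of how grouper and enqueue_tasks might be wired up, assuming grouper is the usual itertools chunking recipe; do_task is a placeholder for the real per-item work:

from itertools import zip_longest

master_list = []

def grouper(iterable, n):
    # Standard itertools recipe: yield the iterable in chunks of n, padded with None
    args = [iter(iterable)] * n
    return zip_longest(*args)

def do_task(key):
    return key  # placeholder for the real per-item work

def enqueue_tasks(group):
    # Work through one chunk of keys and append each result to the shared list
    for key in group:
        if key is None:  # skip zip_longest padding
            continue
        master_list.append(do_task(key))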

I have another function, process_master, that iterates through master_list, checks the status of each item, and then does some operation on it.

Can I use the same approach to multi-thread process_master? Furthermore, can it run at the same time as enqueue_tasks, and what are the implications of doing so? process_master depends on the list built by enqueue_tasks, so will running them concurrently be a problem? Is there a way to delay the second function a bit (using time.sleep, perhaps)?

1 Answer
太酷不给撩 · 2019-08-09 13:11

No, this isn't safe. If enqueue_tasks and process_master run at the same time, enqueue_tasks could be appending items to master_list while process_master is iterating over it. Mutating a list while iterating over it leads to unpredictable behavior in Python and should always be avoided. You should use a threading.Lock to protect both the code that appends to master_list and the code that iterates over it, so the two never run at the same time.
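
A minimal sketch of that lock-based approach, assuming do_task and check_status stand in for the real per-item work and the real status check:

import threading

master_list = []
master_lock = threading.Lock()

def do_task(key):
    return key  # stand-in for the real per-item work

def check_status(item):
    print("checked", item)  # stand-in for the real status check / operation

def enqueue_tasks(group):
    for key in group:
        result = do_task(key)
        with master_lock:  # hold the lock only while mutating the shared list
            master_list.append(result)

def process_master():
    with master_lock:  # copy under the lock so iteration never races with appends
        snapshot = list(master_list)
    for item in snapshot:
        check_status(item)

Copying the list under the lock keeps the critical section short; the alternative is to hold the lock for the entire iteration, which blocks the producers for longer.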

Better yet, use a Queue.Queue (queue.Queue in Python 3.x) instead of a list, which is a thread-safe data structure. Add items to the Queue in enqueue_tasks, and get items from the Queue in process_master. That way process_master can safely run at the same time as enqueue_tasks.
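
A minimal producer/consumer sketch of that approach, using a sentinel object to tell process_master when the producers are finished (the helper names and the input data are hypothetical):

import concurrent.futures
import queue
from itertools import zip_longest

def grouper(iterable, n):
    # itertools recipe: chunk the iterable into groups of n, padded with None
    args = [iter(iterable)] * n
    return zip_longest(*args)

def do_work(key):
    return key  # stand-in for the real per-item task

def handle(item):
    print("processed", item)  # stand-in for the status check / operation

key_list = list(range(200))  # hypothetical input
task_queue = queue.Queue()
SENTINEL = object()  # marks the end of the stream

def enqueue_tasks(group):
    for key in group:
        if key is None:  # skip zip_longest padding
            continue
        task_queue.put(do_work(key))

def process_master():
    while True:
        item = task_queue.get()
        if item is SENTINEL:
            break
        handle(item)

with concurrent.futures.ThreadPoolExecutor(15) as executor:
    consumer = executor.submit(process_master)
    producers = [executor.submit(enqueue_tasks, g) for g in grouper(key_list, 50)]
    concurrent.futures.wait(producers)
    task_queue.put(SENTINEL)  # no more items are coming
    consumer.result()

The Queue handles the locking internally, so neither function needs to know the other is running.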
