Multiprocessing and Selenium Python

Posted 2019-06-17 10:14

I have 3 drivers (Firefox browsers) and I want each of them to do something on a list of websites.

I have a worker defined as:

def worker(browser, queue):
    while True:
        id_ = queue.get(True)
        obj = ReviewID(id_)
        obj.search(browser)
        if obj.exists(browser):
            print(obj.get_url(browser))
        else:
            print("Nothing")

So the worker just accesses a queue that contains the ids and uses the browser to do something.

I want a pool of workers, so that as soon as a worker has finished using its browser on the website defined by id_, it immediately starts working with the same browser on the next id_ found in the queue. So far I have this:

from multiprocessing import Manager, Pool

pool = Pool(processes=3)  # I want to have 3 drivers
manager = Manager()
queue = manager.Queue()
# Define here my workers in the pool
for id_ in ids:
    queue.put(id_)
for i in range(3):
    queue.put(None)

This is where I am stuck: I don't know how to define my workers so that they are in the pool. I need to assign one worker to each driver, and all the workers must share the same queue of ids. Is this possible? How can I do it?

Another idea that I have is to create a queue of browsers so that if a driver is doing nothing, it is taken by a worker, along with an id_ from the queue in order to perform a new process. But I'm completely new to multiprocessing and actually don't know how to write this.

I appreciate your help.

1 answer
骚年 ilove
#2 · 2019-06-17 10:54

You could try instantiating the browser in the worker:

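from selenium import webdriver

def worker(queue):
    # Each worker process creates and owns its own browser.
    browser = webdriver.Chrome()
    try:
        while True:
            id_ = queue.get(True)  # block until an id is available
            if id_ is None:  # sentinel from the producer: no more ids
                break
            obj = ReviewID(id_)
            obj.search(browser)
            if obj.exists(browser):
                print(obj.get_url(browser))
            else:
                print("Nothing")
    finally:
        browser.quit()  # always close the browser, even after an error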
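
To start the workers, one option is to skip Pool and launch one multiprocessing.Process per browser, all reading from the same queue. The following is only a rough sketch under some assumptions: run_workers, make_browser and handle are hypothetical names that are not in the question. make_browser stands for whatever creates the driver (for example webdriver.Firefox), and handle(browser, id_) stands for the ReviewID logic shown above.

```python
import multiprocessing as mp


def worker(queue, make_browser, handle):
    """Own one browser for the life of the process and drain the shared queue."""
    browser = make_browser()  # hypothetical factory, e.g. webdriver.Firefox
    try:
        while True:
            id_ = queue.get()  # blocks until an item is available
            if id_ is None:  # sentinel: no more work for this worker
                break
            handle(browser, id_)  # hypothetical callback doing the per-id work
    finally:
        browser.quit()  # always close the browser


def run_workers(ids, make_browser, handle, n_workers=3):
    """Start n_workers processes that all share one queue of ids."""
    queue = mp.Queue()
    procs = [
        mp.Process(target=worker, args=(queue, make_browser, handle))
        for _ in range(n_workers)
    ]
    for p in procs:
        p.start()
    for id_ in ids:
        queue.put(id_)
    for _ in procs:
        queue.put(None)  # one sentinel per worker so every loop terminates
    for p in procs:
        p.join()
```

You would call it as something like run_workers(ids, webdriver.Firefox, review), where review is a module-level function wrapping the ReviewID calls. On platforms that spawn rather than fork new processes (Windows, macOS), that call needs to sit under an if __name__ == "__main__": guard, and make_browser and handle have to be picklable, so lambdas will not work there.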