Python parallel execution with Selenium

Posted 2019-05-06 10:26

I'm confused about parallel execution in Python using Selenium. There seem to be a few ways to go about it, but some look out of date.

What is the current way to do parallel execution with Selenium?

There's a Python module called python-wd-parallel that seems to offer this, but it's from 2013. Is it still useful?

e.g. https://saucelabs.com/blog/parallel-testing-with-python-and-selenium-on-sauce-online-workshop-recap

There is also concurrent.futures, which seems a lot newer but not as easy to implement. Does anyone have a working example of parallel execution with Selenium?

There's also plain threading with executors, but I suspect this will be slower, because it doesn't use all the cores and still runs serially.
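For reference, a minimal sketch of what the concurrent.futures route might look like. The names fetch_title and run_in_parallel are hypothetical, and headless Chrome with a chromedriver available on the machine is an assumption. Each WebDriver instance drives a separate browser process, so the worker threads spend most of their time waiting on the browser rather than competing for the interpreter.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_title(url):
    """Open one browser, load the URL, and return the page title."""
    # Imported lazily so this module loads even where Selenium is not installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_argument("--headless")
    # Assumption: Chrome and a matching chromedriver are available.
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        return driver.title
    finally:
        driver.quit()  # always release the browser process


def run_in_parallel(urls, worker=fetch_title, max_workers=4):
    """Apply worker to every URL on a thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, urls))
```

Calling run_in_parallel(["https://example.com", "https://example.org"]) would start one browser per URL, at most max_workers at a time.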

2 answers
戒情不戒烟
Reply #2 · 2019-05-06 10:45

Use joblib's Parallel class for this; it's a great library for parallel execution.

Let's say we have a list of URLs named urls and we want to take a screenshot of each one in parallel.

First, let's import the necessary libraries:

from selenium import webdriver
from joblib import Parallel, delayed

Now let's define a function that returns a screenshot as a base64 string:

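def take_screenshot(url):
    # Note: PhantomJS support is deprecated in later Selenium 3 releases
    # and removed in Selenium 4.
    phantom = webdriver.PhantomJS('/path/to/phantomjs')
    try:
        phantom.get(url)
        return phantom.get_screenshot_as_base64()
    finally:
        # quit() shuts down the driver process; close() only closes the window.
        phantom.quit()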

To run it in parallel:

screenshots = Parallel(n_jobs=-1)(delayed(take_screenshot)(url) for url in urls)

When this line finishes executing, screenshots holds the results from all of the worker processes, in the same order as urls.
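Since each entry in screenshots is a base64 string, one way to turn the results into image files could look like this. The helper name save_screenshots and the file naming scheme are hypothetical, chosen only for illustration.

```python
import base64
from pathlib import Path


def save_screenshots(screenshots, out_dir):
    """Decode base64 screenshot strings and write them as numbered PNG files."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, encoded in enumerate(screenshots):
        path = out / f"screenshot_{index}.png"
        # get_screenshot_as_base64() yields base64 text; decode it back to bytes.
        path.write_bytes(base64.b64decode(encoded))
        paths.append(path)
    return paths
```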

Explanation about Parallel

  • Parallel(n_jobs=-1) means use all available CPU cores
  • delayed(function)(input) captures the function and its arguments without calling it, so Parallel can dispatch the call to a worker

More information can be found in the joblib docs.
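A toy sketch of the two points above, using a hypothetical square function in place of the screenshot function. It assumes joblib is installed and uses prefer="threads" only to keep the demo lightweight; the call in this answer uses joblib's default process-based backend.

```python
from joblib import Parallel, delayed


def square(n):
    return n * n


# delayed(square)(3) does not call square; it packages the call as a
# (function, args, kwargs) tuple for Parallel to execute later.
func, args, kwargs = delayed(square)(3)

# Parallel consumes the generator of packaged calls and returns the
# results as a list in input order.
results = Parallel(n_jobs=2, prefer="threads")(
    delayed(square)(n) for n in range(5)
)
```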

做个烂人
Reply #3 · 2019-05-06 10:45

I created a project to do this and it reuses webdriver instances for better performance:

https://github.com/testlabauto/local_selenium_pool

https://pypi.org/project/local-selenium-pool/
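The general idea of reusing webdriver instances can be sketched as follows. This is not the local_selenium_pool API; DriverPool, borrow, and the factory argument are hypothetical names for a minimal queue-based pool built on the standard library.

```python
import queue
from contextlib import contextmanager


class DriverPool:
    """Hypothetical fixed-size pool that hands out reusable driver objects."""

    def __init__(self, factory, size):
        # factory is any zero-argument callable that returns a new driver.
        self._drivers = queue.Queue()
        for _ in range(size):
            self._drivers.put(factory())

    @contextmanager
    def borrow(self):
        driver = self._drivers.get()  # blocks until a driver is free
        try:
            yield driver
        finally:
            self._drivers.put(driver)  # hand it back for the next task

    def close(self):
        # Shut down every driver currently idle in the pool.
        while not self._drivers.empty():
            self._drivers.get().quit()
```

With Selenium, the factory could be something like lambda: webdriver.Chrome(), and each task would wrap its work in with pool.borrow() as driver: so that browsers are started once and shared across tasks.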
