Prevent external content from being loaded in Selenium

Posted 2019-01-18 01:03

Question:

The question:

Is it possible to tell a browser that is controlled by Selenium WebDriver not to load any content from external sources or, alternatively, not to load resources from a given list of domains?

Background:

I have a web page that I test with a Java-based Selenium WebDriver script. I can't change the page; I only write the tests. The page loads some external content from another domain: JavaScript code that my tests don't actually need, but that the page includes. The problem is that these external sources are sometimes very slow, which prevents WebDriver from finishing the page load within the configured page load timeout (20 seconds). My tests would otherwise run fine, because the page itself is in fact loaded: all the HTML is there, and all internal scripts are loaded and working.

Random thoughts about this:

There are extensions for various browsers that would do what I'm asking, but I need to run my tests in several browsers, namely Chrome, Firefox and PhantomJS, and there is no such thing as a PhantomJS extension. If possible, I need a solution that is based purely on WebDriver. I am willing to write a separate solution for each browser, though.

I would appreciate any ideas on how to address this.

Answer 1:

The solution is to use a proxy. WebDriver integrates very well with BrowserMob Proxy: http://bmp.lightbody.net/

import net.lightbody.bmp.proxy.ProxyServer;
import org.openqa.selenium.Proxy;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;
import org.openqa.selenium.remote.CapabilityType;
import org.openqa.selenium.remote.DesiredCapabilities;

private WebDriver initializeDriver() throws Exception {
    // Start the server and get the Selenium proxy object.
    // proxy_port is assumed to be a field holding a free local port.
    ProxyServer server = new ProxyServer(proxy_port);

    server.start();
    server.setCaptureHeaders(true);
    // Blacklist Google Analytics: matching requests get a 410 response
    server.blacklistRequests("https?://.*\\.google-analytics\\.com/.*", 410);
    // Or whitelist what you need: requests matching none of the patterns get a 200 response
    server.whitelistRequests(new String[] {
            "https?://.*\\.yoursite\\.com/.*",
            "https://.*\\.someOtherYourSite\\..*"
    }, 200);

    Proxy proxy = server.seleniumProxy();

    // Configure it as a desired capability
    DesiredCapabilities capabilities = new DesiredCapabilities();
    capabilities.setCapability(CapabilityType.PROXY, proxy);

    // Start the driver
    WebDriver driver = new FirefoxDriver(capabilities);

    return driver;
}
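
Since blacklistRequests and whitelistRequests take regular expressions, it can help to try the patterns against a few sample URLs before wiring them into the proxy. The following is only a sketch using java.util.regex. It assumes the proxy requires the whole URL to match the pattern, which is worth verifying against the version you use. The class name UrlFilterCheck and the sample URLs are made up for illustration and are not part of BrowserMob Proxy.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of BrowserMob Proxy or Selenium.
class UrlFilterCheck {

    // Returns true if the whole URL matches at least one of the patterns
    // (assumption: the proxy uses full-match semantics).
    static boolean matchesAny(String url, List<Pattern> patterns) {
        for (Pattern pattern : patterns) {
            if (pattern.matcher(url).matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Blacklist pattern taken from the answer above
        List<Pattern> blacklist = Arrays.asList(
                Pattern.compile("https?://.*\\.google-analytics\\.com/.*"));

        // Sample URLs, chosen only for illustration
        String[] samples = {
                "http://www.google-analytics.com/ga.js",
                "https://google-analytics.com/collect",
                "https://www.yoursite.com/app.js"
        };
        for (String url : samples) {
            System.out.println(url + " blacklisted: " + matchesAny(url, blacklist));
        }
    }
}
```

Note that the pattern requires a dot before google-analytics, so under the full-match assumption a URL on the bare domain (the second sample) would not be caught. Checks like this make such gaps visible early.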

EDIT: People often ask for HTTP status codes. You can easily retrieve them using the proxy. The code can look something like this:

// Create a new HAR with the given label
public void setHar(String label) {
    server.newHar(label);
}

public void getHar() throws IOException {
    // FIXME: What should be done with this data?
    Har har = server.getHar();
    if (har == null) return;
    File harFile = new File("C:\\localdev\\bla.har");
    har.writeTo(harFile);
    for (HarEntry entry : har.getLog().getEntries()) {
        // Check for any 4XX and 5XX HTTP status codes
        int status = entry.getResponse().getStatus();
        if (status >= 400 && status < 600) {
            log.warn(String.format("%s %d %s", entry.getRequest().getUrl(), status,
                    entry.getResponse().getStatusText()));
            //throw new UnsupportedOperationException("Not implemented");
        }
    }
}
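
Because the question is about ignoring external content, you may only want to act on failed responses from your own site. The sketch below shows one way to do that on plain URL/status pairs, such as those read from the HAR entries. The class FirstPartyFailures, its method names, and the sample domain are hypothetical. It also assumes every URL can be parsed by java.net.URI, which throws IllegalArgumentException otherwise.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of BrowserMob Proxy or Selenium.
class FirstPartyFailures {

    // True for 4XX and 5XX status codes
    static boolean isError(int status) {
        return status >= 400 && status < 600;
    }

    // True if the URL's host is siteDomain itself or one of its subdomains
    static boolean isFirstParty(String url, String siteDomain) {
        String host = URI.create(url).getHost();
        if (host == null) {
            return false;
        }
        host = host.toLowerCase(Locale.ROOT);
        return host.equals(siteDomain) || host.endsWith("." + siteDomain);
    }

    // Returns the URLs that failed and belong to siteDomain, in input order
    static List<String> firstPartyFailures(Map<String, Integer> statusByUrl, String siteDomain) {
        List<String> failed = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : statusByUrl.entrySet()) {
            if (isError(entry.getValue()) && isFirstParty(entry.getKey(), siteDomain)) {
                failed.add(entry.getKey());
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        // Sample data, made up for illustration
        Map<String, Integer> statusByUrl = new LinkedHashMap<>();
        statusByUrl.put("https://www.yoursite.com/app.js", 500);
        statusByUrl.put("https://cdn.example.net/widget.js", 404);
        statusByUrl.put("https://www.yoursite.com/index.html", 200);

        System.out.println(firstPartyFailures(statusByUrl, "yoursite.com"));
    }
}
```

In getHar(), the same check could decide whether a failing entry is logged as a warning or makes the test fail.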


Answer 2:

You can chain the proxy, although there isn't much documentation out there about doing so:

http://www.nerdnuts.com/2014/10/browsermob-behind-a-corporate-proxy/

We were able to use BrowserMob Proxy behind a corporate proxy using the following code:

// start the proxy
server = new ProxyServer(9090);
server.start();

server.setCaptureContent(true);
server.setCaptureHeaders(true);
server.addHeader("accept-encoding", ""); // turn off gzip

// Configure proxy server to use our network proxy
server.setLocalHost(InetAddress.getByName("127.0.0.1"));

/**
 * THIS IS THE MAJICK!
 **/
HashMap<String, String> options = new HashMap<String, String>();
options.put("httpProxy", "172.20.4.115:8080");
server.setOptions(options);
server.autoBasicAuthorization("172.20.4.115", "username", "password");

// get the Selenium proxy object
Proxy proxy = server.seleniumProxy();
DesiredCapabilities capabilities = DesiredCapabilities.phantomjs();
capabilities.setCapability(CapabilityType.PROXY, proxy);