Can a website detect when you are using selenium w

Posted 2018-12-31 06:20

I've been testing out Selenium with Chromedriver, and I noticed that some pages can detect that you're using Selenium even though there's no automation at all. Even when I'm just browsing manually, using Chrome through Selenium and Xephyr, I often get a page saying that suspicious activity was detected. I've checked my user agent and my browser fingerprint, and they are both identical to those of the normal Chrome browser.

When I browse to these sites in normal chrome everything works fine, but the moment I use Selenium I'm detected.

In theory chromedriver and chrome should look literally exactly the same to any webserver, but somehow they can detect it.

If you want some test code, try this:

from pyvirtualdisplay import Display
from selenium import webdriver

# visible=1 shows the virtual display in a Xephyr window
display = Display(visible=1, size=(1600, 902))
display.start()
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--disable-extensions')
chrome_options.add_argument('--profile-directory=Default')
chrome_options.add_argument('--incognito')
chrome_options.add_argument('--disable-plugins-discovery')
chrome_options.add_argument('--start-maximized')
# `options` replaces the deprecated `chrome_options` keyword
driver = webdriver.Chrome(options=chrome_options)
driver.delete_all_cookies()
driver.set_window_size(800, 800)
driver.set_window_position(0, 0)
print('arguments done')
driver.get('http://stubhub.com')

If you browse around stubhub you'll get redirected and 'blocked' within one or two requests. I've been investigating this and I can't figure out how they can tell that a user is using Selenium.

How do they do it?

EDIT UPDATE:

I installed the Selenium IDE plugin in Firefox and I got banned when I went to stubhub.com in the normal firefox browser with only the additional plugin.

EDIT:

When I use Fiddler to view the HTTP requests being sent back and forth, I've noticed that the 'fake browser's' requests often have 'no-cache' in the response header.

EDIT:

Results like "Is there a way to detect that I'm in a Selenium Webdriver page from Javascript" suggest that there should be no way to detect when you are using a webdriver. But this evidence suggests otherwise.

EDIT:

The site uploads a fingerprint to their servers, but I checked and the fingerprint of selenium is identical to the fingerprint when using chrome.

EDIT:

This is one of the fingerprint payloads that they send to their servers

{"appName":"Netscape","platform":"Linuxx86_64","cookies":1,"syslang":"en-US","userlang":"en-US","cpu":"","productSub":"20030107","setTimeout":1,"setInterval":1,"plugins":{"0":"ChromePDFViewer","1":"ShockwaveFlash","2":"WidevineContentDecryptionModule","3":"NativeClient","4":"ChromePDFViewer"},"mimeTypes":{"0":"application/pdf","1":"ShockwaveFlashapplication/x-shockwave-flash","2":"FutureSplashPlayerapplication/futuresplash","3":"WidevineContentDecryptionModuleapplication/x-ppapi-widevine-cdm","4":"NativeClientExecutableapplication/x-nacl","5":"PortableNativeClientExecutableapplication/x-pnacl","6":"PortableDocumentFormatapplication/x-google-chrome-pdf"},"screen":{"width":1600,"height":900,"colorDepth":24},"fonts":{"0":"monospace","1":"DejaVuSerif","2":"Georgia","3":"DejaVuSans","4":"TrebuchetMS","5":"Verdana","6":"AndaleMono","7":"DejaVuSansMono","8":"LiberationMono","9":"NimbusMonoL","10":"CourierNew","11":"Courier"}}

It's identical in Selenium and in Chrome.
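A comparison like this is easier to trust when it is done key by key rather than by eye. Here is a minimal sketch, assuming you have captured both payloads as JSON. The helper name `diff_fingerprints` and the trimmed sample payloads are made up for illustration; they are not the real captures.

```python
import json


def diff_fingerprints(a, b, path=""):
    """Return the key paths whose values differ between two payloads."""
    if isinstance(a, dict) and isinstance(b, dict):
        differences = []
        for key in sorted(set(a) | set(b)):
            child = f"{path}.{key}" if path else key
            if key not in a or key not in b:
                differences.append(child)  # key present on one side only
            else:
                differences.extend(diff_fingerprints(a[key], b[key], child))
        return differences
    return [] if a == b else [path]


# Trimmed payloads for illustration only; the 902 height is invented.
chrome_payload = json.loads(
    '{"platform": "Linuxx86_64", "screen": {"width": 1600, "height": 900}}'
)
selenium_payload = json.loads(
    '{"platform": "Linuxx86_64", "screen": {"width": 1600, "height": 902}}'
)

print(diff_fingerprints(chrome_payload, selenium_payload))
```

An empty list means the two payloads really are identical, which would point at something outside the fingerprint.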

EDIT:

VPNs work for a single use but get detected after I load the first page. Clearly some javascript is being run to detect Selenium.

14 answers
初与友歌
Reply #2 · 2018-12-31 06:38

It sounds like they are behind a web application firewall (WAF). Take a look at ModSecurity and OWASP to see how those work. In reality, what you are asking is how to evade bot detection. That is not what Selenium WebDriver is for: it is for testing your own web application, not for hitting other web applications. Evasion is possible, but you'd basically have to look at what a WAF looks for in its rule set and specifically avoid that with Selenium, if you can. Even then, it might still not work, because you don't know which WAF they are using. You took the right first step, which is faking the user agent. If that didn't work, though, then a WAF is in place and you probably need to get trickier.

Edit: Point taken from another answer. Make sure your user agent is actually being set correctly first, for example by having the browser hit a local web server or by sniffing the outgoing traffic.
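For the local web server idea, something like the following may be enough. It is only a sketch using Python's standard library: the handler and helper names are made up for illustration, and `fetch_echo` merely stands in for the browser so the example runs on its own. In practice you would point the Selenium-driven browser at the printed URL instead.

```python
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHeadersHandler(BaseHTTPRequestHandler):
    """Reply to every GET with the request headers as plain text."""

    def do_GET(self):
        lines = [f"{name}: {value}\n" for name, value in self.headers.items()]
        payload = "".join(lines).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # keep the console quiet


def start_echo_server():
    """Start the server on a free localhost port; return (server, url)."""
    server = HTTPServer(("127.0.0.1", 0), EchoHeadersHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, f"http://127.0.0.1:{server.server_address[1]}/"


def fetch_echo(url, user_agent):
    """Stand-in for the browser: request the page with a given User-Agent."""
    # An empty ProxyHandler makes sure the request goes straight to localhost.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with opener.open(request) as response:
        return response.read().decode("utf-8")


server, url = start_echo_server()
print(fetch_echo(url, "ExampleAgent/1.0"))
server.shutdown()
server.server_close()
```

Whatever the page shows is what the remote site would receive, so a user agent that was never applied is obvious right away.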

裙下三千臣
Reply #3 · 2018-12-31 06:39

Even if you are sending all the right data (e.g. Selenium doesn't show up as an extension, you have a reasonable resolution and bit depth, etc.), there are a number of services and tools that profile visitor behaviour to determine whether the actor is a user or an automated system.

For example, visiting a site then immediately going to perform some action by moving the mouse directly to the relevant button, in less than a second, is something no user would actually do.
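To make the timing idea concrete, here is a toy sketch of such a check. Everything in it is hypothetical (the `Visit` record, the `looks_automated` name, and the one-second threshold); real profiling services are not public and almost certainly combine many more signals.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    page_loaded_at: float  # seconds, on any monotonic clock
    first_click_at: float  # seconds, same clock


def looks_automated(visit, min_delay_seconds=1.0):
    """Toy heuristic: flag a first click that arrives implausibly fast."""
    reaction_time = visit.first_click_at - visit.page_loaded_at
    return reaction_time < min_delay_seconds


print(looks_automated(Visit(page_loaded_at=0.0, first_click_at=0.3)))
print(looks_automated(Visit(page_loaded_at=0.0, first_click_at=4.2)))
```

The first visit clicks 0.3 seconds after load and is flagged; the second waits 4.2 seconds and is not.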

It might also be useful as a debugging tool to use a site such as https://panopticlick.eff.org/ to check how unique your browser is; it'll also help you verify whether there are any specific parameters that indicate you're running in Selenium.

余生请多指教
Reply #4 · 2018-12-31 06:45

Write an HTML page with the following code. You will see that, in the DOM, Selenium adds a webdriver attribute, which shows up in the outerHTML:

<html>
<head>
  <script type="text/javascript">
    // Show the serialized DOM, including any attributes added to the root element
    function showWindow() {
      alert(document.documentElement.outerHTML);
    }
  </script>
</head>
<body>
  <form>
    <input type="button" value="Show outerHTML" onclick="showWindow()">
  </form>
</body>
</html>

ら面具成の殇う
Reply #5 · 2018-12-31 06:46

The bot detection I've seen seems more sophisticated or at least different than what I've read through in the answers below.

EXPERIMENT 1:

  1. I open a browser and web page with Selenium from a Python console.
  2. The mouse is already at a specific location where I know a link will appear once the page loads. I never move the mouse.
  3. I press the left mouse button once (this is necessary to take focus from the console where Python is running to the browser).
  4. I press the left mouse button again (remember, cursor is above a given link).
  5. The link opens normally, as it should.

EXPERIMENT 2:

  1. As before, I open a browser and the web page with Selenium from a Python console.
  2. This time around, instead of clicking with the mouse, I use Selenium (in the Python console) to click the same element with a random offset.
  3. The link doesn't open, but I am taken to a sign up page.

IMPLICATIONS:

  • opening a web browser via Selenium doesn't preclude me from appearing human
  • moving the mouse like a human is not necessary to be classified as human
  • clicking something via Selenium with an offset still raises the alarm

Seems mysterious, but I guess they can just determine whether an action originates from Selenium or not, while they don't care whether the browser itself was opened via Selenium or not. Or can they determine if the window has focus? Would be interesting to hear if anyone has any insights.

情到深处是孤独
Reply #6 · 2018-12-31 06:46

Some sites are detecting this:

function d() {
    try {
        // Property that chromedriver injects into the document
        if (window.document.$cdc_asdjflasutopfhvcZLmcfl_.cache_)
            return !0
    } catch (e) {}

    try {
        //if (window.document.documentElement.getAttribute(decodeURIComponent("%77%65%62%64%72%69%76%65%72")))
        if (window.document.documentElement.getAttribute("webdriver"))
            return !0
    } catch (e) {}

    try {
        //if (decodeURIComponent("%5F%53%65%6C%65%6E%69%75%6D%5F%49%44%45%5F%52%65%63%6F%72%64%65%72") in window)
        if ("_Selenium_IDE_Recorder" in window)
            return !0
    } catch (e) {}

    try {
        //if (decodeURIComponent("%5F%5F%77%65%62%64%72%69%76%65%72%5F%73%63%72%69%70%74%5F%66%6E") in document)
        if ("__webdriver_script_fn" in document)
            return !0
    } catch (e) {}

    // None of the markers checked above were found
    return !1
}
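The commented-out lines show how sites hide these names behind percent-encoding, so a plain text search of their scripts finds nothing. A rough sketch of how you might scan a downloaded script for them anyway is below. The marker list comes from the function above; the `find_markers` helper and the sample string are made up for illustration.

```python
from urllib.parse import unquote

# Marker names taken from the detection function above.
MARKERS = (
    "$cdc_",
    "webdriver",
    "_Selenium_IDE_Recorder",
    "__webdriver_script_fn",
)


def find_markers(script_source):
    """Return the markers present in a script, plain or percent-encoded."""
    decoded = unquote(script_source)  # turns %77%65%62... back into text
    return [m for m in MARKERS if m in script_source or m in decoded]


sample = 'el.getAttribute(decodeURIComponent("%77%65%62%64%72%69%76%65%72"))'
print(find_markers(sample))
```

Note that "webdriver" is a substring of "__webdriver_script_fn", so a script containing only the longer name is reported under both.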
谁念西风独自凉
Reply #7 · 2018-12-31 06:47

Try using Selenium with a specific Chrome user profile. That way you can run it as a specific user and define anything you want, and it will run as a 'real' user. Look at the Chrome process with a process explorer and you'll see the difference in the command-line flags.

For example:

import os
from selenium import webdriver

username = os.getenv("USERNAME")
userProfile = "C:\\Users\\" + username + "\\AppData\\Local\\Google\\Chrome\\User Data\\Default"
options = webdriver.ChromeOptions()
options.add_argument("user-data-dir={}".format(userProfile))
# Add any other flags you want here.
options.add_experimental_option("excludeSwitches", ["ignore-certificate-errors", "safebrowsing-disable-download-protection", "safebrowsing-disable-auto-update", "disable-client-side-phishing-detection"])
# Raw string so the backslashes in the Windows path are not read as escapes
chromedriver = r"C:\Python27\chromedriver\chromedriver.exe"
os.environ["webdriver.chrome.driver"] = chromedriver
# `options` replaces the deprecated `chrome_options` keyword
browser = webdriver.Chrome(executable_path=chromedriver, options=options)

chrome tag list here
