Capturing traffic in Selenium

Posted 2019-07-21 11:18

I am capturing network traffic with Selenium for the HTTP POST requests I make. The JSON string it returns contains the request headers, but the body (parameters) of the POST message is never captured.

Here's my code:

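from selenium import selenium  # Selenium RC (1.x) Python client

host = "localhost"
port = "4444"
browser = r"*pifirefox"
# url is the base URL of the site under test (not shown here)
sel = selenium(host, port, browser, url)
# ...
# ... submit action ...
# Returns the traffic captured so far as a JSON string
postRequest = sel.captureNetworkTraffic('json')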

postRequest contains:

[{
  "statusCode":200,
  "method":"POST",
  "url":"http://.................",
  "bytes":97567,
  "start":"2011-12-02T17:42:04.719-0500",
  "end":"2011-12-02T17:42:05.044-0500",
  "timeInMillis":325,
  "requestHeaders":[{
      "name":"Host",
      "value":"......................."
    },{
      "name":"User-Agent",
      "value":"Mozilla/5.0 (Windows NT 6.1; WOW64; rv:7.0.1) Gecko/20100101 Firefox/7.0.1"
    },{
      "name":"Accept",
      "value":"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8"
    },{
      "name":"Accept-Language",
      "value":"en-us,en;q=0.5"
    },{
      "name":"Accept-Encoding",
      "value":"gzip, deflate"
    },{
      "name":"Accept-Charset",
      "value":"ISO-8859-1,utf-8;q=0.7,*;q=0.7"
    },{
      "name":"Proxy-Connection",
      "value":"keep-alive"
    },{
      "name":"Referer",
      "value":"...................."
    },{
      "name":"Cookie",
      "value":"...................."
    },{
      "name":"X-Requested-With",
      "value":"XMLHttpRequest"
    },{
      "name":"X-MicrosoftAjax",
      "value":"Delta=true"
    },{
      "name":"Cache-Control",
      "value":"no-cache, no-cache"
    },{
      "name":"Content-Type",
      "value":"application/x-www-form-urlencoded; charset=utf-8"
    },{
      "name":"Content-Length",
      "value":"10734"
    },{
      "name":"Pragma",
      "value":"no-cache"
  }],
  "responseHeaders":[{
      "name":"Date",
      "value":"Fri, 02 Dec 2011 22:42:05 GMT"
    },{
      "name":"Server",
      "value":"Microsoft-IIS/6.0"
    },{
      "name":"Cache-Control",
      "value":"private"
    },{
      "name":"Content-Type",
      "value":"text/plain; charset=utf-8"
    },{
      "name":"Content-Length",
      "value":"97567"
    },{
      "name":"X-Powered-By",
      "value":"ASP.NET"
    },{
      "name":"Via",
      "value":"1.1 (jetty)"
    },{
      "name":"X-AspNet-Version",
      "value":"4.0.30319"
  }]
}]

I am trying to replay the POST request, but without the body (parameters) it is incomplete. Any suggestions would be greatly appreciated.
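To make the goal concrete, here is a minimal sketch of how a captured entry could be turned into a replayable request once the body is known. It assumes the capture parses cleanly with json.loads; the sample string, the example.invalid URL, the helper names (headers_to_dict, build_request) and the body value are all made up for illustration and are not part of the real capture.

```python
import json
import urllib.request

# Trimmed, hypothetical stand-in for the string returned by
# sel.captureNetworkTraffic('json'); the URL is a placeholder.
captured = '''[{
  "statusCode":200,
  "method":"POST",
  "url":"http://example.invalid/page.aspx",
  "requestHeaders":[{
      "name":"X-Requested-With",
      "value":"XMLHttpRequest"
    },{
      "name":"Content-Type",
      "value":"application/x-www-form-urlencoded; charset=utf-8"
    },{
      "name":"Content-Length",
      "value":"10734"
  }]
}]'''

# Headers the HTTP client should compute or negotiate itself.
SKIP = {"host", "content-length", "proxy-connection", "accept-encoding"}


def headers_to_dict(entry):
    """Flatten the [{"name": ..., "value": ...}] list into a plain dict."""
    return {
        h["name"]: h["value"]
        for h in entry["requestHeaders"]
        if h["name"].lower() not in SKIP
    }


def build_request(entry, body):
    """Build (but do not send) a request that replays the captured entry."""
    return urllib.request.Request(
        entry["url"],
        data=body,
        headers=headers_to_dict(entry),
        method=entry["method"],
    )


entry = json.loads(captured)[0]
# The body is NOT in the capture; this value is a made-up placeholder.
request = build_request(entry, body=b"field=value")
print(request.get_method(), request.full_url)
```

The missing piece is exactly the body argument: everything else needed to rebuild the request is already in the capture.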

Cheers, A

3 answers
我欲成王,谁敢阻挡
#2 · 2019-07-21 11:57

The request headers that Selenium gives you contain enough information to craft a PycURL or urllib request that fetches the response bodies.

For me this was as easy as running the following regex to rip out the URLs, and then using curl to fetch them.

urls = re.finditer('\n  "url":"(.*)",', sel.captureNetworkTraffic('json'))

A regex was used because some of the responses had embedded JSON, which caused json.loads to blow up :(. There is some additional effort if the parameters are all in the response header instead of the URL.
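A rough sketch of the regex-then-fetch approach, using urllib from the standard library instead of curl. The sample capture text, the example.invalid URLs and the helper names (extract_urls, fetch) are assumptions for illustration; the pattern relies on the two-space indentation of the "url" lines seen in the capture output.

```python
import re
import urllib.request

# Hypothetical sample of the capture text; real input would come from
# sel.captureNetworkTraffic('json').
captured = '''[{
  "statusCode":200,
  "method":"GET",
  "url":"http://example.invalid/a.js",
  "bytes":10
},{
  "statusCode":200,
  "method":"POST",
  "url":"http://example.invalid/page.aspx",
  "bytes":20
}]'''

# Matches only lines of the form:  "url":"...",
URL_LINE = re.compile(r'\n  "url":"(.*)",')


def extract_urls(text):
    """Return every captured URL, in order of appearance."""
    return [match.group(1) for match in URL_LINE.finditer(text)]


def fetch(url, timeout=10):
    """Fetch a URL with a plain GET and return the raw response body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


urls = extract_urls(captured)
print(urls)
# Fetching is left to the caller, e.g.:
# bodies = [fetch(u) for u in urls]
```

Note that a plain GET like this only recovers response bodies for requests that need no parameters beyond the URL; it does not recover a POST body.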

Melony?
#3 · 2019-07-21 11:59

Note: I may need more information about what you are trying to accomplish and why you chose Selenium.

The only thing I can think of is piping the output of tshark or something similar into your Python program. I suppose there are also pcap readers, but I have no experience with them. I briefly searched for a Python network-monitoring API, but had no luck.
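As an untested sketch of the tshark idea: tshark can print selected fields as tab-separated lines, which a Python program can read from a pipe. This assumes tshark is on the PATH, that you have permission to capture on the chosen interface, and that the traffic is plain HTTP (HTTPS bodies will not be visible). The helper names (tshark_command, parse_line, watch) are made up here, and how http.file_data is rendered may differ between Wireshark versions.

```python
import subprocess

# tshark field names to print, one column each.
FIELDS = ["http.request.method", "http.request.full_uri", "http.file_data"]


def tshark_command(interface):
    """Build a tshark command line that prints one POST request per line."""
    cmd = [
        "tshark", "-l",                        # flush output after each packet
        "-i", interface,                       # capture interface
        "-Y", "http.request.method == POST",   # display filter
        "-T", "fields",                        # tab-separated field output
    ]
    for field in FIELDS:
        cmd += ["-e", field]
    return cmd


def parse_line(line):
    """Split one tab-separated tshark output line into a dict."""
    parts = line.rstrip("\n").split("\t")
    parts += [""] * (len(FIELDS) - len(parts))  # pad missing columns
    return dict(zip(FIELDS, parts))


def watch(interface):
    """Yield parsed POST requests as tshark reports them (runs until killed)."""
    with subprocess.Popen(
        tshark_command(interface), stdout=subprocess.PIPE, text=True
    ) as proc:
        for line in proc.stdout:
            yield parse_line(line)


# Parsing a hand-written sample line, without starting tshark:
print(parse_line("POST\thttp://example.invalid/page.aspx\tfield=value\n"))
```

Calling watch() with an interface name would start the capture; only parse_line is exercised above.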

淡お忘
#4 · 2019-07-21 12:02

You can't get the request or response body with Selenium - it only captures headers. Try Fiddler2 if you're running on Windows.
