I am writing a function for some existing Python code that will be passed a Mechanize browser object as a parameter.
I fill in some details in a form in the browser, and use response = browser.submit() to move the browser to a new page and collect some information from it.
Unfortunately, I occasionally get the following error:
httperror_seek_wrapper: HTTP Error 500: Internal Server Error
I've navigated to the page in my own browser, and sure enough, I occasionally see this error directly, so I think this is a server problem, not anything to do with robots.txt, headers or similar.
The problem is that after submitting, the state of the browser object changes and I can't continue to use it. My first thought was to take a deep copy first and use that if I ran into problems, but that gives the error TypeError: object.__new__(cStringIO.StringO) is not safe, use cStringIO.StringO.__new__() as described here.
I've also tried using browser.back() but get NoneType errors.
Does anyone have a good solution to this?
Solution (with thanks to karnesJ.R below):
A great solution below uses the excellent requests library (docs here). requests can send a form's data directly via post or get, which importantly doesn't change the state of the br object.
An excellent website allows us to test various error codes, and has a form interface at the top that I've tested this on. I create a br object pointing at this site, then define a function that selects the form from br and pulls out the relevant information, but does the submit via requests, so that the br object hasn't changed and is re-usable. When the server returns an error code, requests just hands back a response carrying that status code (with a body that is rubbish), but this doesn't render the br object unusable.
As stated below, this involves a little more setup time, but is well worth it.
import mechanize
import requests

def testErrorCodes(br, theCodes):
    for x in theCodes:
        br.select_form(nr=0)       # select the first form on the page
        theAction = br.action      # URL the form would submit to
        payload = {'code': x}
        # Submit via requests so the state of br is left untouched
        response = requests.post(theAction, data=payload)
        print(response.status_code)

br = mechanize.Browser()
br.set_handle_robots(False)
response = br.open("http://savanttools.com/test-http-status-codes")
testErrorCodes(br, [401,402,403,404,500,503,504])  # Prints the error codes
testErrorCodes(br, [404])  # The browser is still alive and well to be used again!
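
Since the original 500 error only shows up occasionally, one possible follow-up is to retry the requests submission a few times before giving up. The sketch below is only an illustration, not part of the original solution: post_with_retry, FakeResponse and make_flaky_post are hypothetical names, and the fake poster stands in for requests.post so the example runs without a network connection. It assumes that a status code of 500 or above is worth retrying and that anything lower is not.

```python
import time


def post_with_retry(post, url, payload, attempts=3, delay=1.0, sleep=time.sleep):
    """Call post(url, data=payload) until the status is below 500 or attempts run out.

    `post` is any callable shaped like requests.post; the response only needs
    a `status_code` attribute. Returns the last response received.
    """
    response = None
    for attempt in range(attempts):
        response = post(url, data=payload)
        if response.status_code < 500:
            break  # success or a client error: retrying will not help
        if attempt < attempts - 1:
            sleep(delay)  # give the server a moment before trying again
    return response


# Offline stand-ins (hypothetical) so the sketch runs without a real server.
class FakeResponse(object):
    def __init__(self, status_code):
        self.status_code = status_code


def make_flaky_post(statuses):
    """Return a fake post callable that yields the given statuses in order."""
    remaining = list(statuses)
    calls = []

    def fake_post(url, data=None):
        calls.append((url, data))
        return FakeResponse(remaining.pop(0))

    fake_post.calls = calls
    return fake_post


# The fake server fails twice with 500, then answers normally.
flaky = make_flaky_post([500, 500, 200])
result = post_with_retry(flaky, "http://example.invalid/form", {"code": 200},
                         delay=0, sleep=lambda seconds: None)
print(result.status_code)
```

Against the real site, the same helper could be called inside testErrorCodes as post_with_retry(requests.post, theAction, payload). Because the retries go through requests, the br object stays untouched no matter how many attempts fail.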