I’m writing a little private app to automatically log into my internet banking every day, and download the latest transactions. I’m writing it as a Django app, so I’m working in Python.
My internet banking doesn’t seem to work without JavaScript — I think it uses JavaScript to assign a session ID of some sort. Fetching the sign-in page via httplib
gives me a page telling me JavaScript’s required.
So I’m now looking for libraries that fetch web pages and execute the JavaScript on them: headless browsers, pretty much.
I’m fiddling with Selenium at the moment. I think it’ll do the job, although it’s designed for testing web apps, so I was wondering if there’s anything with similar capabilities designed for more general purposes than testing.
Any Python alternatives to Selenium for this sort of thing?
You can use Pywebkitgtk. There is a nice tutorial here.
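For reference, here's a minimal, untested sketch of driving pywebkitgtk (the old PyGTK-era bindings) from Python; the URL is a placeholder, and getting the rendered HTML back out usually takes extra wiring via execute_script():

    import gtk
    import webkit

    def on_load_finished(view, frame):
        # By this point WebKit has run the page's JavaScript. pywebkitgtk doesn't
        # hand you the rendered HTML directly; a common trick is to copy it into
        # document.title with view.execute_script() and watch "title-changed".
        print "loaded:", frame.get_uri(), frame.get_title()
        gtk.main_quit()

    view = webkit.WebView()
    view.connect("load-finished", on_load_finished)

    window = gtk.Window()       # give the view a window so it gets realized
    window.add(view)
    window.show_all()

    view.open("https://bank.example.com/signin")   # placeholder URL
    gtk.main()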
Alternatively, you can use Beautiful Soup to get the page contents and something like python-spidermonkey to run the scripts.
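That combination might look roughly like this untested sketch: fetch the page, pull out the inline script blocks with Beautiful Soup, and evaluate them with python-spidermonkey. The host name is a placeholder, and bear in mind spidermonkey gives you a bare JavaScript engine with no DOM, so scripts that touch document or window will need stubs:

    import httplib
    from BeautifulSoup import BeautifulSoup   # BeautifulSoup 3-era import
    import spidermonkey

    conn = httplib.HTTPSConnection("bank.example.com")   # placeholder host
    conn.request("GET", "/signin")
    html = conn.getresponse().read()

    soup = BeautifulSoup(html)
    rt = spidermonkey.Runtime()
    cx = rt.new_context()

    for script in soup.findAll("script"):
        if script.string:   # inline scripts only; external ones would need fetching too
            print cx.execute(script.string)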
You can also use Spynner; it does programmatic web browsing on top of QtWebKit, so the JavaScript on the page gets executed.
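A short, untested Spynner sketch, purely to show the shape of the API; the URL and the CSS selectors are placeholders for whatever the bank's sign-in form really uses:

    import spynner

    browser = spynner.Browser()
    browser.load("https://bank.example.com/signin")   # JavaScript runs during the load
    browser.fill("input[name=username]", "myuser")    # drive the form with CSS selectors
    browser.fill("input[name=password]", "secret")
    browser.click("input[type=submit]")               # Spynner also has wait helpers
    print browser.html[:200]                          # rendered HTML, after the scripts ran
    browser.close()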
Looks like QtWebKit is another option.
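One common headless pattern with PyQt4's QtWebKit bindings is to load the page into a QWebPage, wait for loadFinished, and read the DOM back with toHtml(). An untested sketch, with a placeholder URL:

    import sys
    from PyQt4.QtCore import QUrl
    from PyQt4.QtGui import QApplication
    from PyQt4.QtWebKit import QWebPage

    class Render(QWebPage):
        def __init__(self, url):
            self.app = QApplication(sys.argv)
            QWebPage.__init__(self)
            self.loadFinished.connect(self._finished)
            self.mainFrame().load(QUrl(url))
            self.app.exec_()          # spin the event loop until the page is done

        def _finished(self, ok):
            # toHtml() returns the DOM as QtWebKit sees it after JavaScript has run
            self.html = unicode(self.mainFrame().toHtml())
            self.app.quit()

    page = Render("https://bank.example.com/signin")
    print page.html[:200]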
Since BeautifulSoup is no longer being actively developed, I would recommend lxml instead: it does everything BeautifulSoup can do, and a lot more.
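lxml won't run JavaScript either, but it is very pleasant for chewing through whatever HTML you end up with. A small untested example (the table id is made up):

    import lxml.html

    html = "<table id='transactions'><tr><td>2010-01-05</td><td>-12.50</td></tr></table>"
    doc = lxml.html.fromstring(html)
    for row in doc.xpath("//table[@id='transactions']//tr"):
        print [cell.text_content().strip() for cell in row.findall("td")]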
I think a good match for your problem is Twill: a simple scripting language for Web browsing.
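For what it's worth, Twill can also be driven from Python rather than its own script files; an untested sketch, with the form number and field names as placeholders:

    from twill.commands import go, showforms, formvalue, submit, show

    go("https://bank.example.com/signin")
    showforms()                           # list the forms twill found on the page
    formvalue("1", "username", "myuser")  # form 1, field "username" (placeholders)
    formvalue("1", "password", "secret")
    submit()
    show()                                # print the page we ended up on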
Another one to check out is Windmill (a kind of Selenium, but written entirely in Python).
Since you use Selenium, I think you already have Firefox installed. If so, get an extension like Firebug or Tamper Data and see what HTTP requests the JavaScript code makes while you log in.
If you have the URL and the parameters needed, you can easily program a Python client with httplib or urllib2 (see the sketch below).
In Firebug you'll find the requested URLs under the "Net" tab; Tamper Data is self-descriptive. ;-)
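Once you know the real URL and parameters, replaying the login is only a handful of lines. An untested urllib2 sketch that keeps the session cookie; every URL and field name here is a placeholder for what you actually see in Firebug:

    import cookielib
    import urllib
    import urllib2

    jar = cookielib.CookieJar()
    opener = urllib2.build_opener(urllib2.HTTPCookieProcessor(jar))

    # replicate the POST that the page's JavaScript performs
    params = urllib.urlencode({"username": "myuser", "password": "secret"})
    response = opener.open("https://bank.example.com/login", params)
    print response.getcode(), response.geturl()

    # later requests reuse the cookies collected in the jar
    transactions = opener.open("https://bank.example.com/transactions").read()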