I\'m connecting to a web site, logging in.
The website redirects me to new pages and Mechanize deals with all cookie and redirection jobs, but, I can\'t get the last page. I used Firebug and did same job again and saw that there are two more pages I had to pass with Mechanize.
I took a quick look at the pages and saw that there is some JavaScript and HTML code but couldn\'t understand it because it doesn\'t look like normal page code. What are those pages for? How they can redirect to other pages? What should I do to pass these?
If you need to handle pages with Javascript, try WATIR or Selenium - those drive a real web browser, and can thus handle any Javascript. WATIR Classic requires either IE or Firefox with a certain extension installed, and you will see the pages flash on the screen as it works.
Your other option would be understanding what the Javascript on the offending page does and bypassing it manually, but that seems onerous.
At present, Mechanize doesn\'t handle JavaScript. There\'s talk of eventually merging Johnson\'s capabilities into Mechanize, but until that happens, you have two options:
- Figure out the JavaScript well enough to understand how to traverse those pages.
- Automate an actual browser that does understand JavaScript using Watir.
what are those pages for? how they can redirect to other pages. what should i do to pass these?
Sometimes work is done on those pages. Sometimes the JavaScript is there to prevent automated access like what you\'re trying to do :). A lot of websites have unnecessary checks to make sure you have a \"good\" browser, so make sure that your user_agent
is set to something common, like IE. Sometimes setting the user_agent
to look like an old browser will let you get past without JavaScript.
Website automation is fun because you have to outsmart the website and its software developers, using multiple strategies. Like the others said, Watir is the best tool for getting past JavaScript at the moment.