I am trying to fill in the form on this website: http://www.marutisuzuki.com/Maruti-Price.aspx.
It consists of three drop-down lists: the car model, the state, and the city. The first two are static. The third, city, is generated dynamically depending on the value of state: an onclick JavaScript event runs and fetches the cities for the selected state.
I am familiar with the mechanize module in Python. I came across several links saying that mechanize cannot handle dynamic content. But this link, http://toddhayton.com/2014/12/08/form-handling-with-mechanize-and-beautifulsoup/, in the section "Adding item dynamically", states that mechanize can handle dynamic content. I did not understand this line of code in it:

    item = Item(br.form.find_control(name='searchAuxCountryID'), {'contents': '3', 'value': '3', 'label': 3})

What is "Item" in this line of code, and how would it correspond to the city field in my form? I also came across the selenium module, which might help with dynamic drop-down lists, but I was not able to find anything in its documentation, or any good blog post, on how to use it for this.
Can someone suggest how to submit this form for different models, states and cities? Any links on how to solve this problem would be appreciated, and sample Python code that submits the form would be helpful. Thanks in advance.
If you look at the request being sent to that site in developer tools, you'll see that a POST is sent as soon as you select a state. The response that is sent back has the form with the values in the city dropdown populated.
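To make that POST less mysterious, here is a minimal standard-library sketch of what such a request body typically looks like. It assumes the page uses the usual ASP.NET WebForms hidden fields (`__VIEWSTATE`, `__EVENTVALIDATION`, `__EVENTTARGET`), which the `ctl00$...` control names suggest but which I have not verified for this site. The HTML fragment, the hidden field values and the state code `DL` are made up for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, parse_qs

# Hypothetical, heavily shortened stand-in for the real page markup.
SAMPLE_HTML = """
<form id="form1" method="post" action="Maruti-Price.aspx">
  <input type="hidden" name="__VIEWSTATE" value="dDwtMTIz" />
  <input type="hidden" name="__EVENTVALIDATION" value="wEWAgK" />
  <select name="ctl00$ContentPlaceHolder1$ddlState">
    <option value="0">Select State</option>
    <option value="DL">Delhi</option>
  </select>
</form>
"""


class HiddenFieldParser(HTMLParser):
    """Collect name/value pairs of every <input type="hidden"> element."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and attrs.get("type") == "hidden" and "name" in attrs:
            self.fields[attrs["name"]] = attrs.get("value") or ""


def build_postback(html, target, value):
    """Return a urlencoded body that mimics selecting `value` in `target`."""
    parser = HiddenFieldParser()
    parser.feed(html)
    payload = dict(parser.fields)      # hidden state is echoed back unchanged
    payload["__EVENTTARGET"] = target  # the control that triggered the postback
    payload["__EVENTARGUMENT"] = ""
    payload[target] = value            # the newly selected option
    return urlencode(payload)


body = build_postback(SAMPLE_HTML, "ctl00$ContentPlaceHolder1$ddlState", "DL")
decoded = parse_qs(body, keep_blank_values=True)
print(sorted(decoded))
```

mechanize sends the form's hidden fields along for you when you call `br.submit()`, so you normally do not build this body by hand; the sketch only shows why re-submitting the form is enough to get the city list back.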
So, to replicate this in your script you want something like the following:
- Open the page
- Select the form
- Select values for model and state
- Submit the form
- Select the form from the response sent back
- Select value for city (it should be populated now)
- Submit the form
- Parse the response for the table of results
That will look something like:
    #!/usr/bin/env python
    import re

    import mechanize
    from bs4 import BeautifulSoup


    def select_form(form):
        # Predicate for select_form(): pick the form whose id is "form1"
        return form.attrs.get('id', None) == 'form1'


    def get_state_items(browser):
        browser.select_form(predicate=select_form)
        ctl = browser.form.find_control('ctl00$ContentPlaceHolder1$ddlState')
        state_items = ctl.get_items()
        return state_items[1:]  # skip the first entry in the dropdown


    def get_city_items(browser):
        browser.select_form(predicate=select_form)
        ctl = browser.form.find_control('ctl00$ContentPlaceHolder1$ddlCity')
        city_items = ctl.get_items()
        return city_items[1:]  # skip the first entry in the dropdown


    br = mechanize.Browser()
    br.open('http://www.marutisuzuki.com/Maruti-Price.aspx')
    br.select_form(predicate=select_form)
    br.form['ctl00$ContentPlaceHolder1$ddlmodel'] = ['AK']  # model = Maruti Suzuki Alto K10

    for state in get_state_items(br):
        # 1 - Submit form for state.name to get cities for this state
        br.select_form(predicate=select_form)
        br.form['ctl00$ContentPlaceHolder1$ddlState'] = [state.name]
        br.submit()

        # 2 - Now the city dropdown is filled for state.name
        for city in get_city_items(br):
            br.select_form(predicate=select_form)
            br.form['ctl00$ContentPlaceHolder1$ddlCity'] = [city.name]
            br.submit()

            s = BeautifulSoup(br.response().read(), 'html.parser')
            t = s.find('table', id='ContentPlaceHolder1_dtDealer')
            if t is None:
                continue  # no results table in the response for this city

            r = re.compile(r'^ContentPlaceHolder1_dtDealer_lblName_\d+$')
            header_printed = False

            for p in t.findAll('span', id=r):
                tr = p.findParent('tr')
                td = tr.findAll('td')

                if not header_printed:
                    # Print "City, State" once, underlined, before the first row
                    title = '%s, %s' % (city.attrs['label'], state.attrs['label'])
                    print(title)
                    print('-' * len(title))
                    header_printed = True

                print(' '.join(['%s' % x.text.strip() for x in td]))
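The result rows are located by passing a compiled regular expression as the `id` filter, so only spans whose id ends in a row number are picked up. A quick way to check which ids that pattern accepts, without touching the site, is the sketch below. The id with `lblAddress` is a hypothetical example of a non-matching id, not something taken from the page.

```python
import re

# Same pattern as in the script above.
row_id = re.compile(r'^ContentPlaceHolder1_dtDealer_lblName_\d+$')

candidates = [
    'ContentPlaceHolder1_dtDealer_lblName_0',
    'ContentPlaceHolder1_dtDealer_lblName_12',
    'ContentPlaceHolder1_dtDealer_lblAddress_0',  # hypothetical id, different label
    'ContentPlaceHolder1_dtDealer_lblName_',      # no row number, so no match
]

# Keep only the ids the pattern accepts.
matches = [c for c in candidates if row_id.search(c)]
print(matches)
```

Only the first two candidates are kept, which is why each matching span corresponds to one dealer row in the table.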
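I had the same issue with the tutorial. Item is a class in the mechanize module, so it has to be referenced as mechanize.Item (or imported with "from mechanize import Item"). This worked for me:

    item = mechanize.Item(br.form.find_control(name='searchAuxCountryID'), {'contents': '3', 'value': '3', 'label': 3})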