I want to get the prices of mobile phones from this site: http://www.univercell.in/buy/SMART
To test it, I ran: scrapy shell http://www.univercell.in/control/AjaxCategoryDetail?productCategoryId=PRO-SMART&category_id=PRO-SMART&attrName=&min=&max=&sortSearchPrice=&VIEW_INDEX=2&VIEW_SIZE=15&serachupload=&sortupload=
But I am not able to connect to the site. Since the page is loaded using AJAX, I found the start URL using Firebug. Can anyone suggest where I am going wrong?
How about writing a JavaScript script that performs the same actions as clicking a page number and then simply dumps the XML returned by the server? In other words, make the calls to the server as if the site were hosted on your own desktop.
The JavaScript function called when you click a page number is paginateList('numberOfPage'), where numberOfPage is the page you want to visit. The body of the function is
Use these to get the data from each page recursively.
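As a minimal sketch of walking the pages, assuming the VIEW_INDEX query parameter in the AjaxCategoryDetail URL from the question is the page number that paginateList passes to the server (an assumption, not confirmed against the site), the per-page URLs could be generated in Python like this. The helper names page_url and page_urls and the page range are hypothetical.

```python
from urllib.parse import urlencode

# Endpoint and parameter names are copied from the URL in the question.
BASE_URL = "http://www.univercell.in/control/AjaxCategoryDetail"


def page_url(page, view_size=15, category="PRO-SMART"):
    """Build the AJAX URL for one page (hypothetical helper).

    Assumes VIEW_INDEX selects the page and VIEW_SIZE the items per page.
    """
    params = [
        ("productCategoryId", category),
        ("category_id", category),
        ("attrName", ""),
        ("min", ""),
        ("max", ""),
        ("sortSearchPrice", ""),
        ("VIEW_INDEX", page),
        ("VIEW_SIZE", view_size),
        ("serachupload", ""),  # spelling kept exactly as the site uses it
        ("sortupload", ""),
    ]
    return BASE_URL + "?" + urlencode(params)


def page_urls(first, last):
    """Return the URLs for pages first..last inclusive (hypothetical helper)."""
    return [page_url(p) for p in range(first, last + 1)]


if __name__ == "__main__":
    for url in page_urls(1, 3):
        print(url)
```

When trying one of these URLs from a command line, wrap it in quotes, since most shells treat an unquoted & as a command separator.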
Hope it helps!
Here's your spider:
Save it to spider.py and run it via scrapy runspider spider.py -o output.json. Then in output.json you will see:
Hope that helps.