I would really appreciate help from this community.
My question: when I use this Python code
response.xpath("//div[contains(@class,'check-prices-widget-not-sponsored')]/a/div[contains(@class,'check-prices-widget-not-sponsored-link')]").extract()
to extract the vendor name in the scrapy shell, the output is empty. I don't understand why this happens. Could the problem be that the website's content is loaded dynamically?
The URL I am scraping is https://cruiseline.com/cruise/7-night-bahamas-florida-new-york-roundtrip-32860, and what I need is the vendor name and price for each vendor. The attached picture is a screenshot of the browser's Inspect panel.
However, similar code does work for extracting prices from the following page ('https://cruiseline.com/destination/caribbean/cruise/best?sort=rank,ship_status&&direction=desc&page=1&per_page=10&sailing_counts=0'):
Prices = response.xpath(
"//div[contains(@class,'featured-cruise-price-inner-price')]/span/descendant::text()").extract()
Really appreciate the help!
I tried this URL in the scrapy shell: https://cruiseline.com/cruise/7-night-bahamas-florida-new-york-roundtrip-32860, and I also got nothing with
response.xpath("//div[contains(@class,'check-prices-widget-not-sponsored')]/a/div[contains(@class,'check-prices-widget-not-sponsored-link')]").extract()
Then I used the view(response) command to see what the spider actually receives, and found that the page is dynamic: the vendor info is not in the downloaded HTML but is added by JavaScript afterwards. To scrape it, you need to execute the JavaScript that renders that info.
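If you prefer a programmatic check over eyeballing view(response), here is a minimal sketch that looks for a class-name fragment in the raw HTML. It uses only the standard library; ClassCollector, has_class_fragment and both sample strings are made up for illustration and are not copied from the real site.

```python
from html.parser import HTMLParser


class ClassCollector(HTMLParser):
    """Collect every CSS class name that appears in a raw HTML string."""

    def __init__(self):
        super().__init__()
        self.classes = set()

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        for name, value in attrs:
            if name == "class" and value:
                self.classes.update(value.split())


def has_class_fragment(html, fragment):
    """Return True if any class name in the raw HTML contains `fragment`."""
    collector = ClassCollector()
    collector.feed(html)
    return any(fragment in cls for cls in collector.classes)


# Hypothetical sample markup (assumption): one page that ships its content
# in the HTML, and one that only ships an empty container plus a script.
static_html = '<div class="featured-cruise-price-inner-price"><span>$499</span></div>'
dynamic_html = '<div id="app"></div><script src="bundle.js"></script>'

print(has_class_fragment(static_html, "featured-cruise-price"))
print(has_class_fragment(dynamic_html, "check-prices-widget-not-sponsored"))
```

In the scrapy shell you would pass response.text as the first argument. If the class you see in the browser's Inspect panel is missing from response.text, the element is most likely created by JavaScript after the page loads.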
Here are the screenshots:
As you can see, the info you need doesn't show up. However, this one https://cruiseline.com/destination/caribbean/cruise/best?sort=rank,ship_status&&direction=desc&page=1&per_page=10&sailing_counts=0 is static, which is why you can scrape what you need from it.
Here are two ways to scrape a dynamic website (of course, there are more):
1. Splash (Official Doc): in your spider, yield your URL with SplashRequest instead of scrapy.Request.
2. Selenium + PhantomJS (Official Doc)
Good luck with your scraping! :)