I wish to scrape reviews from this URL using Selenium: https://seedly.sg/reviews/p2p-lending/funding-societies
For my '''Automation of getting to the next page''' code (clicking through to the next page), ElementClickInterceptedException and NoSuchElementException keep being thrown, even though the element exists, the XPath is correct, and the script has even run successfully several times.
I added sleeps intentionally, but it is still not working. How should I solve this?
Thanks in advance.
##These are basic setups
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException
from time import sleep
import pandas as pd
'''Create new instance of Chrome in Incognito mode'''
##Adding the incognito argument to our webdriver
option = webdriver.ChromeOptions()
option.add_argument("--incognito")
##create a new instance of Chrome
browser = webdriver.Chrome('/Users/w97802/chromedriver')
'''Scrape Basic Info'''
from parsel import Selector
url = 'https://seedly.sg/reviews/p2p-lending/funding-societies'
browser.get(url)
selector = Selector(text=browser.page_source)
####################################################################
##This is the code to get reviews
reviews_list = []
'''Loop all pages'''
for i in range(0, 16):
    sleep(2)

    '''Automation of clicking all more'''
    test = browser.find_elements_by_xpath('//a[contains(@class,"sc-1rz2iis-2 xgYML")]')
    for x in range(0, len(test)):
        sleep(9)
        more = browser.find_element_by_xpath('//a[contains(@class,"sc-1rz2iis-2 xgYML")]')
        more.click()
        sleep(9)
        print("clicking more in another page")

    '''Getting reviews'''
    reviews = browser.find_elements_by_xpath('//div[contains(@class,"sc-1rz2iis-1 iMLmnZ")]')
    for y in reviews:
        review_text = y.text  # separate name, so the element list is not overwritten mid-loop
        reviews_list.append(review_text)
        sleep(2)
        print("appended 1 review")

    '''Break'''
    if i in (4, 8, 12):  # note: 'i == 4 or 8 or 12' is always truthy, so it ran on every page
        browser.find_element_by_xpath('//*[@id="__next"]/div/div[2]/div/div/div[2]/div[3]/ul/div/div/ul/li[1]').click()
        browser.find_element_by_xpath('//*[@id="__next"]/div/div[2]/div/div/div[2]/div[3]/ul/div/div/ul/li[11]').click()
        sleep(3)

    '''Automation of getting to the next page'''
    sleep(10)
    browser.find_element_by_xpath('//*[@id="__next"]/div/div[2]/div/div/div[2]/div[3]/ul/div/div/ul/li[11]').click()
    sleep(8)
    print("going to the next page")
Use WebDriverWait to wait for specific conditions instead of sleep, so that each click fires only once the target element is actually ready.
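A minimal sketch of that idea, applied to the question's next-page click. The XPath is copied from the question; the scroll step is a common remedy for ElementClickInterceptedException, and the answer's original code block was not preserved, so this is a reconstruction rather than the tested code:

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

wait = WebDriverWait(browser, 20)
next_xpath = '//*[@id="__next"]/div/div[2]/div/div/div[2]/div[3]/ul/div/div/ul/li[11]'

# Block until the pagination button is visible and enabled, instead of
# sleeping a fixed number of seconds and hoping the page has rendered.
next_button = wait.until(EC.element_to_be_clickable((By.XPATH, next_xpath)))

# Scroll the button to the middle of the viewport so a sticky header or footer
# cannot sit on top of it (the usual cause of ElementClickInterceptedException).
browser.execute_script("arguments[0].scrollIntoView({block: 'center'});", next_button)
next_button.click()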
To expand and scrape all of the reviews, clicking through to the next page until the end, you have to induce WebDriverWait for visibility_of_all_elements_located(), and you can use either of the following locator strategies:
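The code block and console output that originally followed were not preserved, so the sketch below reconstructs the described approach under stated assumptions: the a.sc-1rz2iis-2.xgYML and sc-1rz2iis-1 class names and the next-page XPath are copied from the question's own code, and whether they still match the live page is unverified. Both locator strategies appear; the loop pages forward until the next button stops turning up:

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException

browser = webdriver.Chrome('/Users/w97802/chromedriver')
browser.get('https://seedly.sg/reviews/p2p-lending/funding-societies')
wait = WebDriverWait(browser, 10)
reviews_list = []

while True:
    # Using CSS_SELECTOR: expand every collapsed review on the current page.
    try:
        for more in wait.until(EC.visibility_of_all_elements_located(
                (By.CSS_SELECTOR, "a.sc-1rz2iis-2.xgYML"))):
            browser.execute_script("arguments[0].click();", more)
    except TimeoutException:
        pass  # no collapsed reviews on this page

    # Using XPATH: collect the texts of the now fully expanded reviews.
    for review in wait.until(EC.visibility_of_all_elements_located(
            (By.XPATH, '//div[contains(@class,"sc-1rz2iis-1 iMLmnZ")]'))):
        reviews_list.append(review.text)

    # Next-page button (XPath copied from the question). A JavaScript click
    # sidesteps ElementClickInterceptedException; stop when the wait times out.
    try:
        next_li = wait.until(EC.element_to_be_clickable((By.XPATH,
            '//*[@id="__next"]/div/div[2]/div/div/div[2]/div[3]/ul/div/div/ul/li[11]')))
        browser.execute_script("arguments[0].click();", next_li)
    except TimeoutException:
        break

print(len(reviews_list), "reviews scraped")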