I want to scrape the link and title of all the questions on the page https://www.reddit.com/search?q=Expiration&type=link&sort=new. An element has the following structure:
<a data-click-id="body" class="SQnoC3ObvgnGjWt90zD9Z" href="/r/excel/comments/ayiahc/calculating_expiration_dates_previous_solution_no/">
<h2 class="s1okktje-0 cDxKta">
<span style="font-weight:normal">Calculating Expiration Dates - Previous Solution No Longer Works</span>
</h2>
</a>
I use questions = driver.find_elements_by_xpath('//a[@data-click-id="body"]') to get the questions, then iterate over them with a for loop. I could use question.get_attribute('href') to get the link.
However, I don't know how to extract the title inside the span (from a question).
Does anyone know how to do this?
In Selenium, the text property of each element (question.text) will return the text of the underlying span element in your for loop.
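A minimal sketch of that loop, assuming the XPath from the question still matches the page. The names extract_questions and scrape are made up for illustration, and find_elements(By.XPATH, ...) is the Selenium 4 spelling of the find_elements_by_xpath call used in the question.

```python
XPATH = '//a[@data-click-id="body"]'


def extract_questions(elements):
    """Return (title, href) pairs from Selenium WebElement-like objects.

    An element's .text is its rendered text including descendants, so for
    the <a> shown in the question it is the title inside the <span>.
    """
    return [(el.text.strip(), el.get_attribute("href")) for el in elements]


def scrape(url):
    """Open the page in Chrome and collect the question titles and links."""
    # Imported here so extract_questions can be reused without a browser.
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    driver = webdriver.Chrome()  # assumption: Chrome and its driver are installed
    try:
        driver.get(url)
        return extract_questions(driver.find_elements(By.XPATH, XPATH))
    finally:
        driver.quit()
```

Calling scrape('https://www.reddit.com/search?q=Expiration&type=link&sort=new') should give a list of (title, link) tuples, provided the page has finished rendering by the time the elements are read.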
with lxml
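The original code for this answer is not shown, so here is a rough sketch of how it could look with lxml, assuming you pass it the rendered HTML (for example driver.page_source). The function name parse_questions and the SNIPPET constant are only for illustration; SNIPPET is the element from the question.

```python
SNIPPET = """
<a data-click-id="body" class="SQnoC3ObvgnGjWt90zD9Z" href="/r/excel/comments/ayiahc/calculating_expiration_dates_previous_solution_no/">
<h2 class="s1okktje-0 cDxKta">
<span style="font-weight:normal">Calculating Expiration Dates - Previous Solution No Longer Works</span>
</h2>
</a>
"""


def parse_questions(page_source):
    """Return (title, href) pairs parsed from an HTML string with lxml."""
    # Third-party dependency (pip install lxml), imported lazily.
    from lxml import html

    tree = html.fromstring(page_source)
    pairs = []
    for link in tree.xpath('//a[@data-click-id="body"]'):
        # text_content() joins the text of all descendants, including the span.
        title = link.text_content().strip()
        pairs.append((title, link.get("href")))
    return pairs
```

Note that link.get("href") returns the attribute exactly as written in the HTML, so here you get the relative path, whereas Selenium's get_attribute('href') resolves it to a full URL.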
To scrape the title and href attributes of all the questions on the webpage, you need to induce WebDriverWait for visibility_of_all_elements_located() before reading the elements.
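The code from this answer did not survive, so the following is only a sketch of the WebDriverWait approach, not the answerer's exact solution. The function name wait_for_questions and the 10 second timeout are assumptions; the XPath is the one from the question.

```python
def wait_for_questions(driver, timeout=10):
    """Wait until all matching question links are visible, then return
    (title, href) pairs. Raises TimeoutException if none become visible."""
    # Imported lazily so the function can be defined without Selenium loaded.
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.ui import WebDriverWait

    locator = (By.XPATH, '//a[@data-click-id="body"]')
    # until() polls the condition and returns the list of visible elements.
    elements = WebDriverWait(driver, timeout).until(
        EC.visibility_of_all_elements_located(locator)
    )
    return [(el.text, el.get_attribute("href")) for el in elements]
```

You would call it right after driver.get(...), e.g. for title, href in wait_for_questions(driver): print(title, href). The wait matters on pages like this one where the results are rendered by JavaScript after the initial load.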
try the below.
or simply