I am using Beautiful Soup to parse an HTML page, but sometimes find_all returns fewer elements than the page actually contains. For example, the page http://www.totallyfreestuff.com/index.asp?m=0&sb=1&p=5 has 18 headline spans, but the following code finds only two. Can anybody tell me why? Thank you in advance!
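from bs4 import BeautifulSoup

# page is assumed to hold the HTML already downloaded from the URL above
soup = BeautifulSoup(page, 'html.parser')
hrefDivList = soup.find_all("span", class_="headline")
# print(hrefDivList)
print(len(hrefDivList))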
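You can try a different parser with Beautiful Soup, such as lxml or html5lib. The parsers differ in how they recover from malformed HTML, so html.parser may lose part of the tree on a page like this one.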
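To see whether the parser is the cause, you could count the matches under each parser and compare. This is only a rough sketch: count_headlines and the sample string are made up for illustration, and lxml and html5lib are assumed to be optional extras that may not be installed (they are skipped if missing).

```python
from bs4 import BeautifulSoup, FeatureNotFound


def count_headlines(page, parsers=("html.parser", "lxml", "html5lib")):
    """Return {parser name: number of <span class="headline"> elements found}.

    count_headlines is a hypothetical helper, not part of Beautiful Soup.
    """
    counts = {}
    for name in parsers:
        try:
            soup = BeautifulSoup(page, name)
        except FeatureNotFound:
            # This parser is not installed, so leave it out of the result.
            continue
        counts[name] = len(soup.find_all("span", class_="headline"))
    return counts


# Made-up, well-formed sample; replace it with the HTML of the real page.
sample = (
    '<div><span class="headline">A</span>'
    '<span class="headline">B</span><span>C</span></div>'
)
print(count_headlines(sample))
```

If the counts differ between parsers on the real page, the markup is probably malformed and a more lenient parser is the simplest fix.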
You can also use CSS selectors to make your life easier, for example soup.select("span.headline").
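A minimal sketch of the selector approach, run on a made-up sample string instead of the real page. The selector span.headline matches any span whose class list includes headline.

```python
from bs4 import BeautifulSoup

# Made-up sample markup; replace it with the HTML of the real page.
sample = (
    '<div><span class="headline">A</span>'
    '<span class="headline">B</span><span>C</span></div>'
)

soup = BeautifulSoup(sample, "html.parser")

# select() takes a CSS selector and returns a list of matching tags.
headlines = soup.select("span.headline")
titles = [span.get_text(strip=True) for span in headlines]

print(len(headlines))
print(titles)
```

Note that a selector only searches the tree the parser built, so on its own it will not recover elements the parser dropped.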
Or you can iterate directly over every span and read its text.
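Something like this would walk every span and keep the text of the ones marked as headlines. headline_texts and the sample string are hypothetical, and filtering on the headline class is my assumption about what you want to keep.

```python
from bs4 import BeautifulSoup


def headline_texts(page, parser="html.parser"):
    """Collect the text of every <span> whose class list contains 'headline'.

    headline_texts is a hypothetical helper, not part of Beautiful Soup.
    """
    soup = BeautifulSoup(page, parser)
    texts = []
    for span in soup.find_all("span"):
        # class is a multi-valued attribute: a list of names, or absent.
        if "headline" in span.get("class", []):
            texts.append(span.get_text(strip=True))
    return texts


# Made-up sample markup; replace it with the HTML of the real page.
sample = (
    '<span class="headline big"> First </span>'
    '<span>skip me</span>'
    '<span class="headline">Second</span>'
)
print(headline_texts(sample))
```

Printing the total number of spans alongside the filtered ones is a quick way to check how much of the page the parser actually kept.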