Python BeautifulSoup Loop

Posted 2020-07-24 03:56

Question:

Thanks to this board I have managed to retrieve the name and the price of the item I want using this code:

import urllib2  
from BeautifulSoup import BeautifulSoup
import re

html = urllib2.urlopen('http://www.toolventure.co.uk/hand-tools/saws/').read()

soup = BeautifulSoup(html)
item = re.sub(r'\s+', ' ', soup.h2.a.text)
price = soup.find('p', '*price').text
price = re.search(r'\d+\.\d+', price).group(0)

print item, price

This works well and returns one result perfectly. Now I am trying to retrieve ALL the results on the page. I have been experimenting with loops, but I am very new to this and can't work out how to loop over the items.

Can someone more knowledgeable point me in the right direction?

Many thanks

Answer 1:

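I'd use findAll for this. It returns every matching tag, so you can loop over the product containers and pull the name and price out of each one: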

soup = BeautifulSoup(html)

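# BeautifulSoup 3 compares the class string exactly, so the trailing space is kept as written.
mostwant = {'class': 'productlist_mostwanted_item '}
griditem = {'class': 'productlist_grid_item '}

# findAll returns a list-like result, so the two sets of matches can be concatenated.
divs = soup.findAll(attrs=mostwant) + soup.findAll(attrs=griditem)

for product in divs:
    # The product name is the link text inside the <h2> heading.
    item = product.h2.a.text.strip()
    # The price is taken from the second <p> of each product block.
    price = re.search(r'\d+\.\d+', product.findAll('p')[1].text).group(0)
    # %-formatting is used because the question's urllib2 import implies Python 2 (no f-strings).
    print('%s - %s' % (item, price))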
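
One thing to watch in the loop above: re.search returns None when the text contains no decimal number, so calling .group(0) on it raises AttributeError and stops the whole loop. Also, strip() only trims the ends of the name, whereas the re.sub call in the question collapses whitespace inside it as well. Below is a minimal sketch of two small helpers that cover both cases using only the standard library. The names extract_price and clean_name and the sample strings are made up for illustration; they are not part of the original answer or of BeautifulSoup.

```python
import re

# Same pattern as in the answer: a decimal amount such as 12.99.
PRICE_RE = re.compile(r'\d+\.\d+')


def extract_price(text):
    """Return the first decimal number found in text, or None if there is none."""
    match = PRICE_RE.search(text)
    return match.group(0) if match else None


def clean_name(text):
    """Collapse runs of whitespace to single spaces and trim both ends."""
    return re.sub(r'\s+', ' ', text).strip()


# Hypothetical (name, price text) pairs standing in for product.h2.a.text
# and product.findAll('p')[1].text from the loop in the answer.
samples = [
    ('  Hand\n   Saw ', 'Our price: 12.99 inc VAT'),
    ('Tenon Saw', 'Call for price'),
    ('Coping  Saw', 'From 5.00'),
]

results = []
for raw_name, raw_price in samples:
    price = extract_price(raw_price)
    if price is None:
        # Skip products without a numeric price instead of crashing.
        continue
    results.append((clean_name(raw_name), price))

for name, price in results:
    print('%s - %s' % (name, price))
```

Inside the answer's loop, these would replace the direct .strip() and .group(0) calls, so a product without a numeric price is skipped rather than ending the run.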