How to find elements by class

Posted 2019-01-01 11:57

I'm having trouble parsing HTML elements with a "class" attribute using BeautifulSoup. The code looks like this:

soup = BeautifulSoup(sdata)
mydivs = soup.findAll('div')
for div in mydivs: 
    if (div["class"]=="stylelistrow"):
        print div

I get an error on the same line "after" the script finishes.

File "./beautifulcoding.py", line 130, in getlanguage
  if (div["class"]=="stylelistrow"):
File "/usr/local/lib/python2.6/dist-packages/BeautifulSoup.py", line 599, in __getitem__
   return self._getAttrMap()[key]
KeyError: 'class'

How do I get rid of this error?

9 answers
孤独总比滥情好
#2 · 2019-01-01 12:25

From the documentation:

As of Beautiful Soup 4.1.2, you can search by CSS class using the keyword argument class_:

soup.find_all("a", class_="sister")

Which in this case would be:

soup.find_all("div", class_="stylelistrow")

It would also work for:

soup.find_all("div", class_="stylelistrowone stylelistrowtwo")
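For anyone who wants to try this out, here is a minimal sketch, assuming Python 3 with the bs4 package and the built-in "html.parser" backend. The sample markup and the class name "highlighted" are made up for illustration and are not from the question.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, invented for this illustration.
html = """
<div class="stylelistrow">first</div>
<div class="stylelistrow highlighted">second</div>
<div class="other">third</div>
<div>no class at all</div>
"""

soup = BeautifulSoup(html, "html.parser")

# class_ matches every div whose class list contains "stylelistrow".
rows = soup.find_all("div", class_="stylelistrow")
row_texts = [div.get_text() for div in rows]

# A multi-word string is compared against the exact attribute value.
exact = soup.find_all("div", class_="stylelistrow highlighted")
exact_texts = [div.get_text() for div in exact]

print(row_texts)
print(exact_texts)
```

Note that the multi-word form depends on the order of the class names in the markup, so it is more fragile than matching a single class.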
爱死公子算了
#3 · 2019-01-01 12:27

This works for me to access the class attribute (on Beautiful Soup 4, contrary to what the documentation says). The KeyError comes from a list being returned, not a dictionary.

for hit in soup.findAll(name='span'):
    print hit.contents[1]['class']
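A small check of what indexing by "class" does, as a sketch assuming Python 3 with bs4 (the markup is invented for illustration). As far as I can tell, the value comes back as a list of class names, and the KeyError shows up for tags that have no class attribute at all.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: one div with a class, one without.
html = '<div class="stylelistrow">row</div><div>plain</div>'
soup = BeautifulSoup(html, "html.parser")
with_class, without_class = soup.find_all("div")

# In bs4 the class attribute is a list of class names, not a string.
class_value = with_class["class"]

# Indexing a tag that has no class attribute raises KeyError.
try:
    without_class["class"]
    missing_raises = False
except KeyError:
    missing_raises = True

print(class_value, missing_raises)
```

Because the value is a list, comparing it to the string "stylelistrow" with == is False in bs4 even when the class is present.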
姐姐魅力值爆表
#4 · 2019-01-01 12:29

You can refine your search to only find those divs with a given class using BS3:

mydivs = soup.findAll("div", {"class": "stylelistrow"})
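The same dictionary filter also appears to carry over to Beautiful Soup 4. A minimal sketch, assuming Python 3 with bs4 and its find_all spelling (the markup is made up for illustration):

```python
from bs4 import BeautifulSoup

# Hypothetical markup, invented for this illustration.
html = '<div class="stylelistrow">a</div><div class="other">b</div><div>c</div>'
soup = BeautifulSoup(html, "html.parser")

# The attrs dictionary is passed as the second positional argument.
mydivs = soup.find_all("div", {"class": "stylelistrow"})
found = [div.get_text() for div in mydivs]
print(found)
```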
梦该遗忘
#5 · 2019-01-01 12:29

Try to check if the div has a class attribute first, like this:

soup = BeautifulSoup(sdata)
mydivs = soup.findAll('div')
for div in mydivs:
    # '"class" in div' would test the tag's children, not its attributes
    if div.get("class") is not None:
        if (div["class"]=="stylelistrow"):
            print div
不流泪的眼
#6 · 2019-01-01 12:30

This worked for me:

for div in mydivs:
    try:
        clazz = div["class"]
    except KeyError:
        clazz = ""
    if (clazz == "stylelistrow"):
        print div
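The try/except can also be written with get() and a default value. This is only a sketch, assuming Python 3 with bs4, where the class value is a list, so the test becomes a membership check. The markup is invented for illustration.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, invented for this illustration.
html = '<div class="stylelistrow">row</div><div class="other">x</div><div>plain</div>'
soup = BeautifulSoup(html, "html.parser")

matches = []
for div in soup.find_all("div"):
    # get() returns the default instead of raising KeyError.
    classes = div.get("class", [])
    if "stylelistrow" in classes:
        matches.append(div.get_text())

print(matches)
```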
闭嘴吧你
#7 · 2019-01-01 12:31

A straightforward way would be:

soup = BeautifulSoup(sdata)
for each_div in soup.findAll('div',{'class':'stylelistrow'}):
    print each_div

Make sure you take care of the casing of findAll; it's not findall.
