I'm having trouble parsing HTML elements with a "class" attribute using BeautifulSoup. The code looks like this:
soup = BeautifulSoup(sdata)
mydivs = soup.findAll('div')
for div in mydivs:
    if (div["class"]=="stylelistrow"):
        print div
I get an error on the same line "after" the script finishes.
  File "./beautifulcoding.py", line 130, in getlanguage
    if (div["class"]=="stylelistrow"):
  File "/usr/local/lib/python2.6/dist-packages/BeautifulSoup.py", line 599, in __getitem__
    return self._getAttrMap()[key]
KeyError: 'class'
How do I get rid of this error?
You can easily find by one class, but if you want to find by the intersection of two classes, it's a little more difficult.
From the documentation (emphasis added):
To be clear, this selects only the p tags that are both strikeout and body class.
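As a rough illustration of that intersection case, here is a minimal sketch assuming BeautifulSoup 4 (the bs4 package) on Python 3. The sample markup is made up for the example, and the selector is my reading of what the sentence above describes, not a quote from the documentation.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: only the first <p> carries both classes.
html = """
<p class="body strikeout">both classes</p>
<p class="body">body only</p>
<p class="strikeout">strikeout only</p>
"""
soup = BeautifulSoup(html, "html.parser")

# A chained CSS class selector matches only tags that have BOTH classes,
# regardless of the order they appear in the class attribute.
both = soup.select("p.strikeout.body")
texts = [p.get_text() for p in both]
print(texts)  # ['both classes']
```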
To find elements that have any one of a set of classes (the union, not the intersection), you can give a list to the class_ keyword argument (as of 4.1.2).
Also note that findAll has been renamed from camelCase to the more Pythonic find_all.
Update (2016): In the latest version of BeautifulSoup, the method 'findAll' has been renamed to 'find_all'. See the official documentation.
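For the union case mentioned above (a list passed to class_), a small sketch might look like this. It assumes BeautifulSoup 4.1.2 or later on Python 3, and the markup is invented for the example.

```python
from bs4 import BeautifulSoup

# Hypothetical markup covering each combination of the two classes.
html = """
<p class="body strikeout">both</p>
<p class="body">body only</p>
<p class="strikeout">strikeout only</p>
<p class="other">neither</p>
"""
soup = BeautifulSoup(html, "html.parser")

# A list for class_ matches tags that have ANY of the listed classes.
either = soup.find_all("p", class_=["strikeout", "body"])
texts = [p.get_text() for p in either]
print(texts)  # ['both', 'body only', 'strikeout only']
```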
Hence the answer will be
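Applied to the question's markup, a sketch under the assumption of BeautifulSoup 4 on Python 3 could look like the following. The sample sdata is hypothetical. The original KeyError comes from indexing div["class"] on a div that has no class attribute at all; letting find_all filter by class avoids that lookup entirely.

```python
from bs4 import BeautifulSoup

# Hypothetical input: the second <div> has no class attribute,
# which is what makes div["class"] raise KeyError in the question.
sdata = """
<div class="stylelistrow">row 1</div>
<div>no class attribute</div>
<div class="stylelistrow highlighted">row 2</div>
"""
soup = BeautifulSoup(sdata, "html.parser")

# Let BeautifulSoup do the class filtering.
rows = soup.find_all("div", class_="stylelistrow")
for div in rows:
    print(div)

# If you prefer to keep the manual loop, use .get() with a default.
# In bs4 the class attribute is a list of class names, so test membership.
manual = [
    div for div in soup.find_all("div")
    if "stylelistrow" in div.get("class", [])
]
```

Note that class_="stylelistrow" also matches the div with two classes, because the filter is compared against each class name separately.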
Specific to BeautifulSoup 3:
Will find all of these: