I am trying to extract the content of a single "value" attribute in a specific "input" tag on a webpage. I use the following code:
import urllib
f = urllib.urlopen("http://58.68.130.147")
s = f.read()
f.close()
from BeautifulSoup import BeautifulStoneSoup
soup = BeautifulStoneSoup(s)
inputTag = soup.findAll(attrs={"name" : "stainfo"})
output = inputTag['value']
print str(output)
I get a TypeError: list indices must be integers, not str
even though, from the BeautifulSoup documentation, I understand that strings should not be a problem here... but I am no specialist and I may have misunderstood.
Any suggestion is greatly appreciated! Thanks in advance.
In Python 3.x, simply use get(attr_name) on your tag object that you get using find_all:
against XML file conf//test1.xml that looks like:
prints:
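As a minimal sketch of the get(attr_name) approach (not the answer's original snippet or its conf//test1.xml file), assuming Python 3 with the bs4 package installed. The markup and its values are made up for illustration:

```python
from bs4 import BeautifulSoup

# Hypothetical markup; the real page and conf//test1.xml are not shown here.
markup = """
<form>
  <input name="stainfo" value="example-value">
  <input name="other">
</form>
"""

soup = BeautifulSoup(markup, "html.parser")
tags = soup.find_all("input")  # list-like ResultSet of Tag objects

# Tag.get works like dict.get: it returns None (or the supplied default)
# when the attribute is absent, whereas tag["value"] would raise KeyError.
first_value = tags[0].get("value")
missing_value = tags[1].get("value")
fallback_value = tags[1].get("value", "n/a")

print(first_value, missing_value, fallback_value)
```

Using get rather than square brackets is mostly a matter of how you want a missing attribute to be handled.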
I would suggest a time-saving way to go about this, assuming that you know which kind of tag carries the attribute.
Suppose a tag xyz has an attribute named "staininfo".
Note that full_tag is a list.
Thus you can get the "staininfo" attribute values for all the xyz tags.
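A rough sketch of that idea, assuming Python 3 with bs4. The tag name xyz and the attribute staininfo come from the answer above; the markup itself and the variable name values are hypothetical:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: two xyz tags carry the attribute, one does not.
markup = """
<root>
  <xyz staininfo="first">a</xyz>
  <xyz staininfo="second">b</xyz>
  <xyz>c</xyz>
</root>
"""

soup = BeautifulSoup(markup, "html.parser")

# Restrict the search to xyz tags that actually have a staininfo attribute;
# passing True as the attribute value matches any value.
full_tag = soup.find_all("xyz", attrs={"staininfo": True})

# full_tag is a list, so index into it or loop over it to reach each tag.
values = []
for tag in full_tag:
    values.append(tag["staininfo"])

print(values)
```

Filtering on the attribute in the search means the square-bracket lookup inside the loop cannot fail with KeyError.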
.findAll() returns a list of all found elements, so:
inputTag is a list (probably containing only one element). Depending on what exactly you want, you should either do:
or use the .find() method, which returns only one (the first) found element:
You can also use this:
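To make the .findAll() versus .find() distinction above concrete, here is a minimal sketch, assuming Python 3 with bs4, where find_all is the current spelling of findAll. The markup is a hypothetical stand-in for the page in the question:

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page fetched in the question.
markup = '<html><body><input name="stainfo" value="example-value"></body></html>'
soup = BeautifulSoup(markup, "html.parser")

# find_all returns a list-like ResultSet, so pick an element first
# and only then look up the attribute on that Tag.
matches = soup.find_all(attrs={"name": "stainfo"})
value_from_list = matches[0]["value"]

# find returns the first matching Tag directly, or None if nothing matches.
# The attrs dict is needed here because the keyword "name" means the tag name.
tag = soup.find(attrs={"name": "stainfo"})
value_from_find = tag["value"] if tag is not None else None

print(value_from_list, value_from_find)
```

Indexing the list with a string, as in the question, is what triggers the TypeError; indexing a single Tag with a string is the supported attribute lookup.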
If you want to retrieve multiple values of attributes from the source above, you can use findAll and a list comprehension to get everything you need:
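One way this could look, as a sketch under assumptions: Python 3 with bs4 (so find_all in place of findAll), and made-up markup in which two input tags share the name "stainfo". The choice of the type and value attributes is illustrative only:

```python
from bs4 import BeautifulSoup

# Hypothetical markup with two matching input tags.
markup = """
<form>
  <input name="stainfo" type="hidden" value="a">
  <input name="stainfo" type="text" value="b">
  <input name="other" type="text" value="c">
</form>
"""

soup = BeautifulSoup(markup, "html.parser")

# Collect several attributes per matching tag in a single pass.
pairs = [
    (tag.get("type"), tag.get("value"))
    for tag in soup.find_all(attrs={"name": "stainfo"})
]

print(pairs)
```

Using get inside the comprehension keeps it from failing if one of the matched tags lacks an attribute; the corresponding slot is simply None.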