Pull Tag Value using BeautifulSoup

Posted 2019-04-29 01:28

Can someone show me how to pull the value of a tag using BeautifulSoup? I read the documentation but had a hard time navigating it. For example, if I had:

<span title="Funstuff" class="thisClass">Fun Text</span>

How would I pull just "Funstuff" using BeautifulSoup/Python?

Edit: I am using version 3.2.1

2 Answers
做自己的国王
#2 · 2019-04-29 02:04

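A tag's children are available via .contents (http://www.crummy.com/software/BeautifulSoup/bs4/doc/#contents-and-children). In your case, you can find the tag by its CSS class and then extract its contents. Note that .contents gives the text inside the tag ("Fun Text"); the title attribute itself is read with tag['title'].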

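from bs4 import BeautifulSoup
# Name the parser explicitly so bs4 does not have to guess one
soup = BeautifulSoup('<span title="Funstuff" class="thisClass">Fun Text</span>', 'html.parser')
# select() returns a list of matching tags; take the first tag's first child
print(soup.select('.thisClass')[0].contents[0])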

http://www.crummy.com/software/BeautifulSoup/bs4/doc/#css-selectors has all the necessary details.
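Since the question asks for the title attribute rather than the text inside the span, here is a minimal sketch of reading both from the same selected tag. It assumes bs4 (not BeautifulSoup 3.2.1, which the question mentions) and Python's built-in "html.parser"; the variable names are only illustrative.

```python
from bs4 import BeautifulSoup

html = '<span title="Funstuff" class="thisClass">Fun Text</span>'
# Assumption: the stdlib-backed "html.parser" is sufficient for this snippet
soup = BeautifulSoup(html, "html.parser")

# select() takes a CSS selector and returns a list of matching tags
tag = soup.select(".thisClass")[0]

text = tag.contents[0]   # first child of the tag: the string inside the span
title = tag["title"]     # attributes are read with dictionary-style access

print(text)
print(title)
```

Indexing with tag["title"] raises KeyError when the attribute is absent, so this form fits best when the attribute is known to exist.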

[Account banned]
#3 · 2019-04-29 02:05

You need to have something to identify the element you're looking for, and it's hard to tell what it is in this question.

For example, both of these will print out 'Funstuff' in BeautifulSoup 3. The first walks down to the span element and reads its title; the second finds the span with the given class and reads its title. Many other valid ways to get to this point are possible.

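# BeautifulSoup 3, which runs on Python 2 only
import BeautifulSoup
soup = BeautifulSoup.BeautifulSoup('<html><body><span title="Funstuff" class="thisClass">Fun Text</span></body></html>')
# Walk down the tree by tag name, then read the attribute
print soup.html.body.span['title']
# Or search for the span by its class attribute
print soup.find('span', {"class": "thisClass"})['title']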
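For readers on Python 3, the same two lookups can be sketched with bs4 instead of BeautifulSoup 3. This is an adaptation rather than the answerer's code: the bs4 import, the explicit "html.parser" argument, and the "data-missing" attribute name are assumptions made for illustration.

```python
from bs4 import BeautifulSoup

html = ('<html><body><span title="Funstuff" class="thisClass">'
        'Fun Text</span></body></html>')
soup = BeautifulSoup(html, "html.parser")

# Walk down the tree by tag name, then read the attribute
by_path = soup.html.body.span["title"]

# Search by class; bs4 uses class_ because "class" is a Python keyword
by_class = soup.find("span", class_="thisClass")["title"]

# .get() returns None instead of raising KeyError for a missing attribute
# ("data-missing" is a made-up attribute name used only for this demo)
missing = soup.find("span").get("data-missing")

print(by_path, by_class, missing)
```

The dictionary form {"class": "thisClass"} from the BeautifulSoup 3 example is also accepted by find() in bs4, so either spelling should work there.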