How to extract values with BeautifulSoup with no c

Posted 2019-08-02 22:48

HTML code:

<td class="_480u">
    <div class="clearfix">
        <div>
            Female
        </div>
    </div>
</td>

I want the value "Female" as the output.

I tried bs.findAll('div',{'class':'clearfix'}) and bs.findAll('tag',{'class':'_480u'}), but these classes appear all over my HTML, so the output is a big list. I want to combine both conditions in one search (a td with class "_480u" that contains a div with class "clearfix") so that the output is just Female. How can I do this?

Thanks

Tags: python parsing python-2.7 html-parsing beautifulsoup

1 answer

叼着烟拽天下

#2 · 2019-08-02 23:35

Use the stripped_strings property, which yields each text fragment under a tag with surrounding whitespace removed:

>>> from bs4 import BeautifulSoup
>>>
>>> html = '''<td class="_480u">
...     <div class="clearfix">
...         <div>
...             Female
...         </div>
...     </div>
... </td>'''
>>> soup = BeautifulSoup(html, 'html.parser')  # name the parser explicitly
>>> print ' '.join(soup.find('div', {'class': 'clearfix'}).stripped_strings)
Female
>>> print ' '.join(soup.find('td', {'class': '_480u'}).stripped_strings)
Female

Or specify the class as an empty string (or None) to match the div that has no class attribute, and use the string property:

>>> soup.find('div', {'class': ''}).string
u'\n            Female\n        '
>>> soup.find('div', {'class': ''}).string.strip()
u'Female'
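Another way to express "a div inside div.clearfix inside td._480u" in a single search is a CSS selector. This is a hedged Python 3 sketch rather than part of the original answer: it assumes a Beautiful Soup version that provides select() and the html.parser backend, and the helper name select_values is hypothetical.

```python
from bs4 import BeautifulSoup

html = '''<td class="_480u">
    <div class="clearfix">
        <div>
            Female
        </div>
    </div>
</td>'''


def select_values(markup):
    """Collect the stripped text of every innermost div that matches."""
    soup = BeautifulSoup(markup, 'html.parser')
    # "td._480u div.clearfix > div" reads right to left as: a <div> that is
    # a direct child of div.clearfix, which sits somewhere inside td._480u.
    nodes = soup.select('td._480u div.clearfix > div')
    return [node.get_text(strip=True) for node in nodes]


print(select_values(html))
```

Because select() returns every match, the same call also works when the page contains several such cells; an empty list means nothing matched.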
