How to extract values with BeautifulSoup with no c

Posted 2019-08-02 22:48

HTML code:

<td class="_480u">
    <div class="clearfix">
        <div>
            Female
        </div>
    </div>
</td>

I want the value "Female" as the output.

I tried bs.findAll('div', {'class': 'clearfix'}) and bs.findAll('tag', {'class': '_480u'}), but these classes appear all over my HTML, so the output is a long list. I want to combine both conditions in a single search (a td with class "..." that contains a div with class "...") so that the output is just Female. How can I do this?

Thanks

1 answer
叼着烟拽天下
#2 · 2019-08-02 23:35

Use the stripped_strings property, which yields every text node under a tag with surrounding whitespace removed:

>>> from bs4 import BeautifulSoup
>>>
>>> html = '''<td class="_480u">
...     <div class="clearfix">
...         <div>
...             Female
...         </div>
...     </div>
... </td>'''
>>> soup = BeautifulSoup(html, 'html.parser')  # name the parser explicitly to avoid the bs4 warning
>>> print(' '.join(soup.find('div', {'class': 'clearfix'}).stripped_strings))
Female
>>> print(' '.join(soup.find('td', {'class': '_480u'}).stripped_strings))
Female

Alternatively, specify the class as an empty string (or None) to match the div that has no class attribute, and use its string property:

>>> soup.find('div', {'class': ''}).string
u'\n            Female\n        '
>>> soup.find('div', {'class': ''}).string.strip()
u'Female'
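
To combine the td class and the div class in one search, as the question asks, a CSS selector or a pair of chained find() calls should work. The following is only a sketch: the sample markup with the extra "other" cell and the stray clearfix div is made up to mimic a page where these classes repeat, and the variable names are illustrative.

```python
from bs4 import BeautifulSoup

# Hypothetical sample: the same classes show up in several places.
html = """
<table><tr>
<td class="_480u"><div class="clearfix"><div>
    Female
</div></div></td>
<td class="other"><div class="clearfix"><div>Unrelated</div></div></td>
</tr></table>
<div class="clearfix"><div>Also unrelated</div></div>
"""

soup = BeautifulSoup(html, "html.parser")

# CSS selector: a div.clearfix that is a direct child of td._480u.
cell = soup.select_one("td._480u > div.clearfix")
value = cell.get_text(strip=True) if cell is not None else None
print(value)

# The same idea with chained find() calls: narrow to the td first,
# then search for the div only inside that td.
td = soup.find("td", class_="_480u")
inner = td.find("div", class_="clearfix") if td is not None else None
value_chained = " ".join(inner.stripped_strings) if inner is not None else None
print(value_chained)
```

If several cells can match, soup.select("td._480u > div.clearfix") returns all of them as a list, and the same get_text(strip=True) call can be applied to each element.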