beautifulsoup findAll find_all

2019-01-05 04:48发布

I would like to parse a html file with Python, and the module I used is beautifulsoup.

After I used it, something strange happened.It is said that the function "find_all" is

same as "findAll", but I've tried both of them. But it is different.

Can anyone tell me the different?

import urllib, urllib2, cookielib
from BeautifulSoup import *
site = "http://share.dmhy.org/topics/list?keyword=TARI+TARI+team_id%3A407"

rqstr = urllib2.Request(site)
rq = urllib2.urlopen(rqstr)
fchData = rq.read()

soup = BeautifulSoup(fchData)

t = soup.findAll('tr')
print t

标签： python xml-parsing html-parsing beautifulsoup

2条回答

Evening l夕情丶

2楼-- · 2019-01-05 05:28

In BeautifulSoup version 4, the methods are exactly the same; the mixed-case versions (findAll, findAllNext, nextSibling, etc.) have all been renamed to conform to the Python style guide, but the old names are still available to make porting easier. See Method Names for a full list.

In new code, you should use the lowercase versions, so find_all, etc.

In your example however, you are using BeautifulSoup version 3 (discontinued since March 2012, don't use it if you can help it), where only findAll() is available. Unknown attribute names (such as .find_all, which only is available in BeautifulSoup 4) are treated as if you are searching for a tag by that name. There is no <find_all> tag in your document, so None is returned for that.

0人赞添加讨论(0) 举报

够拽才男人

3楼-- · 2019-01-05 05:34

from the source code of BeautifulSoup:

http://bazaar.launchpad.net/~leonardr/beautifulsoup/bs4/view/head:/bs4/element.py#L1260

def find_all(self, name=None, attrs={}, recursive=True, text=None,
                 limit=None, **kwargs):
# ...
# ...

findAll = find_all       # BS3
findChildren = find_all  # BS2

0人赞添加讨论(0) 举报

beautifulsoup findAll find_all

采纳回答

编辑标签

举报内容

检举类型

检举原因

检举说明(必填)

打开微信“扫一扫”，打开网页后点击屏幕右上角分享按钮

付费偷看金额在0.1-10元之间