XPath lookup via lxml starts from the root rather than the element

Posted 2019-10-20 11:44

I want to do the same thing I do with Beautiful Soup: find_all elements, then iterate over them and find some other elements inside each one. That is:

import bs4

soup = bs4.BeautifulSoup(source)
# class is a reserved word in Python, so bs4 takes the keyword class_
articles = soup.find_all('div', class_='v-card')
for article in articles:
    name = article.find('span', itemprop='name').text
    address = article.find('p', itemprop='address').text

Now I'm trying to do the same thing with lxml:

from lxml import html

tree = html.fromstring(source)
items = tree.xpath('//div[@class="v-card"]')
for item in items:
    name = item.xpath('//span[@itemprop="name"]/text()')
    address = item.xpath('//p[@itemprop="address"]/text()')

...but this finds all matches in the whole tree, regardless of whether they are under the current item. How should I handle this?

Answer 1:

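Don't prefix the follow-up queries with //. That prefix explicitly asks for the query to start from the document root rather than from the current element. Use .// instead to make the query relative to the current element: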

for item in tree.xpath('//div[@class="v-card"]'):
    name = item.xpath('.//span[@itemprop="name"]/text()')
    address = item.xpath('.//p[@itemprop="address"]/text()')
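
To see the difference side by side, here is a minimal self-contained sketch. The sample markup and the names in it (Alice, Bob) are made up for illustration and are not from the question; only the two XPath forms come from the posts above.

```python
from lxml import html

# Hypothetical sample markup with two cards, invented for this example.
source = """
<html><body>
  <div class="v-card">
    <span itemprop="name">Alice</span>
    <p itemprop="address">1 First St</p>
  </div>
  <div class="v-card">
    <span itemprop="name">Bob</span>
    <p itemprop="address">2 Second Ave</p>
  </div>
</body></html>
"""

tree = html.fromstring(source)
items = tree.xpath('//div[@class="v-card"]')

# Absolute query: '//' starts at the document root, so every item
# returns the names of all cards in the tree.
absolute = [item.xpath('//span[@itemprop="name"]/text()') for item in items]

# Relative query: './/' starts at the current element, so each item
# returns only the name nested inside it.
relative = [item.xpath('.//span[@itemprop="name"]/text()') for item in items]

print(absolute)  # [['Alice', 'Bob'], ['Alice', 'Bob']]
print(relative)  # [['Alice'], ['Bob']]
```

Note that xpath() returns a list even when there is a single match, so with the relative form you would typically take the first element of each result to get the string itself.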

Source: xpath lookup via lxml starting from root rather than element