Replacement of nth-child to nth-of-type gives an u

Posted 2019-08-21 02:41

I am trying to extract some information from the following web page with BeautifulSoup:

url = 'https://web.archive.org/web/20071001215911/http://finance.rambler.ru'

With the help of my browser (Chrome), I copy the selector for the desired element:

selector = 'body > div.fe_global > table:nth-child(6) > tbody > tr > td:nth-child(2) > table > tbody > tr > td.fe_col-left > div:nth-child(5) > table > tbody'

However, bs4 does not support nth-child, so I replace it with nth-of-type:

selector = selector.replace('child', 'of-type')

and apply it to the soup:

import requests
from bs4 import BeautifulSoup

r = requests.get(url)
soup = BeautifulSoup(r.content, 'lxml')
selected_element = soup.select(selector)

print(selected_element)

The output is []. I expected to get some HTML instead. What causes this result? Thank you for your help.
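
One thing worth checking here is that nth-child and nth-of-type are not interchangeable: nth-child counts every sibling element, while nth-of-type counts only siblings with the same tag name. The sketch below illustrates the difference using only the standard library. The markup in SNIPPET and the helper positions() are made up for illustration and are not taken from the page in question.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed markup standing in for part of a page.
SNIPPET = """
<body>
  <div>a</div>
  <table id="t1"></table>
  <div>b</div>
  <table id="t2"></table>
</body>
"""

def positions(parent):
    """Yield (element, nth_child, nth_of_type) for each child, 1-based."""
    seen = {}  # how many children with each tag name have been seen so far
    for child_index, child in enumerate(parent, start=1):
        seen[child.tag] = seen.get(child.tag, 0) + 1
        yield child, child_index, seen[child.tag]

body = ET.fromstring(SNIPPET)
table_positions = {
    el.get("id"): (nth_child, nth_of_type)
    for el, nth_child, nth_of_type in positions(body)
    if el.tag == "table"
}
print(table_positions)
```

In this made-up snippet, the second table is the 4th child of its parent but only the 2nd table, so table:nth-child(4) would match it while table:nth-of-type(4) would match nothing. A blanket replace of 'child' with 'of-type' can therefore turn a matching selector into one that returns an empty list. Recent bs4 releases delegate select() to the soupsieve package, which does understand nth-child, so the replacement may not be needed at all, depending on the installed version.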

1 answer
甜甜的少女心
Post #2 · 2019-08-21 03:07

The selected div contains two tables, and I select the second one:

from bs4 import BeautifulSoup
import requests

url = 'https://web.archive.org/web/20071001215911/http://finance.rambler.ru'
heads = {'User-Agent' : 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:56.0) Gecko/20100101 Firefox/56.0'}
r = requests.get(url, headers=heads)
soup = BeautifulSoup(r.text, 'html.parser')
selected_element = soup.select('div[class="fe_small fe_l2"] table')[1]

print(selected_element)
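
A possible further contributor to the empty result, stated here as an assumption that has not been verified against this particular page: browsers insert tbody elements into the DOM even when the HTML source has none, so a selector copied from Chrome may contain tbody steps that neither lxml nor html.parser will find. A minimal sketch of stripping those steps from a copied selector follows. drop_tbody is a hypothetical helper, not part of bs4, and it only handles selectors built from plain " > " steps like the one in the question.

```python
import re

def drop_tbody(selector):
    """Remove 'tbody' steps from a selector made of '>' child combinators.

    Limitation: splitting on '>' would break selectors whose attribute
    values contain a literal '>' character.
    """
    steps = [step.strip() for step in selector.split(">")]
    kept = [
        step for step in steps
        # Drop a bare 'tbody' or 'tbody:nth-...(N)' step; keep everything else.
        if not re.fullmatch(r"tbody(:[\w-]+\(\d+\))?", step)
    ]
    return " > ".join(kept)

chrome_selector = (
    "body > div.fe_global > table:nth-child(6) > tbody > tr > "
    "td:nth-child(2) > table > tbody > tr > td.fe_col-left > "
    "div:nth-child(5) > table > tbody"
)
print(drop_tbody(chrome_selector))
```

The cleaned selector ends at the table itself rather than at its tbody. Whether it then matches still depends on the actual source markup, which can differ from what the browser's inspector shows.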