Selenium/Python - Extract dynamically generated HT

Posted 2019-07-02 15:57

The web page I am trying to access uses JavaScript to dynamically generate an HTML form (this one: https://imgur.com/a/rhmXB ). When I call print(driver.page_source), the form appears in the HTML that is printed.

However, after I fill in the input field and submit the form, another input field with a CAPTCHA image appears (as shown here: https://imgur.com/a/xVfBS ). When I call print(driver.page_source) again, the form with the CAPTCHA has not been inserted into the HTML.

My question is: how can I use Selenium to access this dynamically generated HTML, which contains the input field and the CAPTCHA image?

Here is my code (also on pastebin: https://pastebin.com/ULSsmbZq ):

from selenium import webdriver
driver = webdriver.Chrome("/var/chromedriver/chromedriver")

URL = 'http://nap.bg/link?id=104'
driver.get(URL)

input_field = driver.find_element_by_name('ipID')
input_field.send_keys('0000000000')
driver.find_element_by_id('idSubmit').click()
print(driver.page_source)

1 answer
叼着烟拽天下
#2 · 2019-07-02 16:48

After you click the button, the page takes some time to load the CAPTCHA and the rest of the content, but your script reads page_source immediately, before that content has been inserted. You need to wait for it to finish loading, which you can do with Selenium's explicit waits.

Here is an example of what you can do:

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By

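driver = webdriver.Chrome()
URL = 'http://nap.bg/link?id=104'
driver.get(URL)

# Fill in the ID field and submit the form.
# find_element(By...) replaces the find_element_by_* helpers,
# which were removed in recent Selenium 4 releases.
input_field = driver.find_element(By.NAME, 'ipID')
input_field.send_keys('0000000000')
driver.find_element(By.ID, 'idSubmit').click()

# Wait up to 10 seconds for the CAPTCHA input to be rendered and clickable.
wait = WebDriverWait(driver, 10)
wait.until(EC.element_to_be_clickable((By.NAME, 'ipResponse')))

# The page source now includes the dynamically inserted CAPTCHA form.
print(driver.page_source)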
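
To make it clearer what an explicit wait does, here is a rough, Selenium-free sketch of the underlying idea: call a condition repeatedly until it returns something truthy or a timeout expires. This is not Selenium's actual implementation; wait_until, WaitTimeout and captcha_field_present are hypothetical names made up for illustration, and the "page" is simulated by a counter so the snippet runs without a browser.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never becomes truthy (hypothetical name)."""


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises WaitTimeout once `timeout` seconds pass.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Simulate content that only "appears" on the third check, the way the
# CAPTCHA field shows up some time after the click (assumption for the demo).
checks = {"count": 0}


def captcha_field_present():
    checks["count"] += 1
    return "ipResponse" if checks["count"] >= 3 else None


found = wait_until(captcha_field_present, timeout=2.0, poll_interval=0.01)
print(found)
```

WebDriverWait(driver, 10).until(...) follows the same pattern: it keeps evaluating the expected condition against the driver (every 0.5 seconds by default) and raises a TimeoutException if the element has not shown up within the 10 seconds.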