Displaying src of image inside an iframe with selenium and Python

Posted 2019-10-29 12:40

I am trying to automatically download images from ShapeNet with Python and Selenium. I am almost there, but the last step eludes me.

from selenium import webdriver
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By


# Optional SOCKS proxy: "yourproxy" and number_of_port are placeholders to replace with real values
profile = webdriver.FirefoxProfile()
profile.set_preference("network.proxy.type", 1)
profile.set_preference("network.proxy.socks", "yourproxy")
profile.set_preference("network.proxy.socks_port", number_of_port)
# browser = webdriver.Firefox(firefox_profile=profile)
browser = webdriver.Firefox()

browser.get('https://www.shapenet.org/taxonomy-viewer')
# The page takes a long time to load
wait = WebDriverWait(browser, 30)
# wait.until returns the element once it is clickable, so no second lookup is needed
linkElem = wait.until(EC.element_to_be_clickable((By.XPATH, "//*[@id='02958343_anchor']")))
linkElem.click()
# The model entry that opens the iframe is also slow to appear
linkElem = wait.until(EC.element_to_be_clickable((By.ID, "model_3dw_bcf0b18a19bce6d91ad107790a9e2d51")))
linkElem.click()
# The iframe is slow to be displayed
wait.until(EC.frame_to_be_available_and_switch_to_it((By.ID, 'viewerIframe')))

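So far so good: we are now inside the iframe. The next lines work, but only because I had to use time.sleep(). That is a bit ugly, but I don't know of any alternative, and it is not the core of my question: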

import time
# The explicit wait below does not work, so time.sleep is used instead
# element = wait.until(EC.element_to_be_clickable((By.XPATH, "/html/body/div[3]/div[3]/h4")))
time.sleep(20)
linkElem = browser.find_element_by_xpath("/html/body/div[3]/div[3]/h4")
linkElem.click()

Now I just want to download one of the images shown in the collapsible menu that my click opened. I found its XPath with the developer tools:

img = browser.find_element_by_xpath("/html/body/div[3]/div[3]/div/div/div/span/img")
src = img.get_attribute('src')

Now the script can reach the img element, but its src is None until I manually click on the web page. Why is that? What am I doing wrong?
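One possible explanation, which nothing in the question confirms, is that the page fills in the src attribute lazily, so it is read before it has been set. Below is a minimal sketch of polling for the attribute instead of reading it once. The names wait_for_src, get_img, timeout and poll are hypothetical; get_img stands for any callable that returns the image element.

```python
import time


def wait_for_src(get_img, timeout=30.0, poll=0.5):
    """Poll until get_img().get_attribute("src") is non-empty, then return it.

    get_img is a hypothetical callable returning the image element, for
    example a lambda wrapping the find_element call from the question.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            src = get_img().get_attribute("src")
        except Exception:
            # The element may not be in the DOM yet; treat that as "not ready".
            src = None
        if src:
            return src
        if time.monotonic() >= deadline:
            raise TimeoutError("img src was still empty after %.1f s" % timeout)
        time.sleep(poll)
```

With the browser from the question, this could be called as src = wait_for_src(lambda: browser.find_element_by_xpath("/html/body/div[3]/div[3]/div/div/div/span/img")). If the attribute only appears after real user interaction, polling alone will time out rather than help.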

PS: the last step is:

import os
os.system("wget %s --no-check-certificate" % src)
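Because src comes from a web page, interpolating it into a shell string can misbehave if the URL contains characters such as & or spaces. Here is a sketch of the same wget call passed as an argument list through subprocess, so no shell parsing is involved. build_wget_command, download and out_dir are hypothetical names; the wget options are the one from the question plus --directory-prefix.

```python
import subprocess


def build_wget_command(src, out_dir="."):
    """Return the wget invocation as an argument list (no shell involved)."""
    return ["wget", "--no-check-certificate", "--directory-prefix", out_dir, src]


def download(src, out_dir="."):
    """Run wget and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_wget_command(src, out_dir), check=True)
```

Calling download(src) after the src lookup would replace the os.system line. It assumes wget is installed and on the PATH, as the original command does.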

Answer 1:

Instead of the xpath("/html/body/div[3]/div[3]/div/div/div/span/img"), you can use the following XPath:

img = browser.find_element_by_xpath("/html/body/div[3]/div[3]//div[@class='searchResult' and @id='image.3dw.bcf0b18a19bce6d91ad107790a9e2d51.0']/img[@class='enlarge']")
src = img.get_attribute('src')
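The image id in this XPath appears to be derived from the model id clicked earlier (model_3dw_bcf0... becomes image.3dw.bcf0....0). Assuming that pattern holds for other models, which the source does not confirm, a small helper could build the expression. image_xpath and index are hypothetical names, and the sketch drops the absolute /html/body/div[3]/div[3] prefix in favour of a document-wide search.

```python
def image_xpath(model_element_id, index=0):
    """Build the image XPath from an id such as 'model_3dw_<hash>'.

    Assumes (unconfirmed) that image ids follow 'image.<source>.<hash>.<index>'.
    """
    prefix = "model_"
    if not model_element_id.startswith(prefix):
        raise ValueError("expected an id starting with 'model_'")
    source, _, model_hash = model_element_id[len(prefix):].partition("_")
    if not source or not model_hash:
        raise ValueError("expected an id of the form 'model_<source>_<hash>'")
    image_id = "image.%s.%s.%d" % (source, model_hash, index)
    return (
        "//div[@class='searchResult' and @id='%s']/img[@class='enlarge']" % image_id
    )
```

For the model in the question, image_xpath("model_3dw_bcf0b18a19bce6d91ad107790a9e2d51") yields the same div and img predicates as the XPath above.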


Source: Displaying src of image inside an iframe with selenium and Python