I am currently trying to use Selenium and BeautifulSoup to retrieve all iframe tags from a website. The problem is that I am not getting all of the iframes: the page contains inner HTML documents that BS4 does not search through, and I don't believe the JavaScript is being executed, so some HTML elements may not be rendered. Is there a web scraping tool that would let me request a URL, retrieve the fully JS-rendered HTML, then search through the DOM and get all tags matching iframe, even those inside the inner HTML documents?
Basically, I can see all the tags I want in the Chrome inspector, but they do not show up in the list returned by BS4's find_all('iframe').
Here is the code I have:
from bs4 import BeautifulSoup
import requests
from selenium import webdriver
browser = webdriver.Chrome('C:/Users/G/chromedriver.exe')
browser.get("https://reddit.com")
HTML = browser.page_source
innerHTML = browser.execute_script("return document.body.innerHTML")
page = BeautifulSoup(innerHTML, 'html.parser')
for iframe in page.find_all('iframe'):
    print(iframe)
browser.close()
You can get all the <iframe> tags exclusively through Selenium with the following code block:
The output on my console is:
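As a minimal sketch of a Selenium-only approach (not necessarily the exact code this answer refers to), the helper below collects the iframes in the current document and then switches into each one to look for nested iframes, which is the part a single BeautifulSoup parse of the outer page cannot see. The names `collect_iframes`, `scrape_iframes` and `max_depth` are hypothetical, and the sketch assumes Selenium 4.6 or later, where `webdriver.Chrome()` can locate a driver binary without an explicit path.

```python
def collect_iframes(driver, depth=0, max_depth=5):
    """Return the outerHTML of every <iframe> reachable from the current frame.

    `max_depth` is an illustrative safety limit on how deep to follow nesting.
    """
    found = []
    # "tag name" is the string value of Selenium's By.TAG_NAME locator.
    for frame in driver.find_elements("tag name", "iframe"):
        found.append(frame.get_attribute("outerHTML"))
        if depth < max_depth:
            driver.switch_to.frame(frame)  # enter the nested document
            try:
                found.extend(collect_iframes(driver, depth + 1, max_depth))
            finally:
                driver.switch_to.parent_frame()  # always step back out
    return found


def scrape_iframes(url):
    """Open `url` in Chrome and return the markup of all iframes found."""
    # Imported here so collect_iframes can be used without a browser installed.
    from selenium import webdriver

    driver = webdriver.Chrome()  # assumption: Selenium 4.6+ finds the driver itself
    try:
        driver.get(url)
        return collect_iframes(driver)
    finally:
        driver.quit()
```

Calling `scrape_iframes("https://reddit.com")` and printing each returned string would list the iframes in document order, with nested ones following their parent. Iframes that a page injects some time after load may still be missed unless the call is preceded by an explicit wait.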