How to get a specific frame in a web page and retr

Posted 2019-04-02 15:06

Question:

I want to access the translation results for the following URL:

http://translate.google.com/translate?hl=en&sl=en&tl=ar&u=http%3A%2F%2Fwww.saltycrane.com%2Fblog%2F2008%2F10%2Fhow-escape-percent-encode-url-python%2F

The page has two frames, and the translation is displayed in the bottom one, the content frame. I am only interested in retrieving that bottom frame, so that I can get the translation.

Selenium for Python lets us fetch page contents via web automation:

browser.get('http://translate.google.com/#en/ar/'+hurl)
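The `hurl` value in the line above is not defined in the post. As a minimal sketch, assuming it is meant to carry the percent-encoded address of the page to translate, the full query URL quoted at the top of the question could be assembled with the standard library like this (`build_translate_url` is a hypothetical helper name, not part of Selenium or of the original post):

```python
from urllib.parse import urlencode


def build_translate_url(page_url, source_lang="en", target_lang="ar"):
    """Return a translate URL shaped like the one quoted in the question."""
    # urlencode percent-encodes each value, so ':' becomes %3A and '/' becomes %2F.
    query = urlencode(
        {"hl": "en", "sl": source_lang, "tl": target_lang, "u": page_url}
    )
    return "http://translate.google.com/translate?" + query


url = build_translate_url(
    "http://www.saltycrane.com/blog/2008/10/how-escape-percent-encode-url-python/"
)
print(url)
```

The resulting string can then be passed to `browser.get(url)`.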

The required frame is an iframe:

<div id="contentframe" style="top:160px"><iframe   src="/translate_p?hl=en&am... name=c frameborder="0" style="height:100%;width:100%;position:absolute;top:0px;bottom:0px;"></iframe></div>

But how can I get the bottom content frame element and retrieve the translation using web automation?

I also learned that PyQuery lets us browse page contents using jQuery-style syntax.

Update:

An answer mentioned that Selenium provides a method where you can do that.

frame = browser.find_element_by_tag_name('iframe')
browser.switch_to_frame(frame)
# get page source
browser.page_source

But it does not work for the example above: it returns an empty page.

Answer 1:

You can use driver.switchTo().frame(1); the number passed to frame() is the index of the frame within the web page. Since you need to switch to the second frame and the index starts at 0, you should use driver.switchTo().frame(1);

The code above is Java. In Python, you can use the line below (newer Selenium releases spell it driver.switch_to.frame(1)):

driver.switch_to_frame(1)

UPDATE

 driver.get("http://translate.google.com/translate?hl=en&sl=en&tl=ar&u=http://www.saltycrane.com/blog/2008/10/how-escape-percent-encode-url-python/");
 driver.switchTo().frame(0);
 System.out.println(driver.findElement(By.xpath("/html/body/div/div/div[3]/h1/span/a")).getText());

Output: SaltyCrane ???????

I just tried to print the title, SaltyCrane, which is inside the iframe. It worked for me, except for the ? symbols after SaltyCrane: that part of the title is Arabic, and it could not be decoded for output.

The code above is Java. The same logic should also work in Python.
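A rough Python counterpart of the Java snippet might look like the sketch below. It assumes a Selenium release that exposes `driver.switch_to.frame(...)` and `driver.find_element(by, value)`. The helper name `frame_element_text` is hypothetical, and the locator is passed as the plain string `"xpath"` (the value of Selenium's `By.XPATH` constant) so the helper itself needs no imports.

```python
def frame_element_text(driver, frame_index, xpath):
    """Switch into the frame at `frame_index` and return the text of `xpath`."""
    # Frames are addressed by zero-based index, as in the Java call above.
    driver.switch_to.frame(frame_index)
    try:
        # "xpath" is the string value of selenium's By.XPATH locator constant.
        return driver.find_element("xpath", xpath).text
    finally:
        # Always return to the top-level document so later lookups behave normally.
        driver.switch_to.default_content()
```

With a real browser session, this would be called after `driver.get(...)`, for example as `frame_element_text(driver, 0, "/html/body/div/div/div[3]/h1/span/a")`, mirroring the index and XPath used in the Java code.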



Answer 2:

Selenium provides a method where you can do that.

frame = browser.find_element_by_tag_name('iframe')
browser.switch_to_frame(frame)
# get page source
browser.page_source
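The question reports that this approach returned an empty page. One possible explanation, which is an assumption and not something the post confirms, is that `page_source` was read before the iframe's document had finished loading. A minimal sketch of a workaround is to poll until the frame's body has some content. `frame_source_when_ready` is a hypothetical helper. It relies only on `switch_to.frame`, `switch_to.default_content`, `execute_script` and `page_source`, which Selenium's Python WebDriver provides.

```python
import time


def frame_source_when_ready(driver, frame, timeout=10.0, poll=0.5):
    """Switch into `frame`, wait until its <body> is non-empty, return its HTML.

    `frame` may be an index, a name/id string, or an iframe element.
    """
    driver.switch_to.frame(frame)
    try:
        deadline = time.monotonic() + timeout
        while True:
            # Ask the browser how much markup the current frame's body holds.
            size = driver.execute_script(
                "return document.body ? document.body.innerHTML.length : 0;"
            )
            if size:
                return driver.page_source
            if time.monotonic() >= deadline:
                raise TimeoutError("frame body still empty after %.1fs" % timeout)
            time.sleep(poll)
    finally:
        # Leave the driver focused on the top-level document again.
        driver.switch_to.default_content()
```

Selenium also ships `WebDriverWait` together with `expected_conditions.frame_to_be_available_and_switch_to_it`, which covers the "wait for the frame to exist" part of the same idea. If the content sits in a frame nested inside the first one, a second `switch_to.frame` call would be needed before reading the source.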