I am using jsoup to read a web page with the following function:
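import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public Document getDocument(String url) {
    try {
        // Fetch and parse the page: 20-second timeout, generic user agent.
        return Jsoup.connect(url).timeout(20 * 1000).userAgent("Mozilla").get();
    } catch (Exception e) {
        // Any failure (network error, bad HTTP status, timeout) is reported as null.
        return null;
    }
}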
But whenever I try to read a web page that contains JavaScript-generated content, jsoup does not read that content. That is, the actual content of the page is loaded by JavaScript calls, so it is not present in the page source of that link. For example, this blog: http://blog.rapporter.net/search/label/r. Is there a way to also get JavaScript-generated content when parsing a page with jsoup? If not, please suggest a Java HTML parser that can solve this problem.
You cannot do this with Jsoup. Jsoup only parses HTML; it does not execute scripts. To wait for AJAX requests, or for JavaScript-generated content in general, you would need a browser that can execute the JavaScript and produce the resulting output. JavaScript logic can be complex, so executing it and loading the content is not a trivial thing (just take a look at how complicated browsers, JS and the DOM are).
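To illustrate the "you need a browser" point, here is a minimal sketch that asks a headless Chrome/Chromium to render the page and print the resulting DOM, using only the Java standard library (Java 9+). This is one possible approach, not the only one. The class name RenderedPageFetcher and its methods are hypothetical, and the sketch assumes a Chrome or Chromium binary is installed and supports the --headless and --dump-dom flags. Content that arrives through late AJAX calls may still be missing, since the browser decides when to dump the DOM.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper: delegates JavaScript execution to an installed headless browser.
final class RenderedPageFetcher {

    // Builds the command line for a headless run that prints the rendered DOM to stdout.
    // browserBinary is an assumption, e.g. "chromium" or "google-chrome" on the PATH.
    static List<String> buildCommand(String browserBinary, String url) {
        return List.of(browserBinary, "--headless", "--dump-dom", url);
    }

    // Runs the browser and returns the HTML it printed after executing the page's scripts.
    static String fetchRenderedHtml(String browserBinary, String url)
            throws IOException, InterruptedException {
        Process process = new ProcessBuilder(buildCommand(browserBinary, url))
                .redirectError(ProcessBuilder.Redirect.DISCARD) // ignore browser log noise
                .start();
        String html = new String(process.getInputStream().readAllBytes(),
                StandardCharsets.UTF_8);
        int exitCode = process.waitFor();
        if (exitCode != 0) {
            throw new IOException("Browser exited with code " + exitCode);
        }
        return html;
    }
}
```

The returned string can then be handed to jsoup with Jsoup.parse(html, url), so the existing selector code keeps working on the rendered markup, for example with RenderedPageFetcher.fetchRenderedHtml("chromium", "http://blog.rapporter.net/search/label/r").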