I want to access this webpage: https://www.google.com/trends/explore#q=ice%20cream and extract the data shown in the center line graph. The HTML is as follows (I have pasted only the part that I use):
<div class="center-col">
<div class="comparison-summary-title-line">...</div>
...
<div id="reportContent" class="report-content">
<!-- This tag handles the report titles component -->
...
<div id="report">
<div id="reportMain">
<div class="timeSection">
<div class = "primaryBand timeBand">...</div>
...
<div aria-lable = "one-chart" style = "position: absolute; ...">
<svg ....>
...
<script type="text/javascript">
var chartData = {...}
The data I need is stored in the script element (the last line). My idea is to get the element with class "report-content" first and then select the script inside it. My code is as follows:
String html = "https://www.google.com/trends/explore#q=ice%20cream";
Document doc = Jsoup.connect(html).get();
Elements center = doc.getElementsByClass("center-col");
Elements report = doc.getElementsByClass("report-content");
System.out.println(center);
System.out.println(report);
When I print "center", I get the content of all its child elements except "report-content", and when I print "report", the result is only:
<div id="reportContent" Class="report-content"></div>
I also tried this:
Element report = doc.select("div.report-content").first();
but it still does not work. How can I get the data in the script? I appreciate your help!
Try getting the same element by id; you should get the complete tag.
Try this URL instead:
where ${keywords} is an encoded, space-separated keyword list and ${timezone} is an encoded timezone in the Etc/GMT* form.
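As a rough illustration of filling in the two placeholders, here is a minimal sketch using only the JDK. The template string, the class name TrendsUrlTemplate, and the helper names encode and fill are all hypothetical and stand in for whatever URL is actually used; the sketch also assumes spaces should be sent as %20, as in the question's URL.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class TrendsUrlTemplate {
    // Percent-encode a value for use in a query string. URLEncoder emits '+'
    // for spaces, so those are swapped for %20 to match "ice%20cream".
    // A literal '+' in the input is already encoded as %2B at that point.
    public static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8).replace("+", "%20");
    }

    // Substitute the two placeholders in a caller-supplied template.
    public static String fill(String template, String keywords, String timezone) {
        return template
                .replace("${keywords}", encode(keywords))
                .replace("${timezone}", encode(timezone));
    }

    public static void main(String[] args) {
        // Hypothetical template for illustration only; not a real endpoint.
        String template = "https://example.invalid/fetch?q=${keywords}&tz=${timezone}";
        System.out.println(fill(template, "ice cream", "Etc/GMT+7"));
    }
}
```

Note that the timezone needs encoding too, because both the slash and the plus sign in a value such as Etc/GMT+7 are significant inside a query string.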
SAMPLE CODE
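A minimal sketch of one way to pull the chartData assignment out of a script's text once that text is available. It uses only the JDK; with Jsoup, the text would typically come from calling data() on the selected script element. The class name ChartDataExtractor and the method extract are hypothetical, and the sketch assumes the script declares the object exactly as "var chartData = {...}", as in the question.

```java
import java.util.Optional;

public class ChartDataExtractor {
    // Returns the object literal assigned to "var chartData", or empty if absent.
    // Braces are counted instead of using a regex so nested objects survive,
    // and braces inside quoted strings are ignored.
    public static Optional<String> extract(String scriptText) {
        int decl = scriptText.indexOf("var chartData");
        if (decl < 0) return Optional.empty();
        int start = scriptText.indexOf('{', decl);
        if (start < 0) return Optional.empty();

        int depth = 0;
        boolean inString = false;
        char quote = 0;
        for (int i = start; i < scriptText.length(); i++) {
            char c = scriptText.charAt(i);
            if (inString) {
                if (c == '\\') i++;                 // skip the escaped character
                else if (c == quote) inString = false;
            } else if (c == '"' || c == '\'') {
                inString = true;
                quote = c;
            } else if (c == '{') {
                depth++;
            } else if (c == '}' && --depth == 0) {
                return Optional.of(scriptText.substring(start, i + 1));
            }
        }
        return Optional.empty();                    // unbalanced braces
    }

    public static void main(String[] args) {
        String script = "var chartData = {\"rows\": [{\"v\": 1}, {\"v\": \"}\"}]};";
        System.out.println(extract(script).orElse("not found"));
    }
}
```

The returned string can then be handed to a JSON parser if the literal is valid JSON, which is not guaranteed for arbitrary JavaScript objects.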
References:
<script> element with Jsoup?