I am trying to read all <div>s with class="listitem show-in-category" from an HTML page. Furthermore, I want to get the itemprop fields like itemprop="description" and itemprop="image" under each <div>. An example of the <div> is below:
<div class="listitem show-in-category">
<img itemprop="image" alt="For the Trumpets Shall Sound" class="poster pull-left" src=
"/images/shows/rectangle-poster/resized/188x282/4423-1390493634-forthetrumpets-rec.jpg"
title="For the Trumpets Shall Sound">
<h2 itemprop="name"><a href=
"http://www.londonboxoffice.co.uk/for-the-trumpets-shall-sound-tickets">
For the Trumpets Shall Sound</a></h2>
<p itemprop="description"><p>Ruth is clearing out her Mother's attic, with the help of her son
Jamie, when they make an interesting discovery.<br>
<br></p>
</div>
On the JSOUP selector page, it is stated that I can access all divs by class name as:
Elements mydesiredclass = doc.select("div.class")
1) However, for the class name above this does not work, probably since the class name has spaces? What syntax should I use to get all divs with the given class name?
2) Also, once I manage to get all divs and am looping through them, how can I get their description and img properties?
The spaces in the class name are actually separators between class names. So the div is part of two classes, namely listitem and show-in-category.
If you must select only elements that match both classes, chain the class selectors: div.listitem.show-in-category
The dot followed by a class name is the CSS selector for a class. Class selectors can be concatenated, and each one added becomes a further requirement that the element must match.
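As a minimal sketch of the concatenated class selector described above (the class name SelectByTwoClasses, the helper selectListItems and the sample HTML are made up for illustration; only Jsoup.parse and select are Jsoup calls):

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class SelectByTwoClasses {
    // Hypothetical helper: returns every <div> that carries BOTH classes.
    static Elements selectListItems(Document doc) {
        return doc.select("div.listitem.show-in-category");
    }

    public static void main(String[] args) {
        // Illustrative markup: only the first and third div have both classes.
        String html =
              "<div class=\"listitem show-in-category\">A</div>"
            + "<div class=\"listitem\">B</div>"
            + "<div class=\"show-in-category listitem extra\">C</div>";
        Document doc = Jsoup.parse(html);
        for (Element div : selectListItems(doc)) {
            System.out.println(div.text()); // prints A, then C
        }
    }
}
```

The order of the class names in the attribute does not matter, and extra classes on the element do not prevent a match.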
About your second question: Jsoup can easily get the img element for you. Suppose that myDivEl is one of your divs; you can then select the img element inside it.
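A rough sketch of that lookup, assuming myDivEl is an org.jsoup.nodes.Element obtained from the div selection (the class FirstImage, the helper firstImage and the shortened src path are placeholders, not taken from the question):

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;

public class FirstImage {
    // Hypothetical helper: first <img> inside the div, or null if there is none.
    static Element firstImage(Element myDivEl) {
        return myDivEl.select("img").first();
    }

    public static void main(String[] args) {
        // Shortened, illustrative version of the div from the question.
        String html =
              "<div class=\"listitem show-in-category\">"
            + "<img itemprop=\"image\" alt=\"For the Trumpets Shall Sound\""
            + " src=\"/images/poster.jpg\">"
            + "</div>";
        Element myDivEl = Jsoup.parse(html)
                               .select("div.listitem.show-in-category")
                               .first();
        Element img = firstImage(myDivEl);
        if (img != null) {
            System.out.println(img.attr("src")); // the image path
            System.out.println(img.attr("alt")); // the alternative text
        }
    }
}
```

The null check matters because first() returns null when the div contains no img at all.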
UPDATE after question was edited to point out what the OP wants:
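A minimal sketch of reading the itemprop fields asked about in the question, assuming an attribute selector of the form [itemprop=...] is acceptable (the class ItemPropFields, its helper methods and the simplified sample markup are illustrative assumptions):

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;

public class ItemPropFields {
    // Text of the element marked itemprop="description" ("" if absent).
    static String description(Element myDivEl) {
        return myDivEl.select("[itemprop=description]").text();
    }

    // Text of the element marked itemprop="name" ("" if absent).
    static String name(Element myDivEl) {
        return myDivEl.select("[itemprop=name]").text();
    }

    // src attribute of the <img> marked itemprop="image" ("" if absent).
    static String imageSrc(Element myDivEl) {
        return myDivEl.select("img[itemprop=image]").attr("src");
    }

    public static void main(String[] args) {
        // Simplified sample: the description text sits directly in the <p>.
        String html =
              "<div class=\"listitem show-in-category\">"
            + "<img itemprop=\"image\" src=\"/images/poster.jpg\">"
            + "<h2 itemprop=\"name\"><a href=\"#\">For the Trumpets Shall Sound</a></h2>"
            + "<p itemprop=\"description\">Ruth is clearing out her Mother's attic.</p>"
            + "</div>";
        for (Element myDivEl : Jsoup.parse(html).select("div.listitem.show-in-category")) {
            System.out.println(name(myDivEl));
            System.out.println(description(myDivEl));
            System.out.println(imageSrc(myDivEl));
        }
    }
}
```

One caveat: in the question's markup the description text is inside a second <p> nested in the itemprop one. HTML parsers generally close the outer paragraph when the inner one opens, so for that exact markup the text may end up in the following sibling and the description selector could return an empty string.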