I want to get only the URL:
http://tamilblog.ishafoundation.org/nalvazhvu/vazhkai/
and not the whole anchor element:
<a href="http://tamilblog.ishafoundation.org/nalvazhvu/vazhkai/"></a>
I just want to apply this inside my loop over section:
import java.io.IOException;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class NewClassssssss {
    public static void main(String[] args) throws IOException {
        Document doc = Jsoup.connect("http://tamilblog.ishafoundation.org/page/3//").get();
        Elements section = doc.select("section#content");
        Elements article = section.select("article");
        Elements links = doc.select("a[href]");
        for (Element a : section) {
            // System.out.println("Title : \n" + a.select("a").text());
            System.out.println(a.select("a[href]"));
        }
        System.out.println(links);
    }
}
There are some problems in the code:
1. Invalid search scope
Elements links = doc.select("a[href]");
The above line gets all links from the whole document instead of the articles only.
2. Invalid node used in loop
for (Element a : section) {
// ...
}
The above for loop iterates over the sections instead of the links.
3. Repetitive calls to the select method
Elements section = doc.select("section#content");
Elements article = section.select("article");
Elements links = doc.select("a[href]");
It's not necessary to perform a selection for each node in the hierarchy. Jsoup can navigate through it for you. Those three lines can be replaced with one line:
Elements links = doc.select("section#content article a");
SAMPLE CODE
Here is sample code that combines the three points above:
Document doc = Jsoup.connect("http://tamilblog.ishafoundation.org/nalvazhvu/vazhkai/").get();
for (Element a : doc.select("section#content article a")) {
    System.out.println("Title : \n" + a.text());
    // absUrl resolves the href against the document's base URI,
    // so the printed URL is absolute even if the href is relative.
    System.out.println(a.absUrl("href"));
}
OUTPUT
Title :
http://tamilblog.ishafoundation.org/kalyana-parisaga-isha-kaattupoo/
Title :
இதயம் பேசுகிறது
http://tamilblog.ishafoundation.org/isha-pakkam/idhyam-pesugiradhu/
Title :
வாழ்க்கை
http://tamilblog.ishafoundation.org/nalvazhvu/vazhkai/
Title :
கல்யாணப் பரிசாக ஈஷா காட்டுப்பூ…
http://tamilblog.ishafoundation.org/kalyana-parisaga-isha-kaattupoo/
... (truncated for brevity)
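To make the absUrl comment in the sample code more concrete, here is a minimal sketch of what "resolving an href against a base URL" means. It uses only java.net.URI from the standard library and is not Jsoup's actual implementation. The class name AbsUrlSketch and the helper resolve are hypothetical names chosen for illustration, and the base URL is assumed to be the page the link was found on.

```java
import java.net.URI;

public class AbsUrlSketch {
    // Hypothetical helper (not part of Jsoup): resolves an href against
    // the URL of the page it was found on and returns the result as text.
    public static String resolve(String baseUrl, String href) {
        return URI.create(baseUrl).resolve(href).toString();
    }

    public static void main(String[] args) {
        // Assumed base URL: the page the links were scraped from.
        String base = "http://tamilblog.ishafoundation.org/page/3/";

        // A root-relative href is completed with the scheme and host of the base.
        System.out.println(resolve(base, "/nalvazhvu/vazhkai/"));

        // An href that is already absolute is returned unchanged.
        System.out.println(resolve(base, "http://tamilblog.ishafoundation.org/nalvazhvu/vazhkai/"));
    }
}
```

Both calls in main should yield the same absolute URL, which is why printing a.absUrl("href") gives consistent results whether the page uses relative or absolute links.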