I want to search through an HTML file and get the URL of an image on that page. This URL should then be saved as a string, that's all. The problem is that I really don't know how to start.
My app of course knows the URL of the page where the image is located.
As an example, let's take this URL:
On this page I need the URL of the big image as a string. When I view the source code I can locate the URL, but I don't know how to code that. This is the URL I need:
(the text within the quotation marks only).
Use jsoup. It's an HTML parser that lets you access DOM elements using CSS selectors (like jQuery).
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

// Parse your HTML, either
// 1. from a string:
Document doc = Jsoup.parse(htmlAsString);
// 2. or from a URL (throws IOException):
// Document doc = Jsoup.connect("http://my.awesome.site.com/").get();

// Then select the images inside it:
Elements images = doc.select("img");

// Then iterate:
for (Element el : images) {
    String imageUrl = el.attr("src");
    // TODO: Do something with the URL
}
Take a look at the jsoup HTML parser. There is a relevant answer on SO that explains the basic usage of jsoup: https://stackoverflow.com/a/5318771/1321873
Okay, this did the job :) I am getting the image URL now:
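import java.io.IOException;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class jSoupEx {
    public static void main(String[] args) {
        try {
            // Fetch and parse the page
            Document doc = Jsoup.connect("http://***/index.php/Datei:***.jpg").get();
            // first() returns null if the page has no <img> element
            Element image = doc.select("img").first();
            if (image != null) {
                // absUrl resolves a relative src against the page URL
                String url = image.absUrl("src");
                System.out.println(url);
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}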
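A side note on why absUrl("src") is handy here compared to attr("src"): attr returns the attribute exactly as it is written in the HTML, which is often a relative path, while absUrl resolves it against the URL the document was loaded from. The sketch below is not jsoup code. It only illustrates the same kind of resolution using java.net.URI from the standard library, and the class name, method name, and example.com URLs are made up for illustration.

```java
import java.net.URI;

public class AbsUrlSketch {

    // Resolve a src value (relative or absolute) against the page URL.
    // Hypothetical helper, roughly the idea behind absUrl("src").
    static String resolve(String pageUrl, String src) {
        return URI.create(pageUrl).resolve(src).toString();
    }

    public static void main(String[] args) {
        // Made-up page URL, only for demonstration
        String page = "http://example.com/wiki/page.html";

        // Root-relative path: replaces the whole path of the page URL
        System.out.println(resolve(page, "/images/photo.jpg"));
        // Path relative to the page's directory
        System.out.println(resolve(page, "thumb/photo.jpg"));
        // Already absolute: stays unchanged
        System.out.println(resolve(page, "http://cdn.example.com/photo.jpg"));
    }
}
```

As far as I know, this is also why absUrl needs a base URI: if you parse from a plain string with Jsoup.parse(htmlAsString), there is no page URL to resolve against, so you would have to pass one in as the second argument to parse.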