I am writing a web crawler tool in Java. When I type a website name, how can I make it connect to that site over HTTP or HTTPS without having to specify the protocol myself?
try {
    Jsoup.connect("google.com").get();
} catch (IOException ex) {
    Logger.getLogger(LinkGUI.class.getName()).log(Level.SEVERE, null, ex);
}
But I get the error:
java.lang.IllegalArgumentException: Malformed URL: google.com
What can I do? Are there any classes or libraries that do this?
For context: I have a list of 165 courses, each with 65 to 71 HTML pages containing links throughout. I am writing a Java program to test whether each link is broken.
You can write your own simple method to try both protocols, like:
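A minimal sketch of such a method, assuming the standard-library HttpURLConnection for the probe so that it has no dependencies (a Jsoup.connect(...).get() call inside the same try block would serve the same purpose). The class name ProtocolProbe, the helper withScheme(), and the 5-second timeouts are illustrative choices, not anything fixed by Jsoup:

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;

final class ProtocolProbe {

    // Builds a full URL from a bare host such as "google.com".
    static String withScheme(String host, boolean https) {
        return (https ? "https://" : "http://") + host;
    }

    // Returns true if an HTTPS request to the host gets any HTTP response.
    // Any failure (no TLS listener, bad certificate, timeout, malformed host)
    // is treated as "no HTTPS", so the caller can fall back to http://.
    static boolean usesHttps(String host) {
        HttpURLConnection conn = null;
        try {
            conn = (HttpURLConnection) URI.create(withScheme(host, true))
                    .toURL().openConnection();
            conn.setRequestMethod("HEAD");   // headers only, no body download
            conn.setConnectTimeout(5000);    // assumed timeout, in milliseconds
            conn.setReadTimeout(5000);
            conn.getResponseCode();          // forces the connection and TLS handshake
            return true;
        } catch (IOException | IllegalArgumentException ex) {
            return false;
        } finally {
            if (conn != null) {
                conn.disconnect();
            }
        }
    }
}
```

Note that this sketch prefers HTTPS whenever the server answers on it at all, whatever the status code; a stricter check could also inspect the response code.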
Then, your original code can be:
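One way this could look, as a sketch: the scheme is chosen once and the full URL is then handed to Jsoup. The class name CrawlerUrls and the method resolve() are hypothetical, and the probe is passed in as a Predicate so the snippet compiles without Jsoup on the classpath; in the crawler you would pass your own usesHttps() method.

```java
import java.util.function.Predicate;

final class CrawlerUrls {

    // Prefixes a bare host with the scheme reported by the probe.
    // httpsProbe stands in for the usesHttps() helper.
    static String resolve(String host, Predicate<String> httpsProbe) {
        return (httpsProbe.test(host) ? "https://" : "http://") + host;
    }
}

// In the crawler, with Jsoup available, the original try block would become:
//
//   try {
//       String url = CrawlerUrls.resolve("google.com", h -> usesHttps(h));
//       Jsoup.connect(url).get();
//   } catch (IOException ex) {
//       Logger.getLogger(LinkGUI.class.getName()).log(Level.SEVERE, null, ex);
//   }
```

Because the URL now always carries a scheme, the "Malformed URL" IllegalArgumentException from the question no longer applies.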
Note: you should only call the usesHttps() method once per URL, to figure out which protocol to use. After you know that, you should connect using Jsoup.connect() directly. This is more efficient, because it avoids an extra probe request before every fetch.
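A sketch of how the result could be remembered so the probe runs only once. It caches per host rather than per URL, which is an assumption that all pages on a host share the same protocol; the class name SchemeCache and the method urlFor() are hypothetical, and the probe is again passed in as a Predicate standing in for usesHttps().

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Predicate;

final class SchemeCache {

    private final Map<String, String> schemeByHost = new ConcurrentHashMap<>();
    private final Predicate<String> httpsProbe;

    SchemeCache(Predicate<String> httpsProbe) {
        this.httpsProbe = httpsProbe;
    }

    // Probes a host the first time it is seen, then reuses the stored scheme.
    String urlFor(String host, String path) {
        String scheme = schemeByHost.computeIfAbsent(
                host, h -> httpsProbe.test(h) ? "https://" : "http://");
        return scheme + host + path;
    }
}
```

With 165 courses of 65 to 71 pages each, many links are likely to point at the same few hosts, so a cache like this would keep the number of probe requests small.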