I want to get the HTML source code of https://www2.cslb.ca.gov/OnlineServices/CheckLicenseII/LicenseDetail.aspx?LicNum=872423
I am using the method below, but it does not return the HTML source.
    public static String getHTML(URL url) {
        HttpURLConnection conn; // The actual connection to the web page
        BufferedReader rd;      // Used to read results from the web page
        String line;            // An individual line of the web page HTML
        String result = "";     // A long string containing all the HTML
        try {
            conn = (HttpURLConnection) url.openConnection();
            conn.setRequestMethod("GET");
            rd = new BufferedReader(new InputStreamReader(conn.getInputStream()));
            while ((line = rd.readLine()) != null) {
                result += line;
            }
            rd.close();
        } catch (Exception e) {
            e.printStackTrace();
        }
        return result;
    }
The server filters out Java's default User-Agent header, so the request has to send a different one.
It looks like certain user agents are blacklisted. By default, the JDK sends a User-Agent of the form Java/<version>.
Note that I'm using the IOUtils class to simplify the example, but the key thing is to set the User-Agent request property on the connection before reading the response.
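A minimal sketch of that fix, using only the JDK rather than IOUtils. The class name HtmlFetcher, the helper open, and the "Mozilla/5.0" value are assumptions made for illustration; which User-Agent strings this particular server accepts is not stated in the thread, and the UTF-8 charset is also an assumption.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the name is not from the original post.
final class HtmlFetcher {
    // Assumption: a browser-like value. The exact string the server accepts is unknown.
    static final String USER_AGENT = "Mozilla/5.0";

    // Prepares a GET connection with an explicit User-Agent.
    // Nothing is sent over the network until the response is read.
    static HttpURLConnection open(URL url, String userAgent) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("GET");
        // Replaces the JDK default (Java/<version>) that the server appears to reject.
        conn.setRequestProperty("User-Agent", userAgent);
        return conn;
    }

    // Reads the whole response body as text, keeping line breaks.
    static String getHTML(URL url) throws IOException {
        HttpURLConnection conn = open(url, USER_AGENT);
        StringBuilder result = new StringBuilder();
        // Assumption: the page is UTF-8 encoded.
        try (BufferedReader rd = new BufferedReader(
                new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = rd.readLine()) != null) {
                result.append(line).append('\n');
            }
        } finally {
            conn.disconnect();
        }
        return result.toString();
    }
}
```

Calling HtmlFetcher.getHTML with the license detail URL from the question would then send the browser-like header. Unlike the original method, this version lets IOException propagate instead of returning an empty string, so a rejected request is visible to the caller.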