Can't read data from URL due to Cloudflare

Posted 2020-03-30 03:29

Question:

Whenever I run my code, I get this:

Exception in thread "main" java.io.IOException: Server returned HTTP response code: 403 for URL: the link
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(Unknown Source)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(Unknown Source)
    at java.net.URL.openStream(Unknown Source)
    at readdata.aaa.main(aaa.java:15)

My code is:

package readdata;

import java.net.*;
import java.io.*;
import java.util.regex.Pattern;
import java.util.regex.Matcher;

public class aaa 
{
    public static void main(String[] args) throws Exception {

        URL oracle = new URL(" the link ");
        BufferedReader in = new BufferedReader(
                new InputStreamReader(oracle.openStream()));

        String inputLine;
        StringBuilder a = new StringBuilder();
        while ((inputLine = in.readLine()) != null)
            a.append(inputLine);
        in.close();


        int i = 0;
        Pattern p = Pattern.compile("Open");
        Matcher m = p.matcher( a );
        while (m.find()) {
            i++;
            System.out.println(i);
        }
    }

}

Is there any way I can get past Cloudflare in order to read the data from the URL?

Answer 1:

Before

URL oracle = new URL(" the link ");

insert:

System.setProperty("http.agent", "Chrome");

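That's probably because Cloudflare rejects requests from unrecognized user agents. By default, Java identifies itself with a User-Agent of the form Java/<version>; setting the http.agent property makes the User-Agent header begin with Chrome instead, which Cloudflare recognizes.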
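
As an alternative to the JVM-wide http.agent property, the header can also be set on a single connection with URLConnection.setRequestProperty. The following is a minimal sketch of that approach, not a definitive fix: the class name UserAgentFetch, the helper methods, and the USER_AGENT value are hypothetical, the page is assumed to be UTF-8, and whether a given site accepts any particular User-Agent depends on how its Cloudflare protection is configured. Unlike the original code, it prints only the final count of "Open".

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class UserAgentFetch {

    // Hypothetical browser-like value; replace with whatever the target site accepts.
    static final String USER_AGENT = "Mozilla/5.0 (compatible; ExampleFetcher/1.0)";

    // Creates the connection object and sets the User-Agent header on it.
    // No network traffic happens here; the request is sent when the stream is opened.
    static URLConnection openWithUserAgent(URL url, String userAgent) throws IOException {
        URLConnection connection = url.openConnection();
        connection.setRequestProperty("User-Agent", userAgent);
        return connection;
    }

    // Reads the whole response body as text (UTF-8 is an assumption).
    static String readAll(URLConnection connection) throws IOException {
        StringBuilder body = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                body.append(line).append('\n');
            }
        }
        return body.toString();
    }

    // Counts non-overlapping occurrences of a literal string in the text.
    static int countOccurrences(String text, String literal) {
        Matcher matcher = Pattern.compile(Pattern.quote(literal)).matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java UserAgentFetch <url>");
            return;
        }
        URL url = URI.create(args[0]).toURL();
        String body = readAll(openWithUserAgent(url, USER_AGENT));
        System.out.println(countOccurrences(body, "Open"));
    }
}
```

Setting the header per connection leaves other HTTP requests in the same JVM unaffected, which may be preferable when only one site needs the override.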