Read a URL into a string in a few lines of Java code

Posted 2019-01-01 15:11

Question:

I'm trying to find Java's equivalent to Groovy's:

String content = "http://www.google.com".toURL().getText();

I want to read content from a URL into a string. I don't want to pollute my code with buffered streams and loops for such a simple task. I looked into Apache's HttpClient, but I don't see a one- or two-line implementation there either.

Answer 1:

Now that some time has passed since the original answer was accepted, there's a better approach:

String out = new Scanner(new URL("http://www.google.com").openStream(), "UTF-8").useDelimiter("\\A").next();

If you want a slightly fuller implementation, which is not a single line, do this:

public static String readStringFromURL(String requestURL) throws IOException
{
    try (Scanner scanner = new Scanner(new URL(requestURL).openStream(),
            StandardCharsets.UTF_8.toString()))
    {
        scanner.useDelimiter("\\A");
        return scanner.hasNext() ? scanner.next() : "";
    }
}
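
The `\\A` delimiter is the regex anchor for "beginning of input", so it never matches again after the start and the scanner hands back the whole stream as a single token. Here is a minimal sketch of that behaviour using an in-memory stream instead of a network connection. The class name `ScannerWholeStreamDemo` and the helper `readAll` are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

// Hypothetical demo class; not part of the original answer.
final class ScannerWholeStreamDemo {

    // Reads the entire stream as one token; returns "" for an empty stream.
    static String readAll(InputStream in) {
        try (Scanner scanner = new Scanner(in, StandardCharsets.UTF_8.name())) {
            // "\\A" matches only at the very start of the input,
            // so the first token runs to the end of the stream.
            scanner.useDelimiter("\\A");
            return scanner.hasNext() ? scanner.next() : "";
        }
    }

    public static void main(String[] args) {
        byte[] body = "line one\nline two\n".getBytes(StandardCharsets.UTF_8);
        String text = readAll(new ByteArrayInputStream(body));
        // Line breaks inside the content are preserved.
        System.out.print(text);
    }
}
```

The `hasNext()` check matters for empty responses: calling `next()` on an empty stream throws `NoSuchElementException`, which is what the one-liner above would do.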


Answer 2:

This answer refers to an older version of Java. You may want to look at ccleve's answer.


Here is the traditional way to do this:

import java.net.*;
import java.io.*;

public class URLConnectionReader {
    public static String getText(String url) throws Exception {
        URL website = new URL(url);
        URLConnection connection = website.openConnection();
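        // Decode as UTF-8 (an assumption); without a charset argument,
        // InputStreamReader uses the platform default.
        BufferedReader in = new BufferedReader(
                                new InputStreamReader(
                                    connection.getInputStream(), "UTF-8"));

        StringBuilder response = new StringBuilder();
        String inputLine;

        try {
            // readLine() strips line terminators, so add one back per line.
            while ((inputLine = in.readLine()) != null)
                response.append(inputLine).append('\n');
        } finally {
            // Close the reader even if reading fails.
            in.close();
        }

        return response.toString();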
    }

    public static void main(String[] args) throws Exception {
        String content = URLConnectionReader.getText(args[0]);
        System.out.println(content);
    }
}

As @extraneon has suggested, IOUtils from Apache Commons IO lets you do this in a very elegant way that's still in the Java spirit:

 InputStream in = new URL( "http://jakarta.apache.org" ).openStream();

 try {
   // Pass the encoding explicitly; UTF-8 is assumed here.
   System.out.println( IOUtils.toString( in, "UTF-8" ) );
 } finally {
   IOUtils.closeQuietly(in);
 }


Answer 3:

Or just use IOUtils.toString(URL url), or the variant that also accepts an encoding parameter.



Answer 4:

Now that more time has passed, here's a way to do it in Java 8:

// Assumes url is a java.net.URL created elsewhere.
URLConnection conn = url.openConnection();
String pageText;
try (BufferedReader reader = new BufferedReader(new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
    pageText = reader.lines().collect(Collectors.joining("\n"));
}
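
One thing to be aware of with the `lines()` plus `Collectors.joining("\n")` approach: `BufferedReader` strips each line terminator, so the result has every line ending normalized to `\n` and no trailing newline. That is usually fine for HTML or JSON, but it is not a byte-for-byte copy. A small sketch on an in-memory string shows this. The class `JoinLinesDemo` and the method `joinLines` are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.stream.Collectors;

// Hypothetical demo class; not part of the original answer.
final class JoinLinesDemo {

    // Same lines()/joining("\n") technique, applied to a String source.
    static String joinLines(String raw) {
        try (BufferedReader reader = new BufferedReader(new StringReader(raw))) {
            return reader.lines().collect(Collectors.joining("\n"));
        } catch (IOException e) {
            // close() declares IOException; rethrow unchecked for brevity.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String joined = joinLines("first\r\nsecond\n");
        // "\r\n" became "\n" and the final line break was dropped.
        System.out.println(joined.equals("first\nsecond"));
    }
}
```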


Answer 5:

Additional example using Guava:

URL xmlData = ...
String data = Resources.toString(xmlData, Charsets.UTF_8);


Answer 6:

There's an even better way as of Java 9:

URL u = new URL("http://www.example.com/");
try (InputStream in = u.openStream()) {
    return new String(in.readAllBytes(), StandardCharsets.UTF_8);
}

Like the original Groovy example, this assumes that the content is UTF-8 encoded. (If you need something more clever than that, you need to create a URLConnection and use it to figure out the encoding.)
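
As a rough sketch of what "figure out the encoding" could look like: read the `charset` parameter from the response's Content-Type header and fall back to UTF-8 when it is missing. The class `ContentTypeCharset` and its methods are hypothetical, the UTF-8 fallback is an assumption, and this does not look at encodings declared inside the document itself (for example in an HTML meta tag).

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper class; not part of the original answer.
final class ContentTypeCharset {

    // Extracts the charset from a header value such as
    // "text/html; charset=ISO-8859-1". Falls back to UTF-8 (an assumption)
    // when the header is null, has no charset, or names an unknown one.
    static Charset charsetOf(String contentType) {
        if (contentType == null) {
            return StandardCharsets.UTF_8;
        }
        for (String part : contentType.split(";")) {
            String param = part.trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = param.substring("charset=".length())
                                   .trim()
                                   .replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Illegal or unsupported charset name.
                    return StandardCharsets.UTF_8;
                }
            }
        }
        return StandardCharsets.UTF_8;
    }

    // Reads the URL and decodes it with the charset the server declared.
    // Uses InputStream.readAllBytes(), so this needs Java 9 or later.
    static String read(URL url) throws IOException {
        URLConnection conn = url.openConnection();
        try (InputStream in = conn.getInputStream()) {
            // getContentType() returns the Content-Type header, or null.
            Charset charset = charsetOf(conn.getContentType());
            return new String(in.readAllBytes(), charset);
        }
    }

    public static void main(String[] args) {
        // Header parsing only; no network access needed for this demo.
        System.out.println(charsetOf("text/html; charset=ISO-8859-1"));
        System.out.println(charsetOf("text/html"));
    }
}
```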



Answer 7:

If you have the input stream (see Joe's answer), also consider IOUtils.toString(inputStream).

http://commons.apache.org/io/api-1.4/org/apache/commons/io/IOUtils.html#toString(java.io.InputStream)



Answer 8:

The following works with Java 7/8 and HTTPS URLs, and shows how to add a cookie to your request. Note that this is mostly a direct copy of another great answer on this page, with the cookie example added and a clarification that it also works with HTTPS URLs ;-)

If you need to connect to a server with an invalid or self-signed certificate, this will throw security errors unless you import the certificate. If you need this functionality, you could consider the approach detailed in this answer to this related question on Stack Overflow.

Example

String result = getUrlAsString("https://www.google.com");
System.out.println(result);

outputs

<!doctype html><html itemscope="" .... etc

Code

import java.net.URL;
import java.net.URLConnection;
import java.io.BufferedReader;
import java.io.InputStreamReader;

public static String getUrlAsString(String url)
{
    try
    {
        URL urlObj = new URL(url);
        URLConnection con = urlObj.openConnection();

        // Reading the response does not need setDoOutput(true); that flag
        // is for sending a request body, so it is left out here.
        con.setRequestProperty("Cookie", "myCookie=test123");
        con.connect();

        StringBuilder response = new StringBuilder();
        String newLine = System.getProperty("line.separator");

        // try-with-resources closes the reader even if reading fails
        try (BufferedReader in = new BufferedReader(new InputStreamReader(con.getInputStream())))
        {
            String inputLine;
            while ((inputLine = in.readLine()) != null)
            {
                response.append(inputLine).append(newLine);
            }
        }

        return response.toString();
    }
    catch (Exception e)
    {
        throw new RuntimeException(e);
    }
}


Answer 9:

Here's Jeanne's lovely answer, but wrapped in a tidy function for muppets like me:

private static String getUrl(String aUrl) throws MalformedURLException, IOException
{
    String urlData = \"\";
    URL urlObj = new URL(aUrl);
    URLConnection conn = urlObj.openConnection();
    try (BufferedReader reader = new BufferedReader(new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) 
    {
        urlData = reader.lines().collect(Collectors.joining("\n"));
    }
    return urlData;
}


Tags: java http url