HTTP URL Address Encoding in Java

2018-12-31 03:55发布

My Java standalone application gets a URL (which points to a file) from the user and I need to hit it and download it. The problem I am facing is that I am not able to encode the HTTP URL address properly...

Example:

URL:  http://search.barnesandnoble.com/booksearch/first book.pdf

java.net.URLEncoder.encode(url.toString(), "ISO-8859-1");

returns me:

http%3A%2F%2Fsearch.barnesandnoble.com%2Fbooksearch%2Ffirst+book.pdf

But, what I want is

http://search.barnesandnoble.com/booksearch/first%20book.pdf

(space replaced by %20)

I guess URLEncoder is not designed to encode HTTP URLs... The JavaDoc says "Utility class for HTML form encoding"... Is there any other way to do this?

24条回答
皆成旧梦
2楼-- · 2018-12-31 04:42

How about:

public String UrlEncode(String in_) {

String retVal = "";

try {
    retVal = URLEncoder.encode(in_, "UTF8");
} catch (UnsupportedEncodingException ex) {
    Log.get().exception(Log.Level.Error, "urlEncode ", ex);
}

return retVal;

}

查看更多
长期被迫恋爱
3楼-- · 2018-12-31 04:43

I'm going to add one suggestion here aimed at Android users. You can do this which avoids having to get any external libraries. Also, all the search/replace characters solutions suggested in some of the answers above are perilous and should be avoided.

Give this a try:

String urlStr = "http://abc.dev.domain.com/0007AC/ads/800x480 15sec h.264.mp4";
URL url = new URL(urlStr);
URI uri = new URI(url.getProtocol(), url.getUserInfo(), url.getHost(), url.getPort(), url.getPath(), url.getQuery(), url.getRef());
url = uri.toURL();

You can see that in this particular URL, I need to have those spaces encoded so that I can use it for a request.

This takes advantage of a couple features available to you in Android classes. First, the URL class can break a url into its proper components so there is no need for you to do any string search/replace work. Secondly, this approach takes advantage of the URI class feature of properly escaping components when you construct a URI via components rather than from a single string.

The beauty of this approach is that you can take any valid url string and have it work without needing any special knowledge of it yourself.

查看更多
后来的你喜欢了谁
4楼-- · 2018-12-31 04:44

Use the following standard Java solution (passes around 100 of the testcases provided by Web Plattform Tests):

0. Test if URL is already encoded.

1. Split URL into structural parts. Use java.net.URL for it.

2. Encode each structural part properly!

3. Use IDN.toASCII(putDomainNameHere) to Punycode encode the host name!

4. Use java.net.URI.toASCIIString() to percent-encode, NFC encoded unicode - (better would be NFKC!).

Find more here: https://stackoverflow.com/a/49796882/1485527

查看更多
时光乱了年华
5楼-- · 2018-12-31 04:44

I've created a new project to help construct HTTP URLs. The library will automatically URL encode path segments and query parameters.

You can view the source and download a binary at https://github.com/Widen/urlbuilder

The example URL in this question:

new UrlBuilder("search.barnesandnoble.com", "booksearch/first book.pdf").toString()

produces

http://search.barnesandnoble.com/booksearch/first%20book.pdf

查看更多
十年一品温如言
6楼-- · 2018-12-31 04:44

URLEncoding can encode HTTP URLs just fine, as you've unfortunately discovered. The string you passed in, "http://search.barnesandnoble.com/booksearch/first book.pdf", was correctly and completely encoded into a URL-encoded form. You could pass that entire long string of gobbledigook that you got back as a parameter in a URL, and it could be decoded back into exactly the string you passed in.

It sounds like you want to do something a little different than passing the entire URL as a parameter. From what I gather, you're trying to create a search URL that looks like "http://search.barnesandnoble.com/booksearch/whateverTheUserPassesIn". The only thing that you need to encode is the "whateverTheUserPassesIn" bit, so perhaps all you need to do is something like this:

String url = "http://search.barnesandnoble.com/booksearch/" + 
       URLEncoder.encode(userInput,"UTF-8");

That should produce something rather more valid for you.

查看更多
旧时光的记忆
7楼-- · 2018-12-31 04:45

I had the same problem. Solved this by unsing:

android.net.Uri.encode(urlString, ":/");

It encodes the string but skips ":" and "/".

查看更多
登录 后发表回答