Parse string to URL

Posted 2019-01-11 14:14

How can I parse dynamic string values in order to create URL instances? I need to replace spaces with %20 and encode accents and other non-ASCII characters.

I tried URLEncoder, but it also encodes the / character, and if I pass a string encoded with URLEncoder to the URL constructor I get a MalformedURLException (no protocol).

2 answers
女痞
#2 · 2019-01-11 14:33

So what you're saying is that you want to encode part of your URL but not the whole thing. Sounds to me like you'll have to break it up into parts, pass the ones that you want encoded through the encoder, and re-assemble it to get your whole URL.
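As a rough sketch of that idea, here is one way to encode only the path while keeping the / separators. The class and method names (PathEncoder, encodePath) are made up for illustration, and the example assumes UTF-8 is the encoding you want. URLEncoder targets form encoding and writes a space as +, so the sketch swaps that for %20.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of any library.
final class PathEncoder {
    // Encodes each path segment on its own so the '/' separators survive.
    static String encodePath(String rawPath) throws UnsupportedEncodingException {
        StringBuilder out = new StringBuilder();
        // The -1 limit keeps trailing empty segments, so a trailing '/' is preserved.
        String[] segments = rawPath.split("/", -1);
        for (int i = 0; i < segments.length; i++) {
            if (i > 0) {
                out.append('/');
            }
            // URLEncoder writes a space as '+'; a literal '+' becomes %2B,
            // so any '+' left in the output stands for a space.
            out.append(URLEncoder.encode(segments[i], "UTF-8").replace("+", "%20"));
        }
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // \u00e9 is e with an acute accent.
        // Prints http://host.com/my%20docs/r%C3%A9sum%C3%A9.pdf
        System.out.println("http://host.com" + encodePath("/my docs/r\u00e9sum\u00e9.pdf"));
    }
}
```

The scheme and host are left untouched and only the part you pass in gets encoded.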

小情绪 Triste *
#3 · 2019-01-11 14:49

URLEncoder has a very misleading name. According to the Javadocs, it is used to encode form parameters using the MIME type application/x-www-form-urlencoded.

That said, it can be used to encode query parameters, for example. If a parameter value is &/?#, its encoded equivalent can be used like this:

String url = "http://host.com/?key=" + URLEncoder.encode("&/?#", "UTF-8");
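For reference, a minimal runnable version of that one-liner, showing what the encoded value looks like. The class and method names (QueryParamDemo, buildUrl) are only illustrative, and UTF-8 is assumed as the charset.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Illustrative wrapper around the one-liner above.
final class QueryParamDemo {
    // Appends a single form-encoded query value to a fixed base URL.
    static String buildUrl(String value) throws UnsupportedEncodingException {
        return "http://host.com/?key=" + URLEncoder.encode(value, "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        // Prints http://host.com/?key=%26%2F%3F%23
        System.out.println(buildUrl("&/?#"));
    }
}
```

Note that only the parameter value goes through the encoder. Passing the whole URL would also encode the :// and / that the URL constructor needs.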

Unless you have that specific need, the URL Javadocs suggest using new URI(..).toURL(), which performs URI encoding according to RFC 2396. In the Javadocs' words:

"The recommended way to manage the encoding and decoding of URLs is to use URI"

The following sample

new URI("http", "host.com", "/path/", "key=| ?/#ä", "fragment").toURL();

produces the result http://host.com/path/?key=%7C%20?/%23ä#fragment. Note how characters such as ?&/ are not encoded.
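Here is a small sketch of the same call that you can run to inspect the result. The class name UriComponentsDemo is just for illustration. The constructor leaves ä as a literal character, so the sketch also prints toASCIIString(), which percent-encodes non-ASCII characters as UTF-8 if you need an ASCII-only string.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Illustrative demo class, not part of any library.
final class UriComponentsDemo {
    // Builds the URI from separate components; illegal characters in each
    // component are quoted by the constructor.
    static URI build() throws URISyntaxException {
        // \u00e4 is a with a diaeresis (the accented character from the sample).
        return new URI("http", "host.com", "/path/", "key=| ?/#\u00e4", "fragment");
    }

    public static void main(String[] args) throws Exception {
        URI uri = build();
        // '|', ' ' and '#' are quoted; '?' and '/' are legal in a query and stay as they are.
        System.out.println(uri);
        // Prints http://host.com/path/?key=%7C%20?/%23%C3%A4#fragment
        System.out.println(uri.toASCIIString());
    }
}
```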

For further information, see the posts HTTP URL Address Encoding in Java or how to encode URL to avoid special characters in java.


EDIT

Since your input is a string URL, the parameterized constructors of URI will not help you on their own. You can't use new URI(strUrl) directly either, since it doesn't quote URL parameters.

So at this stage we must use a trick to get what you want:

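import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;

public URL parseUrl(String s) throws MalformedURLException, URISyntaxException {
    // URL splits the string into components without encoding them.
    URL u = new URL(s);
    // The multi-argument URI constructor quotes illegal characters per component.
    // It always quotes '%', so input that is already percent-encoded is encoded twice.
    return new URI(
            u.getProtocol(),
            u.getAuthority(),
            u.getPath(),
            u.getQuery(),
            u.getRef())
            .toURL();
}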

Before you can use this routine, you have to sanitize your string to ensure it represents an absolute URL. I see two approaches to this:

  1. Guessing. Prepend http:// to the string unless a scheme is already present.

  2. Construct the URL from a context using new URL(URL context, String spec)
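Both approaches could look roughly like this. The class and method names (AbsoluteUrls, withDefaultScheme, resolve) are hypothetical, and the scheme check is only a simple guess based on the "scheme://" shape. Also be aware that the URL constructors are deprecated as of Java 20, so newer JDKs will emit a deprecation warning here.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical helpers for turning a loose string into an absolute URL.
final class AbsoluteUrls {
    // Approach 1 (guessing): prepend http:// unless the string already
    // starts with something that looks like "scheme://".
    static String withDefaultScheme(String s) {
        return s.matches("(?i)[a-z][a-z0-9+.-]*://.*") ? s : "http://" + s;
    }

    // Approach 2: resolve a possibly relative spec against a known context URL.
    static URL resolve(URL context, String spec) throws MalformedURLException {
        return new URL(context, spec);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(withDefaultScheme("host.com/a b"));
        URL context = new URL("http://host.com/dir/page.html");
        System.out.println(resolve(context, "other file.html"));
    }
}
```

Neither helper encodes anything. The resulting string or URL still has to go through a routine like parseUrl above to get its spaces and other illegal characters quoted.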
