Could you share a link to a URL parsing implementation?

Posted 2019-01-18 12:01

As far as I understand, a URL consists of the following fields:

  • Protocol (http, https, ftp, etc.)
  • User name
  • User Password
  • Host address (an IP address or a DNS FQDN)
  • Port (which can be implied)
  • Path to a document inside the server documents root
  • Set of arguments and values
  • Document part (#)

arranged as:

protocol://user:password@host:port/path/document?arg1=val1&arg2=val2#part

I need code that returns the value of any of these fields (or a null/empty value if the field is not set) from any given URL string. Do I have to implement this myself, or is there existing code for it so that I don't have to reinvent the wheel?

I am particularly interested in Scala or Java code. C#, PHP, Python or Perl code can also be useful.

6 answers
ら.Afraid
#3 · 2019-01-18 12:49

Based on @Codemwnci's answer, here's a full example that gets the file name from a URL, with or without arguments:

URL videoUrl = new URL("https://somesite.com/path/v/t43.1792-2/1186696120_n.mp4?efg=something"); // java.net.URL; may throw MalformedURLException
String path = videoUrl.getPath(); // the path excludes the "?efg=something" query
String videoFileName = path.substring(path.lastIndexOf('/') + 1); // everything after the last slash

1186696120_n.mp4

爱情/是我丢掉的垃圾
#4 · 2019-01-18 12:53

java.net.URL does not support ldap by default. One can add protocols to URL by registering a stream handler, but I ended up writing a simple parser and a small new class instead.
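
The parser itself isn't shown in this answer, so here is only a rough sketch of what such a small class could look like. It uses the generic component regex from RFC 3986, Appendix B, and then splits the authority by hand. The class name SimpleUrlParser and its field names are made up for this illustration, and it does no validation or percent-decoding.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical minimal parser; unset fields are null (path may be empty). */
public class SimpleUrlParser {
    // Generic URI component regex taken from RFC 3986, Appendix B.
    private static final Pattern PARTS = Pattern.compile(
        "^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\\?([^#]*))?(#(.*))?");

    public final String scheme, userInfo, host, port, path, query, fragment;

    public SimpleUrlParser(String text) {
        Matcher m = PARTS.matcher(text);
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a URL: " + text);
        }
        String user = null, h = null, p = null;
        String authority = m.group(4); // "user:password@host:port" or null
        if (authority != null) {
            int at = authority.lastIndexOf('@');
            if (at >= 0) {
                user = authority.substring(0, at);
            }
            String hostPort = authority.substring(at + 1);
            int colon = hostPort.lastIndexOf(':');
            // A colon before ']' belongs to an IPv6 literal, not to a port.
            if (colon > hostPort.lastIndexOf(']')) {
                h = hostPort.substring(0, colon);
                p = hostPort.substring(colon + 1);
            } else {
                h = hostPort;
            }
        }
        scheme = m.group(2);
        userInfo = user;
        host = h;
        port = p;
        path = m.group(5);
        query = m.group(7);
        fragment = m.group(9);
    }

    public static void main(String[] args) {
        SimpleUrlParser u = new SimpleUrlParser(
            "ldap://user:password@host:389/path/document?arg1=val1&arg2=val2#part");
        System.out.println(u.scheme + " | " + u.userInfo + " | " + u.host + " | "
            + u.port + " | " + u.path + " | " + u.query + " | " + u.fragment);
    }
}
```

Because it only looks at syntax, this works for ldap or any other scheme, at the cost of accepting strings that a stricter parser would reject.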

来,给爷笑一个
#5 · 2019-01-18 12:54

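Use the java.net.URI class for this. A java.net.URL can only be constructed for a protocol that Java has a handler for, so it is meant for real resources and real protocols. A URI is parsed purely by its syntax, so it also works for protocols and resources that may not exist.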
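
To make that concrete, here is a small sketch that pulls every field out of an ldap address with java.net.URI. The class name UriFields, the map keys, and the sample address are only for illustration. Unset fields come back as null; the port getter returns -1 when no port is given, which the sketch maps to null as well.

```java
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;

public class UriFields {
    /** Splits a URI string into its fields; a field that is not set is null. */
    static Map<String, String> parse(String text) {
        URI uri = URI.create(text); // IllegalArgumentException on bad syntax
        Map<String, String> fields = new LinkedHashMap<>();
        fields.put("scheme", uri.getScheme());
        fields.put("userInfo", uri.getUserInfo());
        fields.put("host", uri.getHost());
        // getPort() reports an undefined port as -1.
        fields.put("port", uri.getPort() == -1 ? null : String.valueOf(uri.getPort()));
        fields.put("path", uri.getPath());
        fields.put("query", uri.getQuery());
        fields.put("fragment", uri.getFragment());
        return fields;
    }

    public static void main(String[] args) {
        parse("ldap://user:password@host:389/path/document?arg1=val1&arg2=val2#part")
            .forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

URI.create is used here because it throws an unchecked exception; new URI(String) does the same parsing but throws the checked URISyntaxException.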

聊天终结者
#6 · 2019-01-18 12:54

In Java, just use the URL class. It provides methods such as getProtocol, getHost, etc. to obtain the different parts of the URL.

一夜七次
#7 · 2019-01-18 12:55

The URL class gives you everything you need. See http://download.oracle.com/javase/6/docs/api/java/net/URL.html

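// The constructor throws MalformedURLException for an unknown protocol or a non-numeric port,
// so the template is filled in with http and 8080 here.
URL url = new URL("http://user:password@host:8080/path/document?arg1=val1&arg2=val2#part");
url.getProtocol();  // "http"
url.getUserInfo();  // "user:password"
url.getAuthority(); // "user:password@host:8080"
url.getHost();      // "host"
url.getPort();      // 8080, or -1 if the port is not set
url.getPath();      // "/path/document"; the document part is contained within the path field
url.getQuery();     // "arg1=val1&arg2=val2"
url.getRef();       // "part", the text after #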
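
The query comes back as one string, so reading arg1 or arg2 individually takes one more step. Below is a minimal sketch of that step. The class name QueryArgs is made up, it parses with java.net.URI rather than URL to avoid the checked exception, it assumes UTF-8 percent-encoding, and it keeps only the last value when a name is repeated. The Charset overload of URLDecoder.decode needs Java 10 or later.

```java
import java.net.URI;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class QueryArgs {
    /** Returns the query arguments in order; an empty map if there is no query. */
    static Map<String, String> parse(String url) {
        Map<String, String> args = new LinkedHashMap<>();
        // Use the raw query so an encoded '&' or '=' inside a value is not split.
        String query = URI.create(url).getRawQuery();
        if (query == null || query.isEmpty()) {
            return args;
        }
        for (String pair : query.split("&")) {
            if (pair.isEmpty()) {
                continue; // skip the empty piece produced by "&&"
            }
            int eq = pair.indexOf('=');
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            args.put(decode(name), decode(value)); // a repeated name overwrites
        }
        return args;
    }

    private static String decode(String s) {
        // Form-style decoding: %XX escapes are decoded and '+' becomes a space.
        return URLDecoder.decode(s, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(
            parse("http://user:password@host:8080/path/document?arg1=val1&arg2=val2#part"));
    }
}
```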