I'm trying to resolve a relative link that starts with a question mark (?) using Java's URL or URI classes.
HTML example:
<a href="?test=xyz">Test XYZ</a>
Code examples (from Scala REPL):
import java.net._
scala> new URL(new URL("http://abc.com.br/index.php?hello=world"), "?test=xyz").toExternalForm()
res30: String = http://abc.com.br/?test=xyz
scala> (new URI("http://abc.com.br/index.php?hello=world")).resolve("?test=xyz").toString
res31: String = http://abc.com.br/?test=xyz
The problem is that browsers (tested on Chrome, Firefox and Safari) produce the following URL instead: http://abc.com.br/index.php?test=xyz. They don't discard the path "index.php"; they just replace the query string part.
And it seems that browsers are just following the specification, as explained in https://stackoverflow.com/a/7872230/40876.
The Jsoup library makes the same "mistake" when we use element.absUrl("href"), as it also depends on Java's URL resolving.
So what's up with the way Java's URL/URI classes resolve relative paths? Is it wrong or incomplete?
How can I make it behave the same as the browsers' implementation?
This will work just fine:
URIUtils lives in org.apache.httpcomponents:httpclient, version 4.0 or higher.
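If adding HttpClient as a dependency isn't an option, the browser behaviour can probably be approximated with plain java.net.URI. What follows is only a rough sketch under my own assumptions, not library code: the class name QueryRefResolver and the method resolveLikeBrowser are hypothetical. It special-cases references that start with "?" by keeping the base's scheme, authority and path and swapping only the query, which is how I read RFC 3986 section 5.2.2, and it hands everything else to URI.resolve unchanged.

```java
import java.net.URI;

// Hypothetical helper for illustration; not part of the JDK or HttpClient.
final class QueryRefResolver {

    // For a query-only reference ("?..."), keep the base up to (but not including)
    // its own query or fragment and append the reference. Any fragment on the
    // base is dropped; a fragment on the reference is kept.
    static URI resolveLikeBrowser(URI base, String reference) {
        if (reference.startsWith("?")) {
            String baseStr = base.toString();
            // Cut the base at its fragment or its query, whichever comes first.
            int cut = baseStr.length();
            int hash = baseStr.indexOf('#');
            if (hash >= 0) {
                cut = hash;
            }
            int query = baseStr.indexOf('?');
            if (query >= 0 && query < cut) {
                cut = query;
            }
            return URI.create(baseStr.substring(0, cut) + reference);
        }
        // All other references are left to the JDK's own resolution.
        return base.resolve(reference);
    }

    public static void main(String[] args) {
        URI base = URI.create("http://abc.com.br/index.php?hello=world");
        // prints http://abc.com.br/index.php?test=xyz
        System.out.println(resolveLikeBrowser(base, "?test=xyz"));
    }
}
```

Only the "?" case is handled specially here, since that is the one discussed in the question; other differences between java.net.URI and browser resolution, if any, are not addressed by this sketch.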