webextension: Why does the browser add a trailing slash?

Posted 2019-05-29 03:55

When I make a request to http://www.example.com, why do I see http://www.example.com/ in the webRequest.onBeforeRequest listener?

For example:

chrome.webRequest.onBeforeRequest.addListener(
  details => console.log('Sending request to', details.url),
  { urls: ['<all_urls>'] });
fetch('http://www.example.com');

will print

Sending request to http://www.example.com/

That is consistent with the request URL shown in the network request monitor. For example, if I take it and convert it to a curl command, the request looks like this:

curl 'http://www.example.com/' -H 'Accept: */*' -H 'Connection: keep-alive'
    -H 'Accept-Encoding: gzip, deflate' -H 'Accept-Language: en-US,en;q=0.9'
    -H 'User-Agent: ...' --compressed

So the request that actually goes out is for http://www.example.com/, not for http://www.example.com. That decision must have been made in the browser, not by the server.

The same behavior also occurs when using XMLHttpRequest instead of fetch. In my example, I used Chrome, but on Firefox it is the same.

Questions:

  • Why does the browser change it automatically? It also happens with other URLs. From my understanding, adding a trailing slash will often work, but in general, it is a breaking change.
  • If I want to filter in the onBeforeRequest listener for requests to a specific URL, how can I reliably match it? For instance, just checking whether the URLs are identical will fail.
  • Are there more rewrite URL rules in the browser to be aware of?

1 answer
冷血范
#2 · 2019-05-29 04:38

I think I found it. The browser is just normalizing a URL that has an empty path.

To cite from Wikipedia, a URL looks like this:

scheme:[//[user[:password]@]host[:port]][/path][?query][#fragment]

The path must begin with a single slash (/) if an authority part was present, and may also if one was not, but must not begin with a double slash. The path is always defined, though the defined path may be empty (zero length), therefore no trailing slash.

http://example.com has an authority part (in this example, the hostname example.com that follows the scheme http://), but that leaves the path empty. An HTTP request cannot be sent with an empty path, so the browser normalizes the URL by replacing the empty path with /.
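The replacement of the empty path can be observed without sending a request at all. The following is a minimal sketch, assuming an environment that provides the standard URL constructor (current browsers and Node.js do). The variable names are only illustrative.

```javascript
// Parse the URL exactly as written in the question, without a trailing slash.
const withoutSlash = new URL('http://www.example.com');

// The parser reports "/" as the path even though none was written.
console.log(withoutSlash.pathname); // "/"
console.log(withoutSlash.href);     // "http://www.example.com/"

// A URL that already has a non-empty path is left as it is.
const withPath = new URL('http://www.example.com/abc');
console.log(withPath.href);         // "http://www.example.com/abc"
```

This is only a way to inspect the normalized form. It makes no claim about how webRequest builds details.url internally.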

If the URL already has a non-empty path, like http://example.com/abc, the browser does not need to modify it.
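For the matching question, one possible approach is to pass both URLs through the same parser before comparing them, so that an empty path and / compare as equal. The sketch below assumes the standard URL constructor is available. The helper isSameUrl and the constant TARGET are hypothetical names made up for this example and are not part of the webRequest API.

```javascript
// Hypothetical helper: compare two URLs after normalizing both with the URL parser.
function isSameUrl(a, b) {
  try {
    return new URL(a).href === new URL(b).href;
  } catch (e) {
    // At least one string could not be parsed as an absolute URL.
    return false;
  }
}

// Hypothetical target, written without the trailing slash on purpose.
const TARGET = 'http://www.example.com';

// Register the listener only where the extension API exists,
// so the helper above can also be tried outside an extension.
if (typeof chrome !== 'undefined' && chrome.webRequest) {
  chrome.webRequest.onBeforeRequest.addListener(
    details => {
      if (isSameUrl(details.url, TARGET)) {
        console.log('Matched request to', details.url);
      }
    },
    { urls: ['<all_urls>'] });
}
```

The same parser also lowercases the scheme and host and drops the default port, so isSameUrl('HTTP://WWW.EXAMPLE.COM:80', TARGET) is true as well. Paths are still compared exactly: /abc and /abc/ remain different URLs.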
