Is there a pre-existing function or class for URL normalization in PHP?
Specifically, I want to follow the semantics-preserving normalization rules laid out in this Wikipedia article on URL normalization (or whatever 'standard' I should be following):
- Converting the scheme and host to lower case
- Capitalizing letters in escape sequences
- Adding trailing / (to directories, not files)
- Removing the default port
- Removing dot-segments
Right now, I'm thinking that I'll just use parse_url() and apply the rules individually, but I'd prefer to avoid reinventing the wheel.
The PEAR Net_URL2 library looks like it'll do at least part of what you want. Its getNormalizedURL() method will remove dot segments, fix capitalization and get rid of the default port.
I doubt there's a general-purpose mechanism for adding trailing slashes to directories, because you need a way to map URLs to directories, which is challenging to do in a generic way. But it's close.
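If pulling in a library isn't an option, the parse_url() approach from the question could look roughly like the sketch below. It assumes PHP 7 or later and absolute URLs. normalize_url() is a hypothetical helper name, not something that ships with PHP or Net_URL2, and the default-port table is my own short list rather than anything exhaustive. The trailing-slash rule is deliberately left out for the reason given above.

```php
<?php
// Hypothetical helper, not part of PHP or Net_URL2.
function normalize_url(string $url): string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return $url; // relative or malformed URL: leave it untouched
    }

    // Upper-case the hex digits in percent-escapes (%2f -> %2F).
    $upperEscapes = function (string $s): string {
        return preg_replace_callback('/%[0-9a-f]{2}/i', function ($m) {
            return strtoupper($m[0]);
        }, $s);
    };

    // Lower-case the scheme and host.
    $scheme = strtolower($parts['scheme']);
    $host   = strtolower($parts['host']);

    // Keep user info as-is if present.
    $userinfo = '';
    if (isset($parts['user'])) {
        $userinfo = $parts['user'];
        if (isset($parts['pass'])) {
            $userinfo .= ':' . $parts['pass'];
        }
        $userinfo .= '@';
    }

    // Drop the port when it is the default for the scheme (assumed table).
    $defaultPorts = ['http' => 80, 'https' => 443, 'ftp' => 21];
    $port = '';
    if (isset($parts['port']) && ($defaultPorts[$scheme] ?? null) !== $parts['port']) {
        $port = ':' . $parts['port'];
    }

    // Remove dot-segments from the path; an empty path becomes "/".
    $segments = explode('/', $parts['path'] ?? '/');
    $out = [];
    foreach ($segments as $segment) {
        if ($segment === '..') {
            if (count($out) > 1) {
                array_pop($out); // never pop the leading empty segment
            }
        } elseif ($segment !== '.') {
            $out[] = $segment;
        }
    }
    $last = end($segments);
    if ($last === '.' || $last === '..') {
        $out[] = ''; // a trailing dot-segment leaves a trailing slash
    }
    $path = $upperEscapes(implode('/', $out));

    $query    = isset($parts['query']) ? '?' . $upperEscapes($parts['query']) : '';
    $fragment = isset($parts['fragment']) ? '#' . $parts['fragment'] : '';

    return $scheme . '://' . $userinfo . $host . $port . $path . $query . $fragment;
}
```

With this, normalize_url('HTTP://Example.COM:80/a/./b/../c') returns http://example.com/a/c. Treat it as a starting point rather than a complete RFC 3986 implementation: it does not decode unreserved characters or touch IDN hosts.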