I have always been confused by URL/HTML encoding/escaping. I am using PHP, so I want to clear some things up.
Can I say that I should always use:
urlencode: for individual query string parts, e.g. $url = 'http://test.com?param1=' . urlencode('some data') . '&param2=' . urlencode('something else');
htmlentities: for escaping special characters like < and > so that they will be rendered properly by the browser
Are there any other places where I might use each function? I am not good at all this escaping stuff and am always confused by it.
First off, you shouldn't be using htmlentities() around 99% of the time. Instead, you should use htmlspecialchars() for escaping text for use inside XML/HTML documents. htmlentities() is only useful for displaying characters that the native character set you're using can't display (it is useful if your pages are in ASCII, but you have some UTF-8 characters you would like to display). Instead, just make the whole page UTF-8 (it's not hard), and be done with it.

As far as urlencode() goes, you hit the nail on the head.

So, to recap:
Inside HTML: htmlspecialchars()
Inside of a URL: urlencode() on each query string value
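The recap above can be sketched roughly as follows. This is only an illustration: the variable names, the example.com URL, and the sample strings are made up, and the 'UTF-8' argument assumes the page is actually served as UTF-8.

```php
<?php
// Hypothetical user-supplied values containing characters that need escaping.
$comment = 'Tom & "Jerry" <b>rock</b>';
$search  = 'some data & more';

// Inside HTML: escape text before echoing it into the page.
// ENT_QUOTES also escapes single quotes; 'UTF-8' is an assumption about the page charset.
$safeHtml = htmlspecialchars($comment, ENT_QUOTES, 'UTF-8');
echo '<p>' . $safeHtml . "</p>\n";

// Inside of a URL: encode each query string value separately,
// and leave the "?", "=" and "&" separators alone.
$url = 'http://example.com/search?q=' . urlencode($search) . '&page=' . urlencode('2');
echo $url . "\n";
```

Note that urlencode() is applied to the individual values only; running it over the whole URL would also encode the separators and break the query string.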
That's about right. Although htmlspecialchars() is fine, as long as you get your charsets straight. Which you should do anyway. So I tend to use that, so I would find out early if I had messed it up.

Also note that if you put a URL into an HTML context (say, in the href of an a-tag), you need to escape that. So you'll often see something like:
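The code that originally followed is not preserved here, so what follows is a minimal sketch of the usual pattern rather than the answerer's exact snippet. The variables $foo and $bar and their values are hypothetical.

```php
<?php
// Hypothetical values for illustration.
$foo = 'a b';
$bar = 'x&y';

// Step 1: build the URL, URL-encoding each query string value.
$url = '?foo=' . urlencode($foo) . '&bar=' . urlencode($bar);

// Step 2: HTML-escape the finished URL before placing it in the attribute,
// so the "&" separator becomes "&amp;" in the markup.
$link = '<a href="' . htmlspecialchars($url, ENT_QUOTES, 'UTF-8') . '">link</a>';
echo $link . "\n";
```

The two layers of escaping do different jobs: urlencode() protects the values inside the URL, while htmlspecialchars() protects the URL inside the HTML attribute. The browser undoes the HTML escaping first, then the server undoes the URL encoding.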