I'm trying to clean a POST string used in an AJAX request (sanitizing it before a DB query) so that it allows only alphanumeric characters, spaces (one between words, not several), "-", and Latin characters like "ç" and "é", so far without success. Can anyone help or point me in the right direction?
This is the regex I'm using so far:
$string = preg_replace('/^[a-z0-9 àáâãäåçèéêëìíîïðñòóôõöøùúû-]+$/', '', mb_strtolower(utf8_encode($_POST['q'])));
Thank you.
Why not just use mysql_real_escape_string?
should do the trick. Note that
- \w matches alphanumerics (and the underscore).
- \p{L} matches a single Unicode code point in the 'Letter' category.
- "-" at the end of the character class matches a single literal hyphen.
- ^ inside the character class negates it, so that the regex matches the opposite of the class (anything you did not specify).
- + outside of the character class says to match one or more characters.
- ^ and $ outside of the character class cause the engine to accept only matches that start at the beginning of a line and run to the end of the line.

After the pattern, the i modifier says to ignore case, and the u modifier tells the pattern matching engine that we're going to be sending UTF-8 data its way. The g modifier originally present has been removed since it's not necessary in PHP (global matching instead depends on which matching function is called).
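To make the notes above concrete, here is a minimal sketch of both uses: a negated class with preg_replace to strip unwanted characters, and an anchored class with preg_match to validate. The function names sanitize_query and is_valid_query are hypothetical, and the pattern is my own assumption, not necessarily the one originally posted. It uses \p{N} instead of \w so that underscores are rejected, assumes the input is already UTF-8, and leaves letter case untouched.

```php
<?php
// Hypothetical helper (name and pattern are assumptions for illustration).
// Strips every character that is not a letter, a digit, a space or a hyphen.
function sanitize_query(string $input): string
{
    // [^...] negates the class; the trailing "-" is a literal hyphen.
    // The u modifier makes PCRE treat pattern and subject as UTF-8.
    $clean = preg_replace('/[^\p{L}\p{N} -]+/u', '', $input);
    if ($clean === null) {
        // preg_replace returns null on error, e.g. invalid UTF-8 input.
        return '';
    }
    // Collapse runs of spaces so there is only one space between words.
    $clean = preg_replace('/ {2,}/', ' ', $clean);
    return trim($clean);
}

// Hypothetical helper: validates instead of cleaning.
// ^ and $ anchor the match to the whole string; the D modifier stops $
// from also matching just before a trailing newline.
function is_valid_query(string $input): bool
{
    return preg_match('/^[\p{L}\p{N} -]+$/Du', $input) === 1;
}
```

For example, sanitize_query("café  au-lait!?") gives "café au-lait". Note that preg_replace already replaces every match, which is why no g modifier is needed. This only restricts the character set; it is not a substitute for escaping or parameterizing the value when it goes into the query.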