I have a regex email pattern and would like to strip everything but the pattern-matched characters from a string. In short, I want to sanitize the string.
I'm not a regex guru, so what am I missing in my regex?
<?php
$pattern = "/^([\w\!\#$\%\&\'\*\+\-\/\=\?\^\`{\|\}\~]+\.)*[\w\!\#$\%\&\'\*\+\-\/\=\?\^\`{\|\}\~]+@((((([a-z0-9]{1}[a-z0-9\-]{0,62}[a-z0-9]{1})|[a-z])\.)+[a-z]{2,6})|(\d{1,3}\.){3}\d{1,3}(\:\d{1,5})?)$/i";
$email = 'contact<>@domain.com'; // wrong email
$sanitized_email = preg_replace($pattern, NULL, $email);
echo $sanitized_email; // Should be contact@domain.com
?>
Pattern taken from: http://fightingforalostcause.net/misc/2006/compare-email-regex.php (the very first one...)
You cannot filter and match at the same time. You'll need to break it up into a character class for stripping invalid characters and a matching regular expression which verifies a valid address.
For the first case, you want to use a negated character class: /[^allowedchars]/. For the second part, you use the structure /^...@...$/.
Have a look at PHP's filter extension. For cleansing it uses
const unsigned char allowed_list[] = LOWALPHA HIALPHA DIGIT "!#$%&'*+-=?^_`{|}~@.[]";
And there is the monster for validation: line 525 in http://gcov.php.net/PHP_5_3/lcov_html/filter/logical_filters.c.gcov.php - but check out http://www.regular-expressions.info/email.html for a more common and shorter variant.
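The two steps could be sketched roughly as below. The function names strip_invalid_chars and looks_like_email are made up for illustration, the character class mirrors the allowed_list quoted above, and the validation pattern is a deliberately simplified placeholder following the /^...@...$/ structure, not a complete email validator.

```php
<?php
// Step 1: strip every character that is NOT in the allowed set.
// The set mirrors allowed_list: letters, digits and !#$%&'*+-=?^_`{|}~@.[]
// (hypothetical helper name, not part of PHP).
function strip_invalid_chars(string $email): string
{
    $clean = preg_replace('/[^a-zA-Z0-9!#$%&\'*+\-=?^_`{|}~@.\[\]]/', '', $email);
    return $clean ?? ''; // preg_replace returns null on error
}

// Step 2: verify the cleaned string with an anchored pattern.
// Simplified placeholder: something@something.tld, no whitespace, one "@".
function looks_like_email(string $email): bool
{
    return preg_match('/^[^@\s]+@[^@\s]+\.[a-z]{2,}$/i', $email) === 1;
}

$email = 'contact<>@domain.com';
$sanitized_email = strip_invalid_chars($email);

echo $sanitized_email, PHP_EOL;
echo looks_like_email($sanitized_email) ? 'valid' : 'invalid', PHP_EOL;
```

In practice you would swap the placeholder in step 2 for the stricter pattern from the question, which is fine for matching once the invalid characters are gone.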
I guess PHP's filter_var function can also do this, and in a cleaner way. Have a look at: http://www.php.net/manual/en/function.filter-var.php
example:
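A minimal sketch of this approach, assuming the built-in FILTER_SANITIZE_EMAIL and FILTER_VALIDATE_EMAIL filters and reusing the address from the question:

```php
<?php
$email = 'contact<>@domain.com'; // wrong email

// FILTER_SANITIZE_EMAIL removes every character that is not a letter,
// a digit or one of !#$%&'*+-=?^_`{|}~@.[]
$sanitized_email = filter_var($email, FILTER_SANITIZE_EMAIL);
echo $sanitized_email, PHP_EOL; // contact@domain.com

// FILTER_VALIDATE_EMAIL returns the address when it is valid, false otherwise.
if (filter_var($sanitized_email, FILTER_VALIDATE_EMAIL) !== false) {
    echo 'valid', PHP_EOL;
} else {
    echo 'invalid', PHP_EOL;
}
```

Sanitizing only removes disallowed characters, so the result should still be validated before it is used.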