Over the years I have slowly developed a regular expression that validates MOST email addresses correctly, assuming they don't use an IP address as the server part.
I use it in several PHP programs, and it works most of the time. However, from time to time I get contacted by someone that is having trouble with a site that uses it, and I end up having to make some adjustment (most recently I realized that I wasn't allowing 4-character TLDs).
What is the best regular expression you have or have seen for validating emails?
I've seen several solutions that use functions that use several shorter expressions, but I'd rather have one long complex expression in a simple function instead of several short expression in a more complex function.
Per the W3C HTML5 spec:
Context:
The HTML5 spec suggests a simple regex for validating email addresses:
This intentionally doesn't comply with RFC 5322.
The total length could also be limited to 254 characters, per RFC 3696 errata 1690.
As you're writing in PHP I'd advice you to use the PHP build-in validation for emails.
If you're running a php-version lower than 5.3.6 please be aware of this issue: https://bugs.php.net/bug.php?id=53091
If you want more information how this buid-in validation works, see here: Does PHP's filter_var FILTER_VALIDATE_EMAIL actually work?
Here's the PHP I use. I've choosen this solution in the spirit of "false positives are better than false negatives" as declared by another commenter here AND with regards to keeping your response time up and server load down ... there's really no need to waste server resources with a regular expression when this will weed out most simple user error. You can always follow this up by sending a test email if you want.
Strange that you "cannot" allow 4 characters TLDs. You are banning people from .info and .name, and the length limitation stop .travel and .museum, but yes, they are less common than 2 characters TLDs and 3 characters TLDs.
You should allow uppercase alphabets too. Email systems will normalize the local part and domain part.
For your regex of domain part, domain name cannot starts with '-' and cannot ends with '-'. Dash can only stays in between.
If you used the PEAR library, check out their mail function (forgot the exact name/library). You can validate email address by calling one function, and it validates the email address according to definition in RFC822.
Cal Henderson (Flickr) wrote an article called Parsing Email Adresses in PHP and shows how to do proper RFC (2)822-compliant Email Address parsing. You can also get the source code in php, python and ruby which is cc licensed.