I'm not asking about full email validation.
I just want to know what are allowed characters in user-name
and server
parts of email address. This may be oversimplified, maybe email adresses can take other forms, but I don't care. I'm asking about only this simple form: user-name@server
(e.g. wild.wezyr@best-server-ever.com) and allowed characters in both parts.
Watch out! There is a bunch of knowledge rot in this thread (stuff that used to be true and now isn't).
To avoid false-positive rejections of actual email addresses in the current and future world, and from anywhere in the world, you need to know at least the high-level concept of RFC 3490, "Internationalizing Domain Names in Applications (IDNA)". I know folks in US and A often aren't up on this, but it's already in widespread and rapidly increasing use around the world (mainly the non-English dominated parts).
The gist is that you can now use addresses like mason@日本.com and wildwezyr@fahrvergnügen.net. No, this isn't yet compatible with everything out there (as many have lamented above, even simple qmail-style +ident addresses are often wrongly rejected). But there is an RFC, there's a spec, it's now backed by the IETF and ICANN, and--more importantly--there's a large and growing number of implementations supporting this improvement that are currently in service.
I didn't know much about this development myself until I moved back to Japan and started seeing email addresses like hei@やる.ca and Amazon URLs like this:
http://www.amazon.co.jp/エレクトロニクス-デジタルカメラ-ポータブルオーディオ/b/ref=topnav_storetab_e?ie=UTF8&node=3210981
I know you don't want links to specs, but if you rely solely on the outdated knowledge of hackers on Internet forums, your email validator will end up rejecting email addresses that non-Enlish users increasingly expect to work. For those users, such validation will be just as annoying as the commonplace brain-dead form that we all hate, the one that can't handle a + or a three-part domain name or whatever.
So I'm not saying it's not a hassle, but the full list of characters "allowed under some/any/none conditions" is (nearly) all characters in all languages. If you want to "accept all valid email addresses (and many invalid too)" then you have to take IDN into account, which basically makes a character-based approach useless (sorry), unless you first convert the internationalized email addresses to Punycode.
After doing that you can follow (most of) the advice above.
Gmail will only allow + sign as special character and in some cases (.) but any other special characters are not allowed at Gmail. RFC's says that you can use special characters but you should avoid sending mail to Gmail with special characters.
In my PHP I use this check
try it yourself http://phpfiddle.org/main/code/9av6-d10r
You can start from wikipedia article:
The accepted answer refers to a Wikipedia article when discussing the valid local-part of an email address, but Wikipedia is not an authority on this.
IETF RFC 3696 is an authority on this matter, and should be consulted at section 3. Restrictions on email addresses on page 5:
As others have done, I submit a regex that works for both PHP and JavaScript to validate email addresses: