C# Email Regular Expression — Any out there that a

2019-02-07 08:09发布

I realize that there are a ton of regex email validations, but I can't seem to find one that adheres to the RFC 2822 standard.

The ones I find keep letting in junk like ..@abc.com get through.

Forgive me if the one of the questions is already answered adhering to RFC 2822 (but not annotated that it is).

4条回答
唯我独甜
2楼-- · 2019-02-07 08:53

This runs in PCRE: http://code.iamcal.com/php/rfc822/full_regexp.txt

It's 32k, apparently.

Seriously - maybe consider backing off using a single regexp, or accepting ALL possible email forms.

查看更多
冷血范
3楼-- · 2019-02-07 08:56

I would look at this: http://www.regular-expressions.info/email.html, which explains a lot about using regular expressions to match email addresses, and includes a full RFC 2822 expression which honestly I would almost never recommended using.

查看更多
成全新的幸福
4楼-- · 2019-02-07 09:05

This is for RFC822, not for the newer one. But it seems the address format has not been changed, so should be what you're looking for.

(note the remark below the regexp--it still assumes that the address has been preprocessed)

查看更多
放荡不羁爱自由
5楼-- · 2019-02-07 09:07

I did a post on this a short while ago. Yes, it is possible using .NET regex, since they have a non-regular feature called "balancing groups".

The Perl RFC822 one that is often posted doesn't fully match email addresses, since it requires preprocessing to remove comments. It's also for a very old RFC (from 1982!).

This regex is for RFC5322, which is current. It also handles all comments and folding whitespace correctly.

Here is the regex:

^(?'localPart'((((\((((?'paren'\()|(?'-paren'\))|([\u0021-\u
0027\u002a-\u005b\u005d-\u007e]|[\u0001-\u0008\u000b\u000c\u
000e-\u001f\u007f])|([ \t]+((\r\n)[ \t]+)?|((\r\n)[ \t]+)+)|
\\([\u0021-\u007e]|[ \t]|[\r\n\0]|[\u0001-\u0008\u000b\u000c
\u000e-\u001f\u007f]))*(?(paren)(?!)))\))|([ \t]+((\r\n)[ \t
]+)?|((\r\n)[ \t]+)+))*?(([a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+)|(
"(([ \t]+((\r\n)[ \t]+)?|((\r\n)[ \t]+)+)?(([\u0021\u0023-\u
005b\u005d-\u007e]|[\u0001-\u0008\u000b\u000c\u000e-\u001f\u
007f])|\\([\u0021-\u007e]|[ \t]|[\r\n\0]|[\u0001-\u0008\u000
b\u000c\u000e-\u001f\u007f])))*([ \t]+((\r\n)[ \t]+)?|((\r\n
)[ \t]+)+)?"))((\((((?'paren'\()|(?'-paren'\))|([\u0021-\u00
27\u002a-\u005b\u005d-\u007e]|[\u0001-\u0008\u000b\u000c\u00
0e-\u001f\u007f])|([ \t]+((\r\n)[ \t]+)?|((\r\n)[ \t]+)+)|\\
([\u0021-\u007e]|[ \t]|[\r\n\0]|[\u0001-\u0008\u000b\u000c\u
000e-\u001f\u007f]))*(?(paren)(?!)))\))|([ \t]+((\r\n)[ \t]+
)?|((\r\n)[ \t]+)+))*?)(\.(((\((((?'paren'\()|(?'-paren'\))|
([\u0021-\u0027\u002a-\u005b\u005d-\u007e]|[\u0001-\u0008\u0
00b\u000c\u000e-\u001f\u007f])|([ \t]+((\r\n)[ \t]+)?|((\r\n
)[ \t]+)+)|\\([\u0021-\u007e]|[ \t]|[\r\n\0]|[\u0001-\u0008\
u000b\u000c\u000e-\u001f\u007f]))*(?(paren)(?!)))\))|([ \t]+
((\r\n)[ \t]+)?|((\r\n)[ \t]+)+))*?(([a-zA-Z0-9!#$%&'*+/=?^_
`{|}~-]+)|("(([ \t]+((\r\n)[ \t]+)?|((\r\n)[ \t]+)+)?(([\u00
21\u0023-\u005b\u005d-\u007e]|[\u0001-\u0008\u000b\u000c\u00
0e-\u001f\u007f])|\\([\u0021-\u007e]|[ \t]|[\r\n\0]|[\u0001-
\u0008\u000b\u000c\u000e-\u001f\u007f])))*([ \t]+((\r\n)[ \t
]+)?|((\r\n)[ \t]+)+)?"))((\((((?'paren'\()|(?'-paren'\))|([
\u0021-\u0027\u002a-\u005b\u005d-\u007e]|[\u0001-\u0008\u000
b\u000c\u000e-\u001f\u007f])|([ \t]+((\r\n)[ \t]+)?|((\r\n)[
\t]+)+)|\\([\u0021-\u007e]|[ \t]|[\r\n\0]|[\u0001-\u0008\u0
00b\u000c\u000e-\u001f\u007f]))*(?(paren)(?!)))\))|([ \t]+((
\r\n)[ \t]+)?|((\r\n)[ \t]+)+))*?))*))@(?'domain'((((\((((?'
paren'\()|(?'-paren'\))|([\u0021-\u0027\u002a-\u005b\u005d-\
u007e]|[\u0001-\u0008\u000b\u000c\u000e-\u001f\u007f])|([ \t
]+((\r\n)[ \t]+)?|((\r\n)[ \t]+)+)|\\([\u0021-\u007e]|[ \t]|
[\r\n\0]|[\u0001-\u0008\u000b\u000c\u000e-\u001f\u007f]))*(?
(paren)(?!)))\))|([ \t]+((\r\n)[ \t]+)?|((\r\n)[ \t]+)+))*?(
([a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+)|("(([ \t]+((\r\n)[ \t]+)?|
((\r\n)[ \t]+)+)?(([\u0021\u0023-\u005b\u005d-\u007e]|[\u000
1-\u0008\u000b\u000c\u000e-\u001f\u007f])|\\([\u0021-\u007e]
|[ \t]|[\r\n\0]|[\u0001-\u0008\u000b\u000c\u000e-\u001f\u007
f])))*([ \t]+((\r\n)[ \t]+)?|((\r\n)[ \t]+)+)?"))((\((((?'pa
ren'\()|(?'-paren'\))|([\u0021-\u0027\u002a-\u005b\u005d-\u0
07e]|[\u0001-\u0008\u000b\u000c\u000e-\u001f\u007f])|([ \t]+
((\r\n)[ \t]+)?|((\r\n)[ \t]+)+)|\\([\u0021-\u007e]|[ \t]|[\
r\n\0]|[\u0001-\u0008\u000b\u000c\u000e-\u001f\u007f]))*(?(p
aren)(?!)))\))|([ \t]+((\r\n)[ \t]+)?|((\r\n)[ \t]+)+))*?)(\
.(((\((((?'paren'\()|(?'-paren'\))|([\u0021-\u0027\u002a-\u0
05b\u005d-\u007e]|[\u0001-\u0008\u000b\u000c\u000e-\u001f\u0
07f])|([ \t]+((\r\n)[ \t]+)?|((\r\n)[ \t]+)+)|\\([\u0021-\u0
07e]|[ \t]|[\r\n\0]|[\u0001-\u0008\u000b\u000c\u000e-\u001f\
u007f]))*(?(paren)(?!)))\))|([ \t]+((\r\n)[ \t]+)?|((\r\n)[
\t]+)+))*?(([a-zA-Z0-9!#$%&'*+/=?^_`{|}~-]+)|("(([ \t]+((\r\
n)[ \t]+)?|((\r\n)[ \t]+)+)?(([\u0021\u0023-\u005b\u005d-\u0
07e]|[\u0001-\u0008\u000b\u000c\u000e-\u001f\u007f])|\\([\u0
021-\u007e]|[ \t]|[\r\n\0]|[\u0001-\u0008\u000b\u000c\u000e-
\u001f\u007f])))*([ \t]+((\r\n)[ \t]+)?|((\r\n)[ \t]+)+)?"))
((\((((?'paren'\()|(?'-paren'\))|([\u0021-\u0027\u002a-\u005
b\u005d-\u007e]|[\u0001-\u0008\u000b\u000c\u000e-\u001f\u007
f])|([ \t]+((\r\n)[ \t]+)?|((\r\n)[ \t]+)+)|\\([\u0021-\u007
e]|[ \t]|[\r\n\0]|[\u0001-\u0008\u000b\u000c\u000e-\u001f\u0
07f]))*(?(paren)(?!)))\))|([ \t]+((\r\n)[ \t]+)?|((\r\n)[ \t
]+)+))*?))*)|(((\((((?'paren'\()|(?'-paren'\))|([\u0021-\u00
27\u002a-\u005b\u005d-\u007e]|[\u0001-\u0008\u000b\u000c\u00
0e-\u001f\u007f])|([ \t]+((\r\n)[ \t]+)?|((\r\n)[ \t]+)+)|\\
([\u0021-\u007e]|[ \t]|[\r\n\0]|[\u0001-\u0008\u000b\u000c\u
000e-\u001f\u007f]))*(?(paren)(?!)))\))|([ \t]+((\r\n)[ \t]+
)?|((\r\n)[ \t]+)+))*?\[(([ \t]+((\r\n)[ \t]+)?|((\r\n)[ \t]
+)+)?([!-Z^-~]|[\u0001-\u0008\u000b\u000c\u000e-\u001f\u007f
]))*([ \t]+((\r\n)[ \t]+)?|((\r\n)[ \t]+)+)?\]((\((((?'paren
'\()|(?'-paren'\))|([\u0021-\u0027\u002a-\u005b\u005d-\u007e
]|[\u0001-\u0008\u000b\u000c\u000e-\u001f\u007f])|([ \t]+((\
r\n)[ \t]+)?|((\r\n)[ \t]+)+)|\\([\u0021-\u007e]|[ \t]|[\r\n
\0]|[\u0001-\u0008\u000b\u000c\u000e-\u001f\u007f]))*(?(pare
n)(?!)))\))|([ \t]+((\r\n)[ \t]+)?|((\r\n)[ \t]+)+))*?))\z

Some caveats, however. RFC5322 is more liberal with domain names than the actual domain RFCs, and there are other restrictions that apply from various RFCs such as the actual SMTP RFC itself (which specifies a maximum length). So even though an email is correct according to 5322 it can be invalid by various other measures.

The golden test is still to send an email to the address with a validation code.

查看更多
登录 后发表回答