Question:
Is there a good way to check a form input using regex to make sure it is a properly formatted email address? I have been searching since last night, and everybody who has answered people's questions on this topic also seems to run into problems when the address is on a subdomain.
Answer 1:
There is no point. Even if you can verify that the email address is syntactically valid, you'll still need to check that it was not mistyped, and that it actually goes to the person you think it does. The only way to do that is to send them an email and have them click a link to verify.
Therefore, a very basic check (e.g. that they didn't accidentally enter their street address) is usually enough. Something like: it has exactly one @ sign, and at least one . in the part after the @:
[^@]+@[^@]+\.[^@]+
You'd probably also want to disallow whitespace. There are probably valid email addresses with whitespace in them, but I've never seen one, so the odds of this being a user error are on your side.
If you want the full check, have a look at this question.
Update: Here's how you could use any such regex:
import re
if not re.match(r"... regex here ...", email):
    pass  # whatever
Note the r in front of the string; it makes this a raw string, so you won't need to escape backslashes twice.
If you have a large number of addresses to check, it might be faster to compile the regex first:
import re
EMAIL_REGEX = re.compile(r"... regex here ...")
if not EMAIL_REGEX.match(email):
    pass  # whatever
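As a rough illustration of the basic check described in this answer, here is a minimal sketch. It is not a complete validator: the name looks_like_email is made up for the example, the pattern also excludes whitespace as suggested, and it uses fullmatch rather than match, because match only anchors at the start of the string and would accept trailing text such as a second @.

```python
import re

# Basic sanity check: exactly one @, no whitespace, at least one dot after the @.
EMAIL_REGEX = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def looks_like_email(address):
    """Return True if the whole string has the rough shape of an address."""
    return EMAIL_REGEX.fullmatch(address) is not None


for candidate in ["foo@example.com", "foo@mail.example.com", "12 Main Street", "a@b@c.com"]:
    print(candidate, looks_like_email(candidate))
```

A check like this only filters out obvious typos; confirming that the address works still requires sending mail to it.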
Another option is to use the validate_email package, which actually contacts the SMTP server to verify that the address exists. This still doesn't guarantee that it belongs to the right person, though.
Answer 2:
The Python standard library comes with an e-mail parsing function: email.utils.parseaddr().
It returns a two-tuple containing the real name and the actual address parts of the e-mail:
>>> from email.utils import parseaddr
>>> parseaddr('foo@example.com')
('', 'foo@example.com')
>>> parseaddr('Full Name <full@example.com>')
('Full Name', 'full@example.com')
>>> parseaddr('"Full Name with quotes and <weird@chars.com>" <weird@example.com>')
('Full Name with quotes and <weird@chars.com>', 'weird@example.com')
And if the parsing is unsuccessful, it returns a two-tuple of empty strings:
>>> parseaddr('[invalid!email]')
('', '')
An issue with this parser is that it accepts anything that is considered a valid e-mail address by RFC 822 and friends, including many things that are clearly not addressable on the wider Internet:
>>> parseaddr('invalid@example,com') # notice the comma
('', 'invalid@example')
>>> parseaddr('invalid-email')
('', 'invalid-email')
So, as @TokenMacGuy put it, the only definitive way of checking an e-mail address is to send an e-mail to the expected address and wait for the user to act on the information inside the message.
However, you might want to check for, at least, the presence of an @ sign in the second tuple element, as @bvukelic suggests:
>>> '@' in parseaddr("invalid-email")[1]
False
If you want to go a step further, you can install the dnspython project and resolve the mail servers for the e-mail domain (the part after the '@'), only trying to send an e-mail if there are actual MX servers:
>>> from dns.resolver import query
>>> domain = 'foo@bar@google.com'.rsplit('@', 1)[-1]
>>> bool(query(domain, 'MX'))
True
>>> query('example.com', 'MX')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  [...]
dns.resolver.NoAnswer
>>> query('not-a-domain', 'MX')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  [...]
dns.resolver.NXDOMAIN
You can catch both NoAnswer and NXDOMAIN by catching dns.exception.DNSException.
And yes, an address can contain more than one @: the local part may include one if it is quoted, as in "foo@bar"@google.com. Only the last @ should be considered for detecting where the domain part starts.
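The MX lookup and the exception handling can be wrapped into one helper, roughly as sketched below. The function names and the injectable mx_lookup parameter are made up for this example. The default lookup assumes dnspython 2.x, where dns.resolver.resolve replaces the older query function used in the session above; it is imported lazily, so the rest of the code can be exercised without the package or a network connection.

```python
def dnspython_mx_lookup(domain):
    """Default lookup. Assumes the third-party dnspython package (2.x) is installed."""
    import dns.exception
    import dns.resolver

    try:
        return bool(dns.resolver.resolve(domain, "MX"))
    except dns.exception.DNSException:
        # Covers NoAnswer, NXDOMAIN, timeouts and other resolver errors.
        return False


def domain_accepts_mail(address, mx_lookup=dnspython_mx_lookup):
    """Return True if the part after the last @ has at least one MX record."""
    if "@" not in address:
        return False
    domain = address.rsplit("@", 1)[-1]
    return bool(domain) and mx_lookup(domain)


# Usage with a stand-in lookup, so the example runs offline:
fake_lookup = lambda domain: domain == "google.com"
print(domain_accepts_mail("foo@bar@google.com", mx_lookup=fake_lookup))
```

Treating every DNS error as "no mail server" is a simplification; a timeout, for instance, might be better retried than reported as an invalid address.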
Answer 3:
I haven't seen this answer here yet among the mess of custom regex answers, but...
There is a Python package called validate_email which has 3 levels of email validation, including asking a valid SMTP server if the email address is valid (without sending an email).
Check that the email string has a valid format:
from validate_email import validate_email
is_valid = validate_email('example@example.com')
Check whether the host has an SMTP server:
is_valid = validate_email('example@example.com', check_mx=True)
Check whether the host has an SMTP server and the email really exists:
is_valid = validate_email('example@example.com', verify=True)
For those interested in the dirty details, validate_email.py (source) aims to be faithful to RFC 2822.
All we are really doing is comparing the input string to one gigantic regular expression. But building that regexp, and ensuring its correctness, is made much easier by assembling it from the "tokens" defined by the RFC. Each of these tokens is tested in the accompanying unit test file.
To install with pip:
pip install validate_email
You'll also need the pyDNS module for checking SMTP servers:
pip install pyDNS
or, on Ubuntu:
apt-get install python3-dns
Answer 4:
Email addresses are not as simple as they seem! For example, Bob_O'Reilly+tag@example.com is a valid email address.
I've had some luck with the lepl package (http://www.acooke.org/lepl/). It can validate email addresses as indicated in RFC 3696: http://www.faqs.org/rfcs/rfc3696.html
Found some old code:
import lepl.apps.rfc3696
email_validator = lepl.apps.rfc3696.Email()
if not email_validator("email@example.com"):
    print("Invalid email")
Answer 5:
I found an excellent (and tested) way to check for a valid email address. I'll paste my code here:
# import the module that implements regular expressions
import re

# function that checks a list of sample addresses against a pattern
def test_email(your_pattern):
    if not your_pattern:
        print("Forgot to enter a pattern!")
        return
    pattern = re.compile(your_pattern)
    # example list of emails to check
    emails = ["john@example.com", "python-list@python.org", "wha.t.`1an?ug{}ly@email.com"]
    for email in emails:
        if not pattern.match(email):
            print("You failed to match %s" % email)
        else:
            print("Pass")

# the pattern passed as the argument to the function
pattern = r"\"?([-a-zA-Z0-9.`?{}]+@\w+\.\w+)\"?"
# test the function with the pattern
test_email(pattern)
Answer 6:
I see a lot of complicated answers here. Some of them fail to recognize simple, genuine email addresses, or give false positives. Below is the simplest way of testing whether a string could be a valid email. It tests against 2- and 3-letter TLDs. Since longer TLDs now exist, you may wish to increase the 3 to 4, 5 or even 10.
import re
def valid_email(email):
    return bool(re.search(r"^[\w\.\+\-]+\@[\w]+\.[a-z]{2,3}$", email))
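To see what this pattern accepts, it can be run against a few sample addresses. The sketch below repeats the function so that it is self-contained, and adds a looser variant purely as an illustration: valid_email_looser and its pattern are assumptions made for this example, allowing several dot-separated domain labels and a TLD of two or more letters in either case.

```python
import re


def valid_email(email):
    # Pattern from the answer above: one domain label, then 2-3 lowercase TLD letters.
    return bool(re.search(r"^[\w\.\+\-]+\@[\w]+\.[a-z]{2,3}$", email))


# Hypothetical looser variant: one or more "label." groups, TLD of 2+ letters.
LOOSER = re.compile(r"^[\w.+-]+@(?:[\w-]+\.)+[a-z]{2,}$", re.IGNORECASE)


def valid_email_looser(email):
    return bool(LOOSER.search(email))


for address in ["john@example.com", "john@mail.example.com", "john@example.info"]:
    print(address, valid_email(address), valid_email_looser(address))
```

The original pattern rejects the subdomain address and the four-letter TLD, which is worth knowing given that the question is specifically about subdomained addresses.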
Answer 7:
This is typically solved using regex, but there are many variations on the solution. Which one fits depends on how strict you need to be, and on whether you have custom validation requirements or will accept any valid email address.
See this page for reference: http://www.regular-expressions.info/email.html
Answer 8:
Email addresses are incredibly complicated. Here's a sample regex that will match every RFC822-valid address: http://www.ex-parrot.com/pdw/Mail-RFC822-Address.html
You'll notice that it's probably longer than the rest of your program. There are even whole modules for Perl with the purpose of validating email addresses. So you probably won't get anything that's 100% perfect as a regex while also being readable. Here's a sample recursive descent parser: http://cpansearch.perl.org/src/ABIGAIL/RFC-RFC822-Address-2009110702/lib/RFC/RFC822/Address.pm
But you'll need to decide whether you need perfect parsing or simple code.
Answer 9:
import re
def email():
    # prompt for an address and report the first email-like substring found
    address = input("enter the mail address::")
    match = re.search(r'[\w.-]+@[\w.-]+\.\w+', address)
    if match:
        print("valid email :::", match.group())
    else:
        print("not valid:::")
email()
Answer 10:
If you want to extract an email address from a long string or a file, try this:
([^@|\s]+@[^@]+\.[^@|\s]+)
Note that this works when there is a space before and after the email address. If there is no space, or there are special characters around it, you may need to modify it.
Working example:
import re
string = "Hello ABCD, here is my mail id example@me.com "
res = re.search(r"([^@|\s]+@[^@]+\.[^@|\s]+)", string, re.I)
res.group(1)
This will extract example@me.com from the string.
Also, note that this may not be the right answer, but I have posted it here to help anyone who has a specific requirement like mine.
Answer 11:
The parseaddr function mentioned above ignores everything from the last @ onwards:
>>> from email.utils import parseaddr
>>> parseaddr('aaa@bbb@ccc.com')
('', 'aaa@bbb')
Perhaps extract the address and compare it to the original?
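That comparison could look roughly like the sketch below. The helper name is_bare_address is made up for this example, and the check only shows that parseaddr returned the input unchanged and that it contains an @; it says nothing about whether the address exists.

```python
from email.utils import parseaddr


def is_bare_address(candidate):
    """True only if parseaddr returns the input unchanged and it contains an @."""
    _name, address = parseaddr(candidate)
    return address == candidate and "@" in address


for candidate in ["foo@example.com", "aaa@bbb@ccc.com", "invalid-email"]:
    print(candidate, is_bare_address(candidate))
```

Note that this also rejects inputs such as 'Full Name <full@example.com>', because the extracted address differs from the original string.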
Has anybody tried validate.email?
Answer 12:
from validate_email import validate_email
def check(address):
    # returns 1 if the address passes verification, 0 otherwise
    is_valid = validate_email(address, verify=True)
    return 1 if is_valid else 0
See the validate_email docs.
Answer 13:
Finding email IDs in a file:
import re
# read the file and print the first email-like match on each line
with open("aa.txt", "r") as a:
    c = a.read().split("\n")
print(c)
for d in c:
    obj = re.search(r'[\w.]+\@[\w.]+', d)
    if obj:
        print(obj.group())
Answer 14:
To check an email address, use the email_validator package:
from email_validator import validate_email, EmailNotValidError
def check_email(email):
    try:
        v = validate_email(email)  # validate and get info
        email = v["email"]  # replace with normalized form
        print("True")
    except EmailNotValidError as e:
        # email is not valid, exception message is human-readable
        print(str(e))
check_email("test@gmailcom")
Answer 15:
Found this to be a practical implementation:
[^@\s]+@[^@\s]+\.[^@\s]+
Answer 16:
"^[\w\.\+\-]+\@[\w]+\.[a-z]{2,3}$"
Answer 17:
Email validation:
import re
def validate(email):
    # accept only addresses whose top-level domain is com, org or edu
    match = re.search(r"^[a-zA-Z0-9_.+-]+@[a-zA-Z0-9]+\.(?:[a-zA-Z0-9]+\.)*(?:com|org|edu)$", email)
    if match:
        return 'Valid email.'
    else:
        return 'Invalid email.'
Answer 18:
The only really accurate way of distinguishing real, valid email addresses from invalid ones is to send mail to them. What counts as an email address is surprisingly convoluted ("John Doe" <john.doe@example.com> actually is a valid email address), and you most likely want to be able to send mail to the address later anyway. After it passes some basic sanity checks (such as in Thomas's answer: it has an @ and at least one . after the @), you should probably just send a verification email to the address, and wait for the user to follow a link embedded in the message to confirm that the address is valid.
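A verification flow of this kind might be sketched as follows, using only the standard library. Everything specific here is an assumption made for illustration: the function names, the in-memory PENDING dictionary standing in for a real datastore, the sender address and the https://example.com/verify URL. Actually sending the message (for example with smtplib) is left out.

```python
import secrets
from email.message import EmailMessage

# token -> address; an in-memory stand-in for a real datastore (assumption).
PENDING = {}


def build_verification_email(address, base_url="https://example.com/verify"):
    """Create a one-time token and the message that carries the confirmation link."""
    token = secrets.token_urlsafe(32)
    PENDING[token] = address
    msg = EmailMessage()
    msg["To"] = address
    msg["From"] = "noreply@example.com"  # hypothetical sender
    msg["Subject"] = "Please confirm your email address"
    msg.set_content("Follow this link to confirm: %s?token=%s" % (base_url, token))
    return token, msg


def confirm(token):
    """Return the confirmed address, or None if the token is unknown or already used."""
    return PENDING.pop(token, None)


token, message = build_verification_email("user@example.com")
print(message["Subject"])
print(confirm(token))
```

A real implementation would also want tokens to expire and to be stored somewhere persistent.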