Validating an email address with sscanf() format s

Posted 2019-08-09 03:56

Question:

This may be somewhat of a "fix-my-code" question, but I've looked at documentation, examples, and dozens of related questions, and though I logically understand more or less how it all works, I am having trouble translating it into a C sscanf() format string. I am still relatively new to C, and am just starting to get into slightly beyond-simplistic stuff, and I am having trouble figuring out more complex format specifiers (e.g. %[^...]).

Anyways, here's what I have:

char user[EMAIL_LEN];
char site[EMAIL_LEN];
char domain[4];
if(sscanf(input, "%s@%s.%3s", user, site, domain) != 3){
  printf("--ERROR: Invalid email address.--\n");
}

Why doesn't that work? I'm just trying to get a simple aaaa@bbbb.ccc format, but for some reason sscanf(input, "%s@%s.%3s", user, site, domain) always evaluates to 1. Do I need to use some crazy %[^...] magic for it to convert correctly? I've been messing with %[^@] and that kind of thing, but I can't seem to make it work.

Any and all help is appreciated. Thanks!

Answer 1:

%s in a scanf format skips leading whitespace, then matches all non-whitespace characters up to and not including the next whitespace character. So when you feed it your email address, the ENTIRE address gets copied into user to match the %s. Then, as nothing is left in the input to match the @, no further conversions happen and sscanf returns 1.
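
To see this for yourself, here is a minimal sketch. The helper name try_original_format and the 64-byte buffers are made up for illustration (the question does not say what EMAIL_LEN is), and the 63-character widths are added only so the buffers cannot overflow; otherwise the format is the one from the question.

```c
#include <stdio.h>

/* Hypothetical helper: applies the question's format and returns the
   number of fields sscanf converted. `user` receives whatever the
   first %s captured. Buffer sizes are assumed, not from the question. */
int try_original_format(const char *input, char user[64])
{
    char site[64];
    char domain[4];

    /* %63s is the question's %s plus a width limit for safety. */
    return sscanf(input, "%63s@%63s.%3s", user, site, domain);
}
```

Called with "aaaa@bbbb.ccc", this returns 1 and leaves the whole address in user, which matches what the question describes.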

You can try using something like:

sscanf(input, "%[^@ \t\n]@%[^. \t\n].%3[^ \t\n]", user, site, domain)
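
As a rough sketch, the same format can be wrapped in a small function with field widths added, so that none of the three buffers can be overrun. The name split_email and the value of EMAIL_LEN are assumptions for the example, not something given in the question.

```c
#include <stdio.h>

#define EMAIL_LEN 64  /* assumed size; the question does not state it */

/* Hypothetical helper: returns 1 when all three parts were converted,
   0 otherwise. The widths (63, 63, 3) are each one less than the
   destination buffer size, leaving room for the terminating '\0'. */
int split_email(const char *input,
                char user[EMAIL_LEN], char site[EMAIL_LEN], char domain[4])
{
    return sscanf(input, "%63[^@ \t\n]@%63[^. \t\n].%3[^ \t\n]",
                  user, site, domain) == 3;
}
```

Note that sscanf stops once the format is satisfied, so anything left over in the input after the third field is ignored rather than reported as an error.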

This format matches everything up to an @ or whitespace as the user, then, if the next character is in fact an @, skips it and stores everything up to a . or whitespace in site. But it will accept lots of other characters that are not valid in an email address, and it won't handle longer domain names correctly, since whatever follows the first . is cut off after three characters. Better might be something like:

sscanf(input, "%[_a-zA-Z0-9.]@%[_a-zA-Z0-9.]", user, domain)

which will accept any string of letters, digits, underscores and periods for both the name and the domain. Then, if you really need to split off the last part of the domain, do that separately.
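
One possible way to do that separate split is strrchr(), which finds the last . in the domain. The sketch below is only an illustration: parse_email is a made-up name, the 64-byte buffers and 63-character widths are assumptions, and ranges such as a-z inside a scanset are implementation-defined in standard C (common C libraries such as glibc treat them as ranges).

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: fills user and domain, then returns a pointer to
   the text after the last '.' in domain. Returns NULL if the input does
   not match, or if domain has no '.' with text on both sides of it. */
const char *parse_email(const char *input, char user[64], char domain[64])
{
    const char *dot;

    if (sscanf(input, "%63[_a-zA-Z0-9.]@%63[_a-zA-Z0-9.]", user, domain) != 2)
        return NULL;

    dot = strrchr(domain, '.');          /* last '.' in the domain */
    if (dot == NULL || dot == domain || dot[1] == '\0')
        return NULL;

    return dot + 1;                      /* points into domain */
}
```

For "john.doe@mail.example.com" this leaves "john.doe" in user and "mail.example.com" in domain, and the returned pointer refers to "com". An input such as "john@localhost" is rejected because the domain contains no period.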