Regex with exception of particular words

2020-03-19 07:14发布

问题:

I have problem with regex. I need to make regex with an exception of a set of specified words, for example: apple, orange, juice. and given these words, it will match everything except those words above.

applejuice (match)
yummyjuice (match)
yummy-apple-juice (match)
orangeapplejuice (match)
orange-apple-juice (match)
apple-orange-aple (match)
juice-juice-juice (match)
orange-juice (match)

apple (should not match)
orange (should not match)
juice (should not match)

回答1:

If you really want to do this with a single regular expression, you might find lookaround helpful (especially negative lookahead in this example). Regex written for Ruby (some implementations have different syntax for lookarounds):

rx = /^(?!apple$|orange$|juice$)/


回答2:

I noticed that apple-juice should match according to your parameters, but what about apple juice? I'm assuming that if you are validating apple juice you still want it to fail.

So - lets build a set of characters that count as a "boundary":

/[^-a-z0-9A-Z_]/        // Will match any character that is <NOT> - _ or 
                        // between a-z 0-9 A-Z 

/(?:^|[^-a-z0-9A-Z_])/  // Matches the beginning of the string, or one of those 
                        // non-word characters.

/(?:[^-a-z0-9A-Z_]|$)/  // Matches a non-word or the end of string

/(?:^|[^-a-z0-9A-Z_])(apple|orange|juice)(?:[^-a-z0-9A-Z_]|$)/ 
   // This should >match< apple/orange/juice ONLY when not preceded/followed by another
   // 'non-word' character just negate the result of the test to obtain your desired
   // result.

In most regexp flavors \b counts as a "word boundary" but the standard list of "word characters" doesn't include - so you need to create a custom one. It could match with /\b(apple|orange|juice)\b/ if you weren't trying to catch - as well...

If you are only testing 'single word' tests you can go with a much simpler:

/^(apple|orange|juice)$/ // and take the negation of this...


回答3:

This gets some of the way there:

((?:apple|orange|juice)\S)|(\S(?:apple|orange|juice))|(\S(?:apple|orange|juice)\S)


回答4:

\A(?!apple\Z|juice\Z|orange\Z).*\Z

will match an entire string unless it only consists of one of the forbidden words.

Alternatively, if you're not using Ruby or you're sure that your strings contain no line breaks or you have set the option that ^ and $ do not match on beginnings/ends of lines

^(?!apple$|juice$|orange$).*$

will also work.



回答5:

Something like (PHP)

$input = "The orange apple gave juice";
if(preg_match("your regex for validating") && !preg_match("/apple|orange|juice/", $input))
{
  // it's ok;
}
else
{
  //throw validation error
}