I've written a regular expression in PHP to allow strings that are alpha-numeric with any punctuation except & or @. Essentially, I need to allow anything on a standard American keyboard with the exception of those two characters. It took me a while to come up with the following regex, which seems to be doing what I need:
if (ereg("[^]A-Za-z0-9\[!\"#$%'()*+,./:;<=>?^_`{|}~\-]", $test_string)) {
// error message goes here
}
Which brings me to my question... is there a better, simpler, or more efficient way?
Have a look at character ranges:
@[!-%'-?A-~]+@
This will exclude the characters & (0x26) and @ (0x40). Looking at an ASCII table, you can see how this works: the exclamation mark is the first printable character in the ASCII set that is not whitespace. The pattern matches everything from there up to and including the % character, which immediately precedes the ampersand. The next range runs from ' up to ?, stopping just before the @ character, which lies between ? and A. After that, we match everything up to the end of the printable ASCII range, which is ~.
Update
To make things more readable, you might also consider doing this in two steps. First, filter out anything outside the printable, non-whitespace ASCII range:
@[!-~]+@
In a second step, filter out your undesired characters, or simply do a strpos on the characters. At the end, you can compare the result with what you started with to see whether it contained any undesired characters.
Alternatively, you could use a regex such as this for the second step:
/[^@&]+/
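As a rough sketch of how the single range pattern and the two-step variant might be used for validation in PHP: the function names below are hypothetical, and the ^ and $ anchors (with the D modifier) are an added assumption so that the whole string has to match rather than just part of it. The second step uses strpos, as suggested above.

```php
<?php
// Hypothetical helper names; they do not come from the original posts.

// Single pattern: printable ASCII from ! to ~, skipping & (0x26) and @ (0x40).
// The anchors are an assumption added so the entire string must match.
function isAllowedSinglePattern(string $s): bool
{
    return preg_match('@^[!-%\'-?A-~]+$@D', $s) === 1;
}

// Two-step variant: first the ASCII range check, then the two bad characters.
function isAllowedTwoStep(string $s): bool
{
    // Step 1: every character must be printable, non-whitespace ASCII.
    if (preg_match('/^[!-~]+$/D', $s) !== 1) {
        return false;
    }
    // Step 2: reject the two unwanted characters with plain string searches.
    return strpos($s, '@') === false && strpos($s, '&') === false;
}
```

With these definitions, a string such as Hello,World! passes both checks, while me@example.com, R&D, any string containing a space, and the empty string are rejected.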
The steps are interchangeable, and doing a strpos on @ or & as a first step, to identify bad characters, may be better performance-wise.
I think rather than testing for all the alphanumeric characters, you can simply check for @ and & and negate the result?
What about this:
with
preg_match
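One possible reading of this suggestion, as a minimal sketch rather than the poster's actual code: look only for the two forbidden characters with preg_match and negate the result. The function name is hypothetical. Note that this accepts everything else, including whitespace, non-ASCII characters, and the empty string, so it is broader than the keyboard-only requirement in the question.

```php
<?php
// Hypothetical helper; returns true when the string contains neither @ nor &.
function hasNoForbiddenChars(string $s): bool
{
    // preg_match returns 1 if a forbidden character is found and 0 if not,
    // so the string is acceptable only when the result is exactly 0.
    return preg_match('/[@&]/', $s) === 0;
}
```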