Regex: allow everything but some selected characte

Posted 2020-07-11 11:39

Question:

I would like to validate a textarea, and I just don't get regex (I spent the day on a bunch of tutorials trying to figure it out).

Basically, I would like to allow everything (line breaks and carriage returns included) except the characters that could be malicious (those that could lead to a security breach). As very few characters are disallowed, I assume it makes more sense to create a blacklist than a whitelist.

My question is: what is the standard "everything but" in Regex?

I'm using JavaScript and jQuery.

I tried this but it doesn't work (it's awful, I know..):

var messageReg = /^[a-zA-Z0-9éèêëùüàâöïç\"\/\%\(\).'?!,@$#§-_ \n\r]+$/;

Thank you.

Answer 1:

As Esailija mentioned, this won't do anything for real security.

The code you posted is almost a negated set. As murgatroid99 mentioned, the ^ goes inside the brackets, so that the regular expression matches anything that is not in the list. But it looks like you really want to strip out those characters, so your regexp doesn't need to be negated.

Your code should look like:

// The hyphen is escaped so that #-_ is not read as a character range.
str = str.replace(/[a-zA-Z0-9éèêëùüàâöïç\"\/\%\(\).'?!,@$#\-_ \n\r]/g, "");

That says: remove every character that matches the character class.

However, that also removes a-zA-Z0-9. Are you sure you want to strip those out?

Also, Chrome doesn't like § in regular expressions; you have to use \x followed by the character's hex code (\xA7 for §).
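As a rough illustration of the stripping approach, here is a minimal sketch. The blacklist ("<", ">" and "§") and the helper name stripDisallowed are assumptions made up for this example, not characters or names taken from the question, and removing them is not a substitute for real security measures.

```javascript
// Hypothetical blacklist, for illustration only: "<", ">" and "§".
// "§" is written as \xA7, its hex escape.
const DISALLOWED = /[<>\xA7]/g;

// Hypothetical helper: returns a copy of the text with every
// blacklisted character removed. Line breaks and carriage returns
// are not in the class, so they are kept.
function stripDisallowed(text) {
  return text.replace(DISALLOWED, "");
}

// JSON.stringify makes the \r\n visible in the printed result.
console.log(JSON.stringify(stripDisallowed("a <b> \xA7c\r\nd"))); // "a b c\r\nd"
```

Because replace returns a new string, the result has to be assigned or returned; the original string is left unchanged.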



Answer 2:

If you want to exclude a set of characters (some punctuation characters, for example), you would use the ^ operator at the beginning of a character set, in a regex like:

/[^.?!]/

This matches any character that is not ., ?, or !.
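A small sketch of how that negated set behaves with test. The helper name hasOtherCharacter is hypothetical. Note that without anchors, the pattern only asks whether at least one character outside the set exists somewhere in the string.

```javascript
// Matches a single character that is NOT ".", "?" or "!".
const notPunctuation = /[^.?!]/;

// Hypothetical helper: true if the text contains at least one
// character outside the excluded set.
function hasOtherCharacter(text) {
  return notPunctuation.test(text);
}

console.log(hasOtherCharacter("?!"));  // false: every character is excluded
console.log(hasOtherCharacter("Hi!")); // true: "H" is not ., ? or !
console.log(hasOtherCharacter("\n"));  // true: a negated set also matches line breaks
```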



Answer 3:

You can use the ^ as the first character inside brackets [] to negate what's in it:

/^[^abc]*$/

This means: "from start to finish, no a, b, or c."
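For validating a whole textarea value, the anchored form can be sketched as follows. The helper name containsNoAbc is hypothetical, and a, b and c simply stand in for whatever characters you decide to disallow.

```javascript
// ^ outside the brackets anchors to the start of the string,
// ^ inside the brackets negates the set, and $ anchors to the end.
const noAbc = /^[^abc]*$/;

// Hypothetical helper: true only if no character of the text
// is "a", "b" or "c".
function containsNoAbc(text) {
  return noAbc.test(text);
}

console.log(containsNoAbc("hello\r\nworld")); // true: line breaks are allowed
console.log(containsNoAbc("cab"));            // false
console.log(containsNoAbc(""));               // true: * also accepts an empty string
```

If an empty value should be rejected, using + instead of * would require at least one character.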