I'm looking for a regular expression that removes illegal characters. But I don't know what the characters will be.
For example:
In a process, I want my string to match ([a-zA-Z0-9/-]*). So I would like to replace all characters that don't match the regexp above.
That would be:
[^a-zA-Z0-9/-]+
[^ ] at the start of a character class negates it - it matches characters not in the class.
See also: Character Classes
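As a minimal sketch of how that pattern could be applied in .NET, the snippet below passes it to Regex.Replace with an empty replacement. The class name CharFilterDemo, the method name RemoveIllegal and the sample input are made up for illustration; only Regex.Replace and the pattern itself come from the discussion above.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CharFilterDemo
{
    // Hypothetical helper: deletes every run of characters that is not
    // a letter, a digit, '/' or '-'.
    public static string RemoveIllegal(string input)
    {
        // The leading ^ negates the class; the trailing - is a literal hyphen.
        return Regex.Replace(input, @"[^a-zA-Z0-9/-]+", "");
    }

    public static void Main()
    {
        Console.WriteLine(RemoveIllegal("abc_123/x-y z!")); // abc123/x-yz
    }
}
```

The + quantifier is optional here: it only lets the engine remove a whole run of illegal characters in one match instead of one character at a time.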
Thanks to Kobi's answer, I've created a helper method that strips unaccepted characters.
The allowed pattern should be in regex format and is expected to be wrapped in square brackets. The function inserts a caret (^) after the opening square bracket.
I expect it won't work for every regex that describes a set of valid characters, but it works for the relatively simple sets that we are using.
// Requires "using System.Text.RegularExpressions;" and, being an extension method, must be declared in a static class.

/// <summary>
/// Replaces every character that is not in the allowed set.
/// </summary>
/// <param name="text">The text to clean.</param>
/// <param name="allowedPattern">The allowed characters in regex format, expected to be wrapped in square brackets.</param>
/// <param name="replacement">The replacement for each character that is not allowed.</param>
/// <returns>The text with every character outside the allowed set replaced.</returns>
// See https://stackoverflow.com/questions/4460290/replace-chars-if-not-match
// and https://stackoverflow.com/questions/6154426/replace-remove-characters-that-do-not-match-the-regular-expression-net
public static string ReplaceNotExpectedCharacters(this string text, string allowedPattern, string replacement)
{
    // StripBrackets is a helper extension that is not shown in this post;
    // it is assumed to remove the enclosing "[" and "]".
    allowedPattern = allowedPattern.StripBrackets("[", "]");
    // [^ ] at the start of a character class negates it - it matches characters not in the class.
    return Regex.Replace(text, "[^" + allowedPattern + "]", replacement);
}
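The method above calls StripBrackets, which isn't shown in this post. Here is a self-contained sketch for trying the idea out. It assumes StripBrackets simply removes one leading "[" and one trailing "]" when both are present. That implementation, the class name StringCleaningExtensions and the sample inputs are my guesses for illustration, not the original helper.

```csharp
using System;
using System.Text.RegularExpressions;

public static class StringCleaningExtensions
{
    // Hypothetical stand-in for the StripBrackets helper: removes the given
    // prefix and suffix only when the text has both.
    public static string StripBrackets(this string text, string open, string close)
    {
        if (text.Length >= open.Length + close.Length
            && text.StartsWith(open, StringComparison.Ordinal)
            && text.EndsWith(close, StringComparison.Ordinal))
        {
            return text.Substring(open.Length, text.Length - open.Length - close.Length);
        }
        return text;
    }

    public static string ReplaceNotExpectedCharacters(this string text, string allowedPattern, string replacement)
    {
        // Unwrap "[...]" so that the caret can be placed right after the opening bracket.
        allowedPattern = allowedPattern.StripBrackets("[", "]");
        return Regex.Replace(text, "[^" + allowedPattern + "]", replacement);
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine("abc_123/x-y z!".ReplaceNotExpectedCharacters("[a-zA-Z0-9/-]", "")); // abc123/x-yz
        Console.WriteLine("a b".ReplaceNotExpectedCharacters("a-z", "_"));                     // a_b
    }
}
```

Note that this only makes sense for a single, non-negated character class. A pattern that already starts with ^ (such as "[^0-9]"), or one made of several classes, would not be negated correctly by just inserting another caret.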