C# Remove special characters

Posted 2019-03-18 04:48

Question:

I want to remove all special characters from a string. Allowed characters are A-Z (uppercase or lowercase), digits (0-9), underscore (_), white space ( ), the percent sign (%), and the dot (.).

I have tried this:

        StringBuilder sb = new StringBuilder();
        foreach (char c in input)
        {
            if ((c >= '0' && c <= '9') || (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') | c == '.' || c == '_' || c == ' ' || c == '%')
            { sb.Append(c); }
        }
        return sb.ToString();

And this:

        Regex r = new Regex("(?:[^a-z0-9% ]|(?<=['\"])s)", RegexOptions.IgnoreCase | RegexOptions.CultureInvariant | RegexOptions.Compiled); 
        return r.Replace(input, String.Empty); 

But nothing seems to be working. Any help will be appreciated.

Thank you!

Answer 1:

You can simplify the first method to:

StringBuilder sb = new StringBuilder();
foreach (char c in input)
{
    if (Char.IsLetterOrDigit(c) || c == '.' || c == '_' || c == ' ' || c == '%')
    { sb.Append(c); }
}
return sb.ToString();

This seems to pass simple tests. You can shorten it further using LINQ (this needs using System.Linq;):

return new string(
    input.Where(
        c => Char.IsLetterOrDigit(c) || 
            c == '.' || c == '_' || c == ' ' || c == '%')
    .ToArray());
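
One caveat worth keeping in mind: Char.IsLetterOrDigit returns true for any Unicode letter or digit, so characters such as é or Cyrillic letters would be kept as well. If only the ASCII ranges from the question are wanted, a minimal sketch could look like the following. The class name AsciiFilter and the helper IsAllowed are made up for this illustration and are not part of the original answer.

```csharp
using System.Text;

public static class AsciiFilter
{
    // Hypothetical helper: true only for A-Z, a-z, 0-9, '.', '_', ' ' and '%'.
    private static bool IsAllowed(char c)
    {
        return (c >= 'a' && c <= 'z')
            || (c >= 'A' && c <= 'Z')
            || (c >= '0' && c <= '9')
            || c == '.' || c == '_' || c == ' ' || c == '%';
    }

    // Copies only the allowed characters into a new string.
    public static string Clean(string input)
    {
        var sb = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            if (IsAllowed(c))
            {
                sb.Append(c);
            }
        }
        return sb.ToString();
    }
}
```

For example, AsciiFilter.Clean("Café_10%!") drops both the é and the exclamation mark, whereas the Char.IsLetterOrDigit version would keep the é.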


Answer 2:

Regex.Replace(input, "[^a-zA-Z0-9% ._]", string.Empty)
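
As a possible explanation of why the regex in the question misbehaves: its character class [^a-z0-9% ] leaves out the dot and the underscore, so both get stripped, and the second alternative additionally removes an s that follows a quote. The one-liner above can be wrapped in a reusable helper along these lines. This is only a sketch; the class name RegexFilter and the field name Disallowed are assumptions made for the example.

```csharp
using System.Text.RegularExpressions;

public static class RegexFilter
{
    // Negated character class: matches any single character that is NOT
    // a letter A-Z/a-z, a digit, '%', a space, '.' or '_'.
    private static readonly Regex Disallowed =
        new Regex("[^a-zA-Z0-9% ._]", RegexOptions.Compiled);

    // Replaces every disallowed character with the empty string.
    public static string Clean(string input)
    {
        return Disallowed.Replace(input, string.Empty);
    }
}
```

Keeping the Regex in a static field means the pattern is built once rather than on every call, which tends to matter only when the method is called frequently.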


Answer 3:

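The first approach seems correct, except that you have a | instead of a || before c == '.'. On bool operands | is the non-short-circuiting logical OR rather than a bitwise one, so the condition still evaluates to the same result here, but || is the conventional choice.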

By the way, you should state what doesn't work: does it fail to compile, does it crash, or does it produce the wrong output?
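
To illustrate the difference between the two operators on bool operands, here is a small sketch. The class OrDemo and its members are invented for this example; it simply counts how many operands get evaluated in each case.

```csharp
public static class OrDemo
{
    private static int calls;

    // Returns its argument and records that it was evaluated.
    private static bool Touch(bool value)
    {
        calls++;
        return value;
    }

    // With |, both operands are always evaluated.
    public static int CountWithSingleBar()
    {
        calls = 0;
        bool result = Touch(true) | Touch(false);
        return result ? calls : -1;
    }

    // With ||, the right operand is skipped once the left one is true.
    public static int CountWithDoubleBar()
    {
        calls = 0;
        bool result = Touch(true) || Touch(false);
        return result ? calls : -1;
    }
}
```

Both expressions evaluate to true; the only difference is that the | version evaluates two operands while the || version evaluates one.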



Answer 4:

StringBuilder sb = new StringBuilder();
foreach (char c in input)
{
    if (char.IsLetterOrDigit(c) || "_ %.".Contains(c.ToString()))
        sb.Append(c);
}
return sb.ToString();


Answer 5:

This is how my version might look.

StringBuilder sb = new StringBuilder();
foreach (char c in input)
{
    if (Char.IsLetterOrDigit(c) ||
        c == '.' || c == '_' || c == ' ' || c == '%')
    {
        sb.Append(c);
    }
}
return sb.ToString();


Answer 6:

Cast each char to an int, then compare its ASCII code against the ASCII table, which is widely available online, for example at http://www.asciitable.com/

    {
        char[] input = txtInput.Text.ToCharArray();
        StringBuilder sbResult = new StringBuilder();

        foreach (char c in input)
        {
            int asciiCode = (int)c;
            if (
                //Space
                asciiCode == 32
                ||
                // Period (.)
                asciiCode == 46
                ||
                // Percentage Sign (%)
                asciiCode == 37
                ||
                // Underscore
                asciiCode == 95
                ||
                ( //0-9, 
                    asciiCode >= 48
                    && asciiCode <= 57
                )
                ||
                ( //A-Z
                    asciiCode >= 65
                    && asciiCode <= 90
                )
                ||
                ( //a-z
                    asciiCode >= 97
                    && asciiCode <= 122
                )
            )
            {
                sbResult.Append(c);
            }
        }

        txtResult.Text = sbResult.ToString();
    }


Answer 7:

private string RemoveReservedCharacters(string strValue)
{
    char[] ReservedChars = {'/', ':','*','?','"', '<', '>', '|'};

    foreach (char strChar in ReservedChars)
    {
        strValue = strValue.Replace(strChar.ToString(), "");
    }
    return strValue;
}