C#: Fastest way to remove extra white spaces

Posted 2019-01-13 17:34

What is the fastest way to replace runs of extra white space with a single space?
e.g.

from

foo      bar 

to

foo bar

23 Answers
虎瘦雄心在
Reply #2 · 2019-01-13 18:24

The fastest way? Iterate over the string and build a second copy in a StringBuilder character by character, only copying one space for each group of spaces.
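The loop described above can be sketched roughly as follows. This is only an illustration, not the code behind the linked benchmark: the class and method names (`WhitespaceSketch`, `CollapseSpaces`) are made up here, and it assumes that only the plain space character `' '` counts as white space.

```csharp
using System.Text;

public static class WhitespaceSketch
{
    // Copies the input character by character into a StringBuilder,
    // emitting only one space for each group of consecutive spaces.
    // Assumption: only ' ' is collapsed; tabs and newlines are copied as-is.
    public static string CollapseSpaces(string input)
    {
        if (string.IsNullOrEmpty(input))
            return input;

        // Pre-size the builder: the result is never longer than the input.
        var sb = new StringBuilder(input.Length);
        bool previousWasSpace = false;

        foreach (char c in input)
        {
            if (c == ' ')
            {
                // Keep only the first space of a run.
                if (!previousWasSpace)
                    sb.Append(c);
                previousWasSpace = true;
            }
            else
            {
                sb.Append(c);
                previousWasSpace = false;
            }
        }

        return sb.ToString();
    }
}
```

Note that this sketch does not trim: a run of trailing spaces still leaves one space at the end, so call `Trim()` on the result if that matters.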

The easier-to-type Replace variants will create a bucket load of extra strings (or waste time constructing the regex matcher).

Edit with comparison results:

Using http://ideone.com/h6pw3, with n=50 (had to reduce it on ideone because it took so long they had to kill my process), I get:

Regex: 7771ms.

StringBuilder: 894ms.

That is as expected: Regex is horribly inefficient for something this simple.
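To repeat this kind of comparison locally, something along these lines might do. It is a rough sketch, not the code from the ideone link (which is not reproduced here): the class name, the regex pattern `" {2,}"` and the `TimeMs` helper are all assumptions, and the timings will depend on the machine, the runtime and the input.

```csharp
using System;
using System.Diagnostics;
using System.Text;
using System.Text.RegularExpressions;

public static class CollapseBenchmarkSketch
{
    // Hypothetical regex variant: two or more spaces become a single space.
    private static readonly Regex MultiSpace = new Regex(" {2,}", RegexOptions.Compiled);

    public static string WithRegex(string s)
    {
        return MultiSpace.Replace(s, " ");
    }

    // StringBuilder variant: append a space only if the previous char was not a space.
    public static string WithStringBuilder(string s)
    {
        var sb = new StringBuilder(s.Length);
        bool previousWasSpace = false;
        foreach (char c in s)
        {
            bool isSpace = c == ' ';
            if (!isSpace || !previousWasSpace)
                sb.Append(c);
            previousWasSpace = isSpace;
        }
        return sb.ToString();
    }

    // Runs `action` on `input` `iterations` times and returns elapsed milliseconds.
    public static long TimeMs(Func<string, string> action, string input, int iterations)
    {
        action(input); // warm-up call so JIT compilation is not part of the measurement
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            action(input);
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }
}
```

A call such as `CollapseBenchmarkSketch.TimeMs(CollapseBenchmarkSketch.WithRegex, input, 1000)` next to the same call with `WithStringBuilder` gives two numbers to compare for whatever `input` you choose.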

Summer. ? 凉城
Reply #3 · 2019-01-13 18:24

This is funny, but on my PC the method below is just as fast as Sergey Povalyaev's StringBuilder approach (~282 ms for 1000 reps, 10k src strings). I'm not sure about memory usage, though.

string RemoveExtraWhiteSpace(string src, char[] wsChars){
   return string.Join(" ",src.Split(wsChars, StringSplitOptions.RemoveEmptyEntries));
}

Obviously it works okay with any chars - not just spaces.
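As a small illustration of the "any chars" point: `String.Split` treats a null separator array as "split on all white-space characters", so the same Split/Join idea can cover tabs and newlines without listing them. The wrapper below is only a sketch; its class and method names are made up for this example.

```csharp
using System;

public static class SplitJoinSketch
{
    // Same Split/Join idea as above, but with a null separator array,
    // which makes Split use white-space characters as the delimiters.
    public static string CollapseAnyWhitespace(string src)
    {
        string[] parts = src.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
        return string.Join(" ", parts);
    }
}
```

Because empty entries are dropped, leading and trailing white space disappears as a side effect, which may or may not be what you want.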

This is not what the OP asked for, but if what you really need is to replace specific consecutive characters in a string with a single instance, you can use this relatively efficient method:

    string RemoveDuplicateChars(string src, char[] dupes){  
        var sd = (char[])dupes.Clone();  
        Array.Sort(sd);

        var res = new StringBuilder(src.Length);

        for(int i = 0; i<src.Length; i++){
            if( i==0 || src[i]!=src[i-1] || Array.BinarySearch(sd,src[i])<0){
                res.Append(src[i]); 
            }
        }
        return res.ToString();
    }
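For comparison, here is a possible variant of the same idea that swaps the sorted array and `Array.BinarySearch` for a `HashSet<char>` lookup. It is a sketch under that substitution, not a measured improvement; the class name is invented for the example.

```csharp
using System.Collections.Generic;
using System.Text;

public static class DuplicateCharsSketch
{
    // Collapses consecutive repeats of any character listed in `dupes`
    // into a single instance; all other characters are copied unchanged.
    public static string RemoveDuplicateChars(string src, IEnumerable<char> dupes)
    {
        var set = new HashSet<char>(dupes);
        var res = new StringBuilder(src.Length);

        for (int i = 0; i < src.Length; i++)
        {
            // A character is skipped only when it repeats the previous one
            // AND it is one of the characters we were asked to collapse.
            bool repeatsPrevious = i > 0 && src[i] == src[i - 1];
            if (!repeatsPrevious || !set.Contains(src[i]))
                res.Append(src[i]);
        }

        return res.ToString();
    }
}
```

Since `string` implements `IEnumerable<char>`, the characters to collapse can be passed either as an array or as a plain string such as `"-_ "`.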
We Are One
Reply #4 · 2019-01-13 18:25

I use the methods below. They handle all whitespace characters (not only spaces), trim leading and trailing whitespace, collapse runs of whitespace, and replace every whitespace character with a space so the separator is uniform. These methods are also fast.

public static String CompactWhitespaces( String s )
{
    StringBuilder sb = new StringBuilder( s );

    CompactWhitespaces( sb );

    return sb.ToString();
}

public static void CompactWhitespaces( StringBuilder sb )
{
    if( sb.Length == 0 )
        return;

    // set [start] to first not-whitespace char or to sb.Length

    int start = 0;

    while( start < sb.Length )
    {
        if( Char.IsWhiteSpace( sb[ start ] ) )
            start++;
        else 
            break;
    }

    // if [sb] has only whitespaces, then return empty string

    if( start == sb.Length )
    {
        sb.Length = 0;
        return;
    }

    // set [end] to last not-whitespace char

    int end = sb.Length - 1;

    while( end >= 0 )
    {
        if( Char.IsWhiteSpace( sb[ end ] ) )
            end--;
        else 
            break;
    }

    // compact string

    int dest = 0;
    bool previousIsWhitespace = false;

    for( int i = start; i <= end; i++ )
    {
        if( Char.IsWhiteSpace( sb[ i ] ) )
        {
            if( !previousIsWhitespace )
            {
                previousIsWhitespace = true;
                sb[ dest ] = ' ';
                dest++;
            }
        }
        else
        {
            previousIsWhitespace = false;
            sb[ dest ] = sb[ i ];
            dest++;
        }
    }

    sb.Length = dest;
}
我想做一个坏孩纸
Reply #5 · 2019-01-13 18:25

I just whipped this up and haven't tested it yet, but I felt it was elegant, and it avoids regex:

    /// <summary>
    /// Removes extra white space.
    /// </summary>
    /// <param name="s">
    /// The string
    /// </param>
    /// <returns>
    /// The string, with only single white-space groupings. 
    /// </returns>
    public static string RemoveExtraWhiteSpace(this string s)
    {
        if (s.Length == 0)
        {
            return string.Empty;
        }

        var stringBuilder = new StringBuilder();
        var whiteSpaceCount = 0;
        foreach (var character in s)
        {
            if (char.IsWhiteSpace(character))
            {
                whiteSpaceCount++;
            }
            else
            {
                whiteSpaceCount = 0;
            }

            if (whiteSpaceCount > 1)
            {
                continue;
            }

            stringBuilder.Append(character);
        }

        return stringBuilder.ToString();
    }
劳资没心,怎么记你
Reply #6 · 2019-01-13 18:28
string q = " Hello     how are   you           doing?";
string a = String.Join(" ", q.Split(new string[] { " " }, StringSplitOptions.RemoveEmptyEntries));