The fastest way? Iterate over the string and build a second copy in a StringBuilder character by character, only copying one space for each group of spaces.
The easier to type Replace variants will create a bucket load of extra strings (or waste time building the regex DFA).
Edit with comparison results:
Using http://ideone.com/h6pw3, with n=50 (had to reduce it on ideone because it took so long they had to kill my process), I get:
Regex: 7771ms.
Stringbuilder: 894ms.
Which is indeed as expected, Regex is horribly inefficient for something this simple.
This is funny, but on my PC the below method is just as fast as Sergey Povalyaev's StringBulder approach - (~282ms for 1000 reps, 10k src strings). Not sure about memory usage though.
Obviously it works okay with any chars - not just spaces.
Though this is not what the OP asked for - but if what you really need is to replace specific consecutive characters in a string with only one instance you can use this relatively efficient method:
string RemoveDuplicateChars(string src, char[] dupes){
var sd = (char[])dupes.Clone();
Array.Sort(sd);
var res = new StringBuilder(src.Length);
for(int i = 0; i<src.Length; i++){
if( i==0 || src[i]!=src[i-1] || Array.BinarySearch(sd,src[i])<0){
res.Append(src[i]);
}
}
return res.ToString();
}
I use below methods - they handle all whitespace chars not only spaces, trim both leading and trailing whitespaces, remove extra whitespaces, and all whitespaces are replaced to space char (so we have uniform space separator). And these methods are fast.
public static String CompactWhitespaces( String s )
{
StringBuilder sb = new StringBuilder( s );
CompactWhitespaces( sb );
return sb.ToString();
}
public static void CompactWhitespaces( StringBuilder sb )
{
if( sb.Length == 0 )
return;
// set [start] to first not-whitespace char or to sb.Length
int start = 0;
while( start < sb.Length )
{
if( Char.IsWhiteSpace( sb[ start ] ) )
start++;
else
break;
}
// if [sb] has only whitespaces, then return empty string
if( start == sb.Length )
{
sb.Length = 0;
return;
}
// set [end] to last not-whitespace char
int end = sb.Length - 1;
while( end >= 0 )
{
if( Char.IsWhiteSpace( sb[ end ] ) )
end--;
else
break;
}
// compact string
int dest = 0;
bool previousIsWhitespace = false;
for( int i = start; i <= end; i++ )
{
if( Char.IsWhiteSpace( sb[ i ] ) )
{
if( !previousIsWhitespace )
{
previousIsWhitespace = true;
sb[ dest ] = ' ';
dest++;
}
}
else
{
previousIsWhitespace = false;
sb[ dest ] = sb[ i ];
dest++;
}
}
sb.Length = dest;
}
The fastest way? Iterate over the string and build a second copy in a
StringBuilder
character by character, only copying one space for each group of spaces.The easier to type
Replace
variants will create a bucket load of extra strings (or waste time building the regex DFA).Edit with comparison results:
Using http://ideone.com/h6pw3, with n=50 (had to reduce it on ideone because it took so long they had to kill my process), I get:
Which is indeed as expected,
Regex
is horribly inefficient for something this simple.This is funny, but on my PC the below method is just as fast as Sergey Povalyaev's StringBulder approach - (~282ms for 1000 reps, 10k src strings). Not sure about memory usage though.
Obviously it works okay with any chars - not just spaces.
Though this is not what the OP asked for - but if what you really need is to replace specific consecutive characters in a string with only one instance you can use this relatively efficient method:
I use below methods - they handle all whitespace chars not only spaces, trim both leading and trailing whitespaces, remove extra whitespaces, and all whitespaces are replaced to space char (so we have uniform space separator). And these methods are fast.
I just whipped this up, haven't tested it yet though. But I felt this was elegant, and avoids regex: