Most efficient way to determine if a string length

2020-07-05 04:53发布

问题:

I'm trying to speed up the following:

string s; //--> s is never null

if (s.Length != 0)
{
   <do something>
}

Problem is, it appears the .Length actually counts the characters in the string, and this is way more work than I need. Anybody have an idea on how to speed this up?

Or, is there a way to determine if s[0] exists, w/out checking the rest of the string?

回答1:

EDIT: Now that you've provided some more context:

  • Trying to reproduce this, I failed to find a bottleneck in string.Length at all. The only way of making it faster was to comment out both the test and the body of the if block - which isn't really fair. Just commenting out the condition slowed things down, i.e. unconditionally copying the reference was slower than checking the condition.

  • As has been pointed out, using the overload of string.Split which removes empty entries for you is the real killer optimization.

  • You can go further, by avoiding creating a new char array with just a space in every time. You're always going to pass the same thing effectively, so why not take advantage of that?

  • Empty arrays are effectively immutable. You can optimize the null/empty case by always returning the same thing.

The optimized code becomes:

private static readonly char[] Delimiters = " ".ToCharArray();
private static readonly string[] EmptyArray = new string[0];

public static string[] SplitOnMultiSpaces(string text)
{
    if (string.IsNullOrEmpty(text))
    {
        return EmptyArray;
    }

    return text.Split(Delimiters, StringSplitOptions.RemoveEmptyEntries);
}

String.Length absolutely does not count the letters in the string. The value is stored as a field - although I seem to remember that the top bit of that field is used to remember whether or not all characters are ASCII (or used to be, anyway) to enable other optimisations. So the property access may need to do a bitmask, but it'll still be O(1) and I'd expect the JIT to inline it, too. (It's implemented as an extern, but hopefully that wouldn't affect the JIT in this case - I suspect it's a common enough operation to potentially have special support.)

If you already know that the string isn't null, then your existing test of

if (s.Length != 0)

is the best way to go if you're looking for raw performance IMO. Personally in most cases I'd write:

if (s != "")

to make it clearer that we're not so much interested in the length as a value as whether or not this is the empty string. That will be slightly slower than the length test, but I believe it's clearer. As ever, I'd go for the clearest code until you have benchmark/profiling data to indicate that this really is a bottleneck. I know your question is explicitly about finding the most efficient test, but I thought I'd mention this anyway. Do you have evidence that this is a bottleneck?

EDIT: Just to give clearer reasons for my suggestion of not using string.IsNullOrEmpty: a call to that method suggests to me that the caller is explicitly trying to deal with the case where the variable is null, otherwise they wouldn't have mentioned it. If at this point of the code it counts as a bug if the variable is null, then you shouldn't be trying to handle it as a normal case.

In this situation, the Length check is actually better in one way than the inequality test I've suggested: it acts as an implicit assertion that the variable isn't null. If you have a bug and it is null, the test will throw an exception and the bug will be detected early. If you use the equality test it will treat null as being different to the empty string, so it will go into your "if" statement's body. If you use string.IsNullOrEmpty it will treat null as being the same as empty, so it won't go into the block.



回答2:

String.IsNullOrEmpty is the preferred method for checking for null or zero length strings.

Internally, it will use Length. The Length property for a string should not be calculated on the fly though.

If you're absolutely certain that the string will never be null and you have some strong objection to String.IsNullOrEmpty, the most efficient code I can think of would be:

if(s.Length > 0)
{
    // Do Something
}

Or, possibly even better:

if(s != "")
{
    // Do Something
}


回答3:

Accessing the Length property shouldn't do a count -- .NET strings store a count inside the object.

The SSCLI/Rotor source code contains an interesting comment which suggests that String.Length is (a) efficient and (b) magic:

// Gets the length of this string
//
/// This is a EE implemented function so that the JIT can recognise is specially
/// and eliminate checks on character fetchs in a loop like:
/// for(int I = 0; I < str.Length; i++) str[i]
/// The actually code generated for this will be one instruction and will be inlined.
//
public extern int Length {
    [MethodImplAttribute(MethodImplOptions.InternalCall)]
    get;
}


回答4:

Here is the function String.IsNullOrEmpty -

if (!String.IsNullOrEmpty(yourstring))
{
  // your code
}


回答5:

String.IsNullOrWhiteSpace(s);

true if s is null or Empty, or if s consists exclusively of white-space characters.



回答6:

As always with performace: benchmark.
Using C# 3.5 or before, you'll want to test yourString.Length vs String.IsNullOrEmpty(yourString)

using C# 4, do both of the above and add String.IsNullOrWhiteSpace(yourString)

Of course, if you know your string will never be empty, you could just attempt to access s[0] and handle the exception when it's not there. That's not normally good practice, but it may be closer to what you need (if s should always have a non-blank value).



回答7:

        for (int i = 0; i < 100; i++)
        {
            System.Diagnostics.Stopwatch timer = new System.Diagnostics.Stopwatch();
            string s = "dsfasdfsdafasd";

            timer.Start();
            if (s.Length > 0)
            {
            }

            timer.Stop();
            System.Diagnostics.Debug.Write(String.Format("s.Length != 0 {0} ticks       ", timer.ElapsedTicks));

            timer.Reset();
            timer.Start();
            if (s == String.Empty)
            {
            }

            timer.Stop();
            System.Diagnostics.Debug.WriteLine(String.Format("s== String.Empty {0} ticks", timer.ElapsedTicks));
        }

Using the stopwatch the s.length != 0 takes less ticks then s == String.Empty

after I fix the code



回答8:

Based on your intent described in your answer, why don't you just try using this built-in option on Split:

s.Split(new[]{" "}, StringSplitOptions.RemoveEmptyEntries);


回答9:

Just use String.Split(new char[]{' '}, StringSplitOptions.RemoveEmptyEntries) and it will do it all for you.