Parsing Performance (If, TryParse, Try-Catch)

2020-01-25 01:36发布

问题:

I know plenty about the different ways of handling parsing text for information. For parsing integers for example, what kind of performance can be expected. I am wondering if anyone knows of any good stats on this. I am looking for some real numbers from someone who has tested this.

Which of these offers the best performance in which situations?

Parse(...)  // Crash if the case is extremely rare .0001%

If (SomethingIsValid) // Check the value before parsing
    Parse(...)

TryParse(...) // Using TryParse

try
{
    Parse(...)
}
catch
{
    // Catch any thrown exceptions
}

回答1:

Always use T.TryParse(string str, out T value). Throwing exceptions is expensive and should be avoided if you can handle the situation a priori. Using a try-catch block to "save" on performance (because your invalid data rate is low) is an abuse of exception handling at the expense of maintainability and good coding practices. Follow sound software engineering development practices, write your test cases, run your application, THEN benchmark and optimize.

"We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%" -Donald Knuth

Therefore you assign, arbitrarily like in carbon credits, that the performance of try-catch is worse and that the performance of TryParse is better. Only after we've run our application and determined that we have some sort of slowdown w.r.t. string parsing would we even consider using anything other than TryParse.

(edit: since it appears the questioner wanted timing data to go with good advice, here is the timing data requested)

Times for various failure rates on 10,000 inputs from the user (for the unbelievers):

Failure Rate      Try-Catch          TryParse        Slowdown
  0%           00:00:00.0131758   00:00:00.0120421      0.1
 10%           00:00:00.1540251   00:00:00.0087699     16.6
 20%           00:00:00.2833266   00:00:00.0105229     25.9
 30%           00:00:00.4462866   00:00:00.0091487     47.8
 40%           00:00:00.6951060   00:00:00.0108980     62.8
 50%           00:00:00.7567745   00:00:00.0087065     85.9
 60%           00:00:00.7090449   00:00:00.0083365     84.1
 70%           00:00:00.8179365   00:00:00.0088809     91.1
 80%           00:00:00.9468898   00:00:00.0088562    105.9
 90%           00:00:01.0411393   00:00:00.0081040    127.5
100%           00:00:01.1488157   00:00:00.0078877    144.6


/// <param name="errorRate">Rate of errors in user input</param>
/// <returns>Total time taken</returns>
public static TimeSpan TimeTryCatch(double errorRate, int seed, int count)
{
    Stopwatch stopwatch = new Stopwatch();
    Random random = new Random(seed);
    string bad_prefix = @"X";

    stopwatch.Start();
    for(int ii = 0; ii < count; ++ii)
    {
        string input = random.Next().ToString();
        if (random.NextDouble() < errorRate)
        {
           input = bad_prefix + input;
        }

        int value = 0;
        try
        {
            value = Int32.Parse(input);
        }
        catch(FormatException)
        {
            value = -1; // we would do something here with a logger perhaps
        }
    }
    stopwatch.Stop();

    return stopwatch.Elapsed;
}

/// <param name="errorRate">Rate of errors in user input</param>
/// <returns>Total time taken</returns>
public static TimeSpan TimeTryParse(double errorRate, int seed, int count)
{
    Stopwatch stopwatch = new Stopwatch();
    Random random = new Random(seed);
    string bad_prefix = @"X";

    stopwatch.Start();
    for(int ii = 0; ii < count; ++ii)
    {
        string input = random.Next().ToString();
        if (random.NextDouble() < errorRate)
        {
           input = bad_prefix + input;
        }

        int value = 0;
        if (!Int32.TryParse(input, out value))
        {
            value = -1; // we would do something here with a logger perhaps
        }
    }
    stopwatch.Stop();

    return stopwatch.Elapsed;
}

public static void TimeStringParse()
{
    double errorRate = 0.1; // 10% of the time our users mess up
    int count = 10000; // 10000 entries by a user

    TimeSpan trycatch = TimeTryCatch(errorRate, 1, count);
    TimeSpan tryparse = TimeTryParse(errorRate, 1, count);

    Console.WriteLine("trycatch: {0}", trycatch);
    Console.WriteLine("tryparse: {0}", tryparse);
}


回答2:

Although I haven't personally profiled the different ways, this chap has:

http://blogs.msdn.com/ianhu/archive/2005/12/19/505702.aspx



回答3:

Try-Catch will always be the slower. TryParse will be faster.

The IF and TryParse are the same.



回答4:

Here's another chap who's also profiled the performance differences, and does so with a few different data types to see how much (if at all) they inherently affect the performance: int, bool, and DateTime.



回答5:

Option 1: Will throw an exception on bad data.
Option 2: SomethingIsValid() could be quite expensive - particularly if you are pre-checking a string for Integer parsability.
Option 3: I like this.  You need a null check afterwards, but it's pretty cheap.
Option 4 is definitely the worst.

Exception handling is comparatively expensive, so avoid it if you can.

In particular, bad inputs are to be expected, not exceptional, so you shouldn't use them for this situation.

(Although, before TryParse, It may have been the best option.)



标签: c# parsing text