Regex vs TryParse: which performs better?

Posted 2019-02-21 21:11

Question:

In my ASP.NET project I need to validate user input against some basic data types, such as numeric, decimal, and datetime.

Which approach is better in terms of performance: Regex.IsMatch() or TryParse()?

Thanks in advance.

Answer 1:

As others would say, the best way to answer that is to measure it ;)

    using System;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Linq;
    using System.Text.RegularExpressions;

    class Program
    {
        static void Main(string[] args)
        {
            // Per-call timings in milliseconds for each of the four cases.
            List<double> meansFailedTryParse = new List<double>();
            List<double> meansFailedRegEx = new List<double>();
            List<double> meansSuccessTryParse = new List<double>();
            List<double> meansSuccessRegEx = new List<double>();

            for (int i = 0; i < 1000; i++)
            {
                // Invalid input: both checks are expected to fail.
                string input = "123abc";

                int res;
                bool res2;
                var sw = Stopwatch.StartNew();
                res2 = Int32.TryParse(input, out res);
                sw.Stop();
                meansFailedTryParse.Add(sw.Elapsed.TotalMilliseconds);

                // Note: "*" also matches an empty string; the inputs here are never empty.
                sw = Stopwatch.StartNew();
                res2 = Regex.IsMatch(input, @"^[0-9]*$");
                sw.Stop();
                meansFailedRegEx.Add(sw.Elapsed.TotalMilliseconds);

                // Valid input: both checks are expected to succeed.
                input = "123";
                sw = Stopwatch.StartNew();
                res2 = Int32.TryParse(input, out res);
                sw.Stop();
                meansSuccessTryParse.Add(sw.Elapsed.TotalMilliseconds);

                sw = Stopwatch.StartNew();
                res2 = Regex.IsMatch(input, @"^[0-9]*$");
                sw.Stop();
                meansSuccessRegEx.Add(sw.Elapsed.TotalMilliseconds);
            }

            Console.WriteLine("Failed TryParse mean execution time     " + meansFailedTryParse.Average());
            Console.WriteLine("Failed Regex mean execution time        " + meansFailedRegEx.Average());

            Console.WriteLine("Successful TryParse mean execution time " + meansSuccessTryParse.Average());
            Console.WriteLine("Successful Regex mean execution time    " + meansSuccessRegEx.Average());
        }
    }
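
One caveat when timing single calls: each call takes far less than a millisecond, so individual Stopwatch readings are easily dominated by timer resolution and by one-time costs such as JIT compilation and regex construction. A possible variation, sketched here, makes one warm-up call and then times a whole batch under a single Stopwatch. The names `BatchBenchmark` and `TimeBatch` and the iteration count are illustrative choices, not part of the original answer.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

static class BatchBenchmark
{
    // Hypothetical helper: runs `check` `iterations` times under one
    // Stopwatch and returns the total elapsed time in milliseconds.
    public static double TimeBatch(Func<string, bool> check, string input, int iterations)
    {
        check(input); // warm-up call so JIT and regex setup are not timed

        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            check(input);
        }
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds;
    }

    static void Main()
    {
        const int iterations = 1000000; // arbitrary; large enough to swamp timer noise
        var digitsOnly = new Regex(@"^[0-9]*$"); // same pattern as above, built once and reused

        foreach (string input in new[] { "123abc", "123" })
        {
            double tryParseMs = TimeBatch(s => int.TryParse(s, out _), input, iterations);
            double regexMs = TimeBatch(s => digitsOnly.IsMatch(s), input, iterations);
            Console.WriteLine("\"" + input + "\": TryParse " + tryParseMs + " ms, Regex " + regexMs + " ms");
        }
    }
}
```

The absolute numbers depend on the machine and runtime, so treat them as a rough comparison rather than a definitive result.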


Answer 2:

TryParse and Regex.IsMatch are used for two fundamentally different things. Regex.IsMatch tells you if the string in question matches some particular pattern. It returns a yes/no answer. TryParse actually converts the value if possible, and tells you whether it succeeded.

Unless you're very careful in crafting the regular expression, Regex.IsMatch can return true when TryParse will return false. For example, consider the simple case of parsing a byte. With TryParse you have:

    byte b;
    bool isGood = byte.TryParse(myString, out b);

If myString holds an integer between 0 and 255, TryParse will return true.

Now, let's try with Regex.IsMatch. Let's see, what should that regular expression be? We can't just say @"\d+" or even @"\d{1,3}". Specifying the format becomes a very difficult job. You have to handle leading 0s, leading and trailing white space, and allow 255 but not 256.

And that's just for parsing a 3-digit number. The rules get even more complicated when you're parsing an int or long.
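
To make the mismatch concrete, here is a small sketch that runs a naive pattern and byte.TryParse over the same inputs. The anchored pattern `^\d{1,3}$`, the class name `ByteCheckDemo`, and the helper names are illustrative assumptions, not something from the original answer.

```csharp
using System;
using System.Text.RegularExpressions;

static class ByteCheckDemo
{
    // Naive form check: one to three digits and nothing else.
    public static bool LooksLikeByte(string s)
    {
        return Regex.IsMatch(s, @"^\d{1,3}$");
    }

    // Value check: true only if s parses to an integer in 0..255.
    public static bool IsByte(string s)
    {
        return byte.TryParse(s, out _);
    }

    static void Main()
    {
        foreach (string s in new[] { "255", "256", "999", " 42 ", "0007" })
        {
            Console.WriteLine("\"" + s + "\": regex=" + LooksLikeByte(s) + ", TryParse=" + IsByte(s));
        }
    }
}
```

The two checks agree on "255" but disagree on the rest: the pattern accepts "256" and "999", which do not fit in a byte, and rejects " 42 " and "0007", which byte.TryParse accepts because it allows surrounding white space and leading zeros by default.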

Regular expressions are great for determining form. They suck when it comes to determining value. Since our standard data types all have limits, checking the value is part of figuring out whether or not the number is valid.

You're better off using TryParse whenever possible, if only to save yourself the headache of trying to come up with a reliable regular expression that will do the validation. It's likely (I'd say almost certain) that a particular TryParse for any of the native types will execute faster than the equivalent regular expression.

That said, I've probably spent more time on this answer than your Web page will spend executing TryParse or Regex.IsMatch over its entire life. The time these calls take is so small in the context of everything else your Web site is doing that any time you spend pondering the problem is wasted.

Use TryParse if you can, because it's easier. Otherwise use Regex.



Answer 3:

Don't try to make regexes do everything.

Sometimes a simple regex will get you 90% of the way, but making it do everything you need increases its complexity tenfold or more.

Then I often find that the simplest solution is to use the regex to check the form and then rely on good old code for the value checking.

Take a date, for example: use a regex to check that the input matches a date format, and then use capturing groups to check the values of the individual components.
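
A minimal sketch of that split, assuming a yyyy-MM-dd input format (the format, the class name `DateFormCheck`, and the method name `IsValidDate` are assumptions made for illustration): the regex checks only the form, and plain code checks the captured values.

```csharp
using System;
using System.Text.RegularExpressions;

static class DateFormCheck
{
    // Form only: four digits, dash, two digits, dash, two digits.
    // \z anchors at the very end of the input (unlike $, which also
    // matches just before a trailing newline).
    static readonly Regex IsoDateForm =
        new Regex(@"^([0-9]{4})-([0-9]{2})-([0-9]{2})\z");

    public static bool IsValidDate(string s)
    {
        Match m = IsoDateForm.Match(s);
        if (!m.Success) return false;

        // The groups contain only ASCII digits, so int.Parse cannot fail here.
        int year = int.Parse(m.Groups[1].Value);
        int month = int.Parse(m.Groups[2].Value);
        int day = int.Parse(m.Groups[3].Value);

        // Value checks in ordinary code; DaysInMonth needs year 1..9999
        // and month 1..12, so those are checked first.
        if (year < 1 || month < 1 || month > 12) return false;
        return day >= 1 && day <= DateTime.DaysInMonth(year, month);
    }

    static void Main()
    {
        foreach (string s in new[] { "2019-02-21", "2019-02-30", "2019-13-01", "21/02/2019" })
        {
            Console.WriteLine(s + " -> " + IsValidDate(s));
        }
    }
}
```

Only "2019-02-21" passes: "2019-02-30" and "2019-13-01" have the right form but fail the value checks, and "21/02/2019" fails the form check.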



Answer 4:

I'd guess TryParse is quicker, but more importantly, it's more expressive.

The regular expressions can get pretty ugly when you consider all the valid values for each data type you're using. For example, with DateTime you have to ensure the month is between 1 and 12, and that the day is within the valid range for that particular month.
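
For comparison, a short sketch of the TryParse route for dates, again assuming a yyyy-MM-dd input format and the invariant culture (both are assumptions, as are the names `DateTryParseDemo` and `IsIsoDate`). DateTime.TryParseExact applies the month and day range rules itself, including leap years.

```csharp
using System;
using System.Globalization;

static class DateTryParseDemo
{
    // True only if s is a real calendar date written as yyyy-MM-dd.
    public static bool IsIsoDate(string s)
    {
        return DateTime.TryParseExact(
            s,
            "yyyy-MM-dd",
            CultureInfo.InvariantCulture,
            DateTimeStyles.None,
            out _);
    }

    static void Main()
    {
        foreach (string s in new[] { "2019-02-28", "2019-02-29", "2020-02-29", "2019-13-01" })
        {
            Console.WriteLine(s + " -> " + IsIsoDate(s));
        }
    }
}
```

Here "2019-02-28" and "2020-02-29" are accepted, while "2019-02-29" (not a leap year) and "2019-13-01" are rejected, with no range logic written by hand.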