In my ASP.NET project I need to validate user input against some basic data types, such as numeric, decimal, and datetime.
Which approach is best in terms of performance: Regex.IsMatch() or TryParse()?
Thanks in advance.
As others would say, the best way to answer that is to measure it ;)
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text.RegularExpressions;

class Program
{
    static void Main(string[] args)
    {
        // Per-call timings (in milliseconds) for each of the four cases.
        List<double> meansFailedTryParse = new List<double>();
        List<double> meansFailedRegEx = new List<double>();
        List<double> meansSuccessTryParse = new List<double>();
        List<double> meansSuccessRegEx = new List<double>();

        for (int i = 0; i < 1000; i++)
        {
            // Invalid input: both checks should reject it.
            string input = "123abc";
            int res;
            bool res2;

            var sw = Stopwatch.StartNew();
            res2 = Int32.TryParse(input, out res);
            sw.Stop();
            meansFailedTryParse.Add(sw.Elapsed.TotalMilliseconds);
            //Console.WriteLine("Result of " + res2 + " try parse :" + sw.Elapsed.TotalMilliseconds);

            sw = Stopwatch.StartNew();
            res2 = Regex.IsMatch(input, @"^[0-9]*$");
            sw.Stop();
            meansFailedRegEx.Add(sw.Elapsed.TotalMilliseconds);
            //Console.WriteLine("Result of " + res2 + " Regex.IsMatch :" + sw.Elapsed.TotalMilliseconds);

            // Valid input: both checks should accept it.
            input = "123";

            sw = Stopwatch.StartNew();
            res2 = Int32.TryParse(input, out res);
            sw.Stop();
            meansSuccessTryParse.Add(sw.Elapsed.TotalMilliseconds);
            //Console.WriteLine("Result of " + res2 + " try parse :" + sw.Elapsed.TotalMilliseconds);

            sw = Stopwatch.StartNew();
            res2 = Regex.IsMatch(input, @"^[0-9]*$");
            sw.Stop();
            meansSuccessRegEx.Add(sw.Elapsed.TotalMilliseconds);
            //Console.WriteLine("Result of " + res2 + " Regex.IsMatch :" + sw.Elapsed.TotalMilliseconds);
        }

        // Report the mean time per call for each case.
        Console.WriteLine("Failed TryParse mean execution time " + meansFailedTryParse.Average());
        Console.WriteLine("Failed Regex mean execution time " + meansFailedRegEx.Average());
        Console.WriteLine("successful TryParse mean execution time " + meansSuccessTryParse.Average());
        Console.WriteLine("successful Regex mean execution time " + meansSuccessRegEx.Average());
    }
}
TryParse and Regex.IsMatch are used for two fundamentally different things. Regex.IsMatch tells you if the string in question matches some particular pattern. It returns a yes/no answer. TryParse actually converts the value if possible, and tells you whether it succeeded.
Unless you're very careful in crafting the regular expression, Regex.IsMatch can return true when TryParse will return false. For example, consider the simple case of parsing a byte. With TryParse you have:
byte b;
bool isGood = byte.TryParse(myString, out b);
If the value in myString is between 0 and 255, TryParse will return true.
Now, let's try with Regex.IsMatch. Let's see, what should that regular expression be? We can't just say @"\d+" or even @"\d{1,3}". Specifying the format becomes a very difficult job. You have to handle leading 0s, leading and trailing white space, and allow 255 but not 256.
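To make the mismatch concrete, here is a minimal sketch comparing a naive "one to three digits" pattern with byte.TryParse. The class and method names (ByteCheck, MatchesForm, ParsesAsByte) are made up for illustration, and the pattern is just one plausible first attempt, not a recommended one.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ByteCheck
{
    // Hypothetical first attempt at "looks like a byte": one to three ASCII digits.
    private static readonly Regex ThreeDigits = new Regex(@"^[0-9]{1,3}\z");

    // Checks form only: is the string one to three digits?
    public static bool MatchesForm(string s)
    {
        return s != null && ThreeDigits.IsMatch(s);
    }

    // Checks form and value: does the string convert to a byte (0 to 255)?
    public static bool ParsesAsByte(string s)
    {
        byte ignored;
        return byte.TryParse(s, out ignored);
    }
}
```

With this sketch, "256" passes MatchesForm but fails ParsesAsByte, while " 42 " (with surrounding spaces) fails MatchesForm but passes ParsesAsByte, because the default TryParse overload tolerates leading and trailing white space.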
And that's just for parsing a 3-digit number. The rules get even more complicated when you're parsing an int or long.
Regular expressions are great for determining form. They suck when it comes to determining value. Since our standard data types all have limits, checking the value is part of figuring out whether or not the number is valid.
You're better off using TryParse whenever possible, if only to save yourself the headache of trying to come up with a reliable regular expression that will do the validation. It's likely (I'd say almost certain) that a particular TryParse for any of the native types will execute faster than the equivalent regular expression.
That said, I've probably spent more time on this answer than your Web page will spend executing your TryParse or Regex.IsMatch in total, throughout its entire life. The time to execute these things is so small in the context of everything else your Web site is doing that any time you spend pondering the problem is wasted.
Use TryParse if you can, because it's easier. Otherwise use Regex.
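For the numeric and decimal cases from the question, a TryParse-based validator might look like the sketch below. InputValidator and its method names are hypothetical, and the choice of CultureInfo.InvariantCulture is an assumption; a real site may want the user's culture instead.

```csharp
using System;
using System.Globalization;

public static class InputValidator
{
    // True if s converts to a 32-bit integer (form and range are both checked).
    public static bool IsInt(string s)
    {
        int ignored;
        return int.TryParse(s, NumberStyles.Integer, CultureInfo.InvariantCulture, out ignored);
    }

    // True if s converts to a decimal, using "." as the decimal separator
    // (an assumption of this sketch; pass another culture to change it).
    public static bool IsDecimal(string s)
    {
        decimal ignored;
        return decimal.TryParse(s, NumberStyles.Number, CultureInfo.InvariantCulture, out ignored);
    }
}
```

Here IsInt("123") is true, IsInt("123abc") is false, and IsInt("2147483648") is false because the value overflows an int, which is exactly the kind of value check that is painful to express as a pattern.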
Don't try to make regexes do everything.
Sometimes a simple regex will get you 90% of the way, but making it do everything you need increases the complexity tenfold or more.
Then I often find that the simplest solution is to use the regex to check the form and then rely on good old code for the value checking.
Take a date, for example: use a regex to check for a match on a date format, and then use capturing groups to check the values of the individual components.
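A rough sketch of that two-step approach, assuming a yyyy-MM-dd input format (the format, the class name DateFormCheck, and the method name IsValidDate are all illustrative choices, not something from the question):

```csharp
using System;
using System.Text.RegularExpressions;

public static class DateFormCheck
{
    // Step 1, form only: four digits, dash, two digits, dash, two digits.
    private static readonly Regex IsoDate =
        new Regex(@"^([0-9]{4})-([0-9]{2})-([0-9]{2})\z");

    public static bool IsValidDate(string s)
    {
        if (s == null) return false;

        Match m = IsoDate.Match(s);
        if (!m.Success) return false;

        // Step 2, value: plain code checks the captured parts.
        int year = int.Parse(m.Groups[1].Value);
        int month = int.Parse(m.Groups[2].Value);
        int day = int.Parse(m.Groups[3].Value);

        if (year < 1 || month < 1 || month > 12) return false;
        return day >= 1 && day <= DateTime.DaysInMonth(year, month);
    }
}
```

The regex stays trivial, and the leap-year and days-per-month rules are left to DateTime.DaysInMonth rather than encoded in the pattern.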
I'd guess TryParse is quicker, but more importantly, it's more expressive.
The regular expressions can get pretty ugly when you consider all the valid values for each data type you're using. For example, with DateTime you have to ensure the month is between 1 and 12, and that the day is within the valid range for that particular month.
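By contrast, the DateTime parser already knows those calendar rules. A minimal sketch, again assuming a fixed yyyy-MM-dd format and using a hypothetical DateCheck.IsIsoDate helper:

```csharp
using System;
using System.Globalization;

public static class DateCheck
{
    // True only if s is a real calendar date written as yyyy-MM-dd.
    // The fixed format and invariant culture are assumptions of this sketch.
    public static bool IsIsoDate(string s)
    {
        DateTime ignored;
        return DateTime.TryParseExact(
            s,
            "yyyy-MM-dd",
            CultureInfo.InvariantCulture,
            DateTimeStyles.None,
            out ignored);
    }
}
```

IsIsoDate("2023-02-28") returns true, while IsIsoDate("2023-02-30") and IsIsoDate("2023-13-01") return false, with no month or day logic written by hand.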