How would you check if a String was a number before parsing it?
As @CraigTP mentioned in his excellent answer, I have similar performance concerns about using exceptions to test whether a string is numeric. So I ended up splitting the string into characters and using java.lang.Character.isDigit().
According to the Javadoc, Character.isDigit(char) correctly recognizes non-Latin digits. Performance-wise, I think N simple comparisons, where N is the number of characters in the string, would be more computationally efficient than regex matching.
UPDATE: As pointed out by Jean-François Corbett in the comments, the above code would only validate positive integers, which covers the majority of my use cases. Below is the updated code that correctly validates decimal numbers according to the default locale used in your system, with the assumption that the decimal separator occurs only once in the string.
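A minimal sketch of the character-by-character approach described above, not necessarily the author's exact code. The class and method names (DigitCheck, isPositiveInteger, isDecimal) are hypothetical, and the locale is taken as a parameter so the caller can pass Locale.getDefault() to get the default-locale behavior:

```java
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical helper class, for illustration only.
class DigitCheck {

    // True if every character is a digit according to Character.isDigit,
    // so this accepts positive integers only (including non-Latin digits).
    static boolean isPositiveInteger(String s) {
        if (s == null || s.isEmpty()) {
            return false;
        }
        for (char c : s.toCharArray()) {
            if (!Character.isDigit(c)) {
                return false;
            }
        }
        return true;
    }

    // Accepts an optional leading minus sign and at most one decimal
    // separator, both taken from the given locale's symbols.
    static boolean isDecimal(String s, Locale locale) {
        if (s == null || s.isEmpty()) {
            return false;
        }
        DecimalFormatSymbols symbols = DecimalFormatSymbols.getInstance(locale);
        char separator = symbols.getDecimalSeparator();
        char minus = symbols.getMinusSign();
        boolean seenSeparator = false;
        boolean seenDigit = false;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isDigit(c)) {
                seenDigit = true;
            } else if (i == 0 && c == minus) {
                // a leading minus sign is allowed
            } else if (c == separator && !seenSeparator) {
                seenSeparator = true;
            } else {
                return false;
            }
        }
        // Require at least one digit so "-" or "." alone is rejected.
        return seenDigit;
    }
}
```

Calling DigitCheck.isDecimal(input, Locale.getDefault()) would give the default-locale behavior mentioned in the update.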
Why is everyone pushing for exception/regex solutions?
While I can understand most people are fine with using try/catch, if you want to do it frequently... it can be extremely taxing.
What I did here was take the regex, the parseNumber() methods, and the array searching method to see which was the most efficient. This time, I only looked at integer numbers.
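For reference, the three kinds of check being compared could look roughly like the sketch below. This is an assumption about their general shape, not the benchmarked code itself, and the names (IntegerChecks, byRegex, byException, byCharScan) are hypothetical:

```java
import java.util.regex.Pattern;

// Hypothetical class showing one possible form of each approach.
class IntegerChecks {

    // Precompiled so the pattern is not rebuilt on every call.
    private static final Pattern INTEGER = Pattern.compile("-?\\d+");

    // Regex approach: optional minus sign followed by ASCII digits.
    static boolean byRegex(String s) {
        return s != null && INTEGER.matcher(s).matches();
    }

    // Exception approach: let Integer.parseInt decide and catch the failure.
    static boolean byException(String s) {
        try {
            Integer.parseInt(s);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    // Character-scan approach: compare each char against '0'..'9'.
    static boolean byCharScan(String s) {
        if (s == null || s.isEmpty()) {
            return false;
        }
        int start = s.charAt(0) == '-' ? 1 : 0;
        if (start == s.length()) {
            return false; // a lone "-" is not a number
        }
        for (int i = start; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c < '0' || c > '9') {
                return false;
            }
        }
        return true;
    }
}
```

Note that the three are not strictly equivalent: the exception-based check also rejects values that overflow an int, while the other two only look at the characters.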
The results in speed I got were:
Disclaimer: I'm not claiming these methods are 100% optimized, they're just for demonstration of the data
Exceptions won if and only if the number is 4 characters or less, and every string is always a number... in which case, why even have a check?
In short, it is extremely painful if you run into invalid numbers frequently with the try/catch, which makes sense. An important rule I always follow is NEVER use try/catch for program flow. This is an example why.
Interestingly, the simple if char <0 || >9 was extremely simple to write, easy to remember (and should work in multiple languages) and wins almost all the test scenarios.
The only downside is that I'm guessing Integer.parseInt() might handle non-ASCII digits, whereas the array searching method does not.
For those wondering why I said the character array one is easy to remember: if you know there are no negative signs, you can easily get away with something as condensed as this:
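A sketch of what such a condensed check might look like, assuming non-negative input and ASCII digits only (the class name AsciiDigits is hypothetical):

```java
// Hypothetical class name, for illustration only.
class AsciiDigits {

    // Returns false as soon as a character falls outside '0'..'9'.
    // Note: an empty string passes this check; add a length test
    // if that matters for your use case.
    static boolean isNumeric(String str) {
        if (str == null) {
            return false;
        }
        for (char c : str.toCharArray()) {
            if (c < '0' || c > '9') {
                return false;
            }
        }
        return true;
    }
}
```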
Lastly, as a final note, I was curious about the assignment operator in the accepted example with all the upvotes. Adding in the assignment of
is not only useless, since you don't even use the value, but it also wastes processing time and increases the runtime by a few nanoseconds (which led to a 100-200 ms increase in the tests). I can't see why anyone would do that, since it is extra work that only reduces performance.
You'd think that would be optimized out... though maybe I should check the bytecode and see what the compiler is doing. That doesn't explain why it always showed up as lengthier for me though if it somehow is optimized out... therefore I wonder what's going on. As a note: By lengthier, I mean running the test for 10000000 iterations, and running that program multiple times (10x+) always showed it to be slower.
EDIT: Updated a test for Character.isDigit()
To match only positive base-ten integers that contain only ASCII digits, use:
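A minimal sketch of such a check, assuming a regular expression over the ASCII digit range (the class and method names are hypothetical):

```java
// Hypothetical class name, for illustration only.
class AsciiIntegerRegex {

    // "[0-9]+" requires one or more ASCII digits and nothing else,
    // so signs, decimal points, and non-ASCII digits are rejected.
    static boolean isNumeric(String maybeNumeric) {
        return maybeNumeric != null && maybeNumeric.matches("[0-9]+");
    }
}
```

String.matches compiles the pattern on each call, so for hot paths a precompiled java.util.regex.Pattern may be preferable.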
With Apache Commons Lang 3.5 and above: NumberUtils.isCreatable or StringUtils.isNumeric.
With Apache Commons Lang 3.4 and below: NumberUtils.isNumber or StringUtils.isNumeric.
You can also use StringUtils.isNumericSpace, which returns true for empty strings and ignores internal spaces in the string. (The linked javadocs contain detailed examples for each method.)
If you are using Java to develop an Android app, you could use the TextUtils.isDigitsOnly function.
That's why I like the Try* approach in .NET. In addition to the traditional Parse method, which is like the Java one, you also have a TryParse method. I'm not good with Java syntax (out parameters?), so please treat the following as some kind of pseudo-code. It should make the concept clear though.
Usage:
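Java has no out parameters, so one way to sketch the TryParse idea is to return an OptionalInt instead. The helper name tryParseInt and the class TryParse are hypothetical, and this is only one possible rendering of the concept:

```java
import java.util.OptionalInt;

// Hypothetical class, for illustration only.
class TryParse {

    // Returns the parsed value, or an empty OptionalInt if the string
    // is not a valid int. The exception stays inside this helper.
    static OptionalInt tryParseInt(String s) {
        try {
            return OptionalInt.of(Integer.parseInt(s));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        OptionalInt parsed = tryParseInt("42");
        if (parsed.isPresent()) {
            System.out.println("Parsed: " + parsed.getAsInt());
        } else {
            System.out.println("Not a number");
        }
    }
}
```

This keeps the calling code free of try/catch, although the helper itself still relies on an exception internally, so it does not remove the cost for invalid input.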