Fastest way to check if a String can be parsed to

2019-06-15 13:25发布

问题:

I know there's a million ways of doing this but what is the fastest? This should include scientific notation.

NOTE: I'm not interested in converting the value to Double, i'm only interested in knowing if it's possible. i.e. private boolean isDouble(String value).

回答1:

You can check it using the same regular expression the Double class uses. It's well documented here:

http://docs.oracle.com/javase/6/docs/api/java/lang/Double.html#valueOf%28java.lang.String%29

Here is the code part:

To avoid calling this method on an invalid string and having a NumberFormatException be thrown, the regular expression below can be used to screen the input string:

  final String Digits     = "(\\p{Digit}+)";
  final String HexDigits  = "(\\p{XDigit}+)";

        // an exponent is 'e' or 'E' followed by an optionally 
        // signed decimal integer.
        final String Exp        = "[eE][+-]?"+Digits;
        final String fpRegex    =
            ("[\\x00-\\x20]*"+  // Optional leading "whitespace"
             "[+-]?(" + // Optional sign character
             "NaN|" +           // "NaN" string
             "Infinity|" +      // "Infinity" string

             // A decimal floating-point string representing a finite positive
             // number without a leading sign has at most five basic pieces:
             // Digits . Digits ExponentPart FloatTypeSuffix
             // 
             // Since this method allows integer-only strings as input
             // in addition to strings of floating-point literals, the
             // two sub-patterns below are simplifications of the grammar
             // productions from the Java Language Specification, 2nd 
             // edition, section 3.10.2.

             // Digits ._opt Digits_opt ExponentPart_opt FloatTypeSuffix_opt
             "((("+Digits+"(\\.)?("+Digits+"?)("+Exp+")?)|"+

             // . Digits ExponentPart_opt FloatTypeSuffix_opt
             "(\\.("+Digits+")("+Exp+")?)|"+

       // Hexadecimal strings
       "((" +
        // 0[xX] HexDigits ._opt BinaryExponent FloatTypeSuffix_opt
        "(0[xX]" + HexDigits + "(\\.)?)|" +

        // 0[xX] HexDigits_opt . HexDigits BinaryExponent FloatTypeSuffix_opt
        "(0[xX]" + HexDigits + "?(\\.)" + HexDigits + ")" +

        ")[pP][+-]?" + Digits + "))" +
             "[fFdD]?))" +
             "[\\x00-\\x20]*");// Optional trailing "whitespace"

  if (Pattern.matches(fpRegex, myString))
            Double.valueOf(myString); // Will not throw NumberFormatException
        else {
            // Perform suitable alternative action
        }


回答2:

There is a handy NumberUtils#isNumber in Apache Commons Lang. It is a bit far fetched:

Valid numbers include hexadecimal marked with the 0x qualifier, scientific notation and numbers marked with a type qualifier (e.g. 123L).

but I guess it might be faster than regular expressions or throwing and catching an exception.



回答3:

The Apache Commons NumberUtil is actually quite fast. I'm guessing it's way faster than any regexp implementation.



回答4:

I use the following code to check if a string can be parsed to double:

public static boolean isDouble(String str) {
    if (str == null) {
        return false;
    }
    int length = str.length();
    if (length == 0) {
        return false;
    }
    int i = 0;
    if (str.charAt(0) == '-') {
        if (length == 1) {
            return false;
        }
        ++i;
    }
    int integerPartSize = 0;
    int exponentPartSize = -1;
    while (i < length) {
        char c = str.charAt(i);
        if (c < '0' || c > '9') {
            if (c == '.' && integerPartSize > 0 && exponentPartSize == -1) {
                exponentPartSize = 0;
            } else {
                return false;
            }
        } else if (exponentPartSize > -1) {
            ++exponentPartSize;
        } else {
            ++integerPartSize;
        }
        ++i;
    }
    if ((str.charAt(0) == '0' && i > 1 && exponentPartSize < 1)
            || exponentPartSize == 0 || (str.charAt(length - 1) == '.')) {
        return false;
    }
    return true;
}

I am aware that the output is not exactly the same as for the regular expression in the Double class but this method is much faster and the result is good enough for my needs. These are my unit tests for the method.

@Test
public void shouldReturnTrueIfStringIsDouble() {
    assertThat(Utils.isDouble("0.0")).isTrue();
    assertThat(Utils.isDouble("0.1")).isTrue();
    assertThat(Utils.isDouble("-0.0")).isTrue();
    assertThat(Utils.isDouble("-0.1")).isTrue();
    assertThat(Utils.isDouble("1.0067890")).isTrue();
    assertThat(Utils.isDouble("0")).isTrue();
    assertThat(Utils.isDouble("1")).isTrue();
}

@Test
public void shouldReturnFalseIfStringIsNotDouble() {
    assertThat(Utils.isDouble(".01")).isFalse();
    assertThat(Utils.isDouble("0.1f")).isFalse();
    assertThat(Utils.isDouble("a")).isFalse();
    assertThat(Utils.isDouble("-")).isFalse();
    assertThat(Utils.isDouble("-1.")).isFalse();
    assertThat(Utils.isDouble("-.1")).isFalse();
    assertThat(Utils.isDouble("123.")).isFalse();
    assertThat(Utils.isDouble("1.2.3")).isFalse();
    assertThat(Utils.isDouble("1,3")).isFalse();
}


回答5:

I think trying to convert it to a double and catching the exception would be the fastest way to check...another way I can think of, is splitting the string up by the period ('.') and then checking that each part of the split array contains only integers...but i think the first way would be faster



回答6:

I have tried below code block and seems like throwing exception more faster

String a = "123f15512551";
        System.out.println(System.currentTimeMillis());
        a.matches("^\\d+\\.\\d+$");
        System.out.println(System.currentTimeMillis());

        try{
            Double.valueOf(a);
        }catch(Exception e){
            System.out.println(System.currentTimeMillis());
        }

Output:

1324316024735
1324316024737
1324316024737


回答7:

Exceptions shouldn't be used for flow control, though Java's authors made it difficult to not use NumberFormatException that way.

The class java.util.Scanner has a method hasNextDouble to check whether a String can be read as a double.

Under the hood Scanner uses regular expressions (via pre-compiled patterns) to determine if a String can be converted to a integer or floating point number. The patterns are compiled in the method buildFloatAndDecimalPattern which you can view at GrepCode here.

A pre-compiled pattern has the added benefit of being faster than using a try/catch block.

Here's the method referenced above, in case GrepCode disappears one day:

private void buildFloatAndDecimalPattern() {
    // \\p{javaDigit} may not be perfect, see above
    String digit = "([0-9]|(\\p{javaDigit}))";
    String exponent = "([eE][+-]?"+digit+"+)?";
    String groupedNumeral = "("+non0Digit+digit+"?"+digit+"?("+
                            groupSeparator+digit+digit+digit+")+)";
    // Once again digit++ is used for performance, as above
    String numeral = "(("+digit+"++)|"+groupedNumeral+")";
    String decimalNumeral = "("+numeral+"|"+numeral +
        decimalSeparator + digit + "*+|"+ decimalSeparator +
        digit + "++)";
    String nonNumber = "(NaN|"+nanString+"|Infinity|"+
                           infinityString+")";
    String positiveFloat = "(" + positivePrefix + decimalNumeral +
                        positiveSuffix + exponent + ")";
    String negativeFloat = "(" + negativePrefix + decimalNumeral +
                        negativeSuffix + exponent + ")";
    String decimal = "(([-+]?" + decimalNumeral + exponent + ")|"+
        positiveFloat + "|" + negativeFloat + ")";
    String hexFloat =
        "[-+]?0[xX][0-9a-fA-F]*\\.[0-9a-fA-F]+([pP][-+]?[0-9]+)?";
    String positiveNonNumber = "(" + positivePrefix + nonNumber +
                        positiveSuffix + ")";
    String negativeNonNumber = "(" + negativePrefix + nonNumber +
                        negativeSuffix + ")";
    String signedNonNumber = "(([-+]?"+nonNumber+")|" +
                             positiveNonNumber + "|" +
                             negativeNonNumber + ")";
    floatPattern = Pattern.compile(decimal + "|" + hexFloat + "|" +
                                   signedNonNumber);
    decimalPattern = Pattern.compile(decimal);
}