Test whether string is a valid integer

2019-01-02 20:21发布

问题:

I'm trying to do something common enough: Parse user input in a shell script. If the user provided a valid integer, the script does one thing, and if not valid, it does something else. Trouble is, I haven't found an easy (and reasonably elegant) way of doing this - I don't want to have to pick it apart char by char.

I know this must be easy but I don't know how. I could do it in a dozen languages, but not BASH!

In my research I found this:

Regular expression to test whether a string consists of a valid real number in base 10

And there's an answer therein that talks about regex, but so far as I know, that's a function available in C (among others). Still, it had what looked like a great answer so I tried it with grep, but grep didn't know what to do with it. I tried -P which on my box means to treat it as a PERL regexp - nada. Dash E (-E) didn't work either. And neither did -F.

Just to be clear, I'm trying something like this, looking for any output - from there, I'll hack up the script to take advantage of whatever I get. (IOW, I was expecting that a non-conforming input returns nothing while a valid line gets repeated.)

snafu=$(echo "$2" | grep -E "/^[-+]?(?:\.[0-9]+|(?:0|[1-9][0-9]*)(?:\.[0-9]*)?)$/")
if [ -z "$snafu" ] ;
then
   echo "Not an integer - nothing back from the grep"
else
   echo "Integer."
fi

Would someone please illustrate how this is most easily done?

Frankly, this is a short-coming of TEST, in my opinion. It should have a flag like this

if [ -I "string" ] ;
then
   echo "String is a valid integer."
else
   echo "String is not a valid integer."
fi

回答1:

[[ $var =~ ^-?[0-9]+$ ]]
  • The ^ indicates the beginning of the input pattern
  • The - is a literal "-"
  • The ? means "0 or 1 of the preceding (-)"
  • The + means "1 or more of the preceding ([0-9])"
  • The $ indicates the end of the input pattern

So the regex matches an optional - (for the case of negative numbers), followed by one or more decimal digits.

References:

  • http://www.tldp.org/LDP/abs/html/bashver3.html#REGEXMATCHREF


回答2:

Wow... there are so many good solutions here!! Of all the solutions above, I agree with @nortally that using the -eq one liner is the coolest.

I am running GNU bash, version 4.1.5 (Debian). I have also checked this on ksh (SunSO 5.10).

Here is my version of checking if $1 is an integer or not:

if [ "$1" -eq "$1" ] 2>/dev/null
then
    echo "$1 is an integer !!"
else
    echo "ERROR: first parameter must be an integer."
    echo $USAGE
    exit 1
fi

This approach also accounts for negative numbers, which some of the other solutions will have a faulty negative result, and it will allow a prefix of "+" (e.g. +30) which obviously is an integer.

Results:

$ int_check.sh 123
123 is an integer !!

$ int_check.sh 123+
ERROR: first parameter must be an integer.

$ int_check.sh -123
-123 is an integer !!

$ int_check.sh +30
+30 is an integer !!

$ int_check.sh -123c
ERROR: first parameter must be an integer.

$ int_check.sh 123c
ERROR: first parameter must be an integer.

$ int_check.sh c123
ERROR: first parameter must be an integer.

The solution provided by Ignacio Vazquez-Abrams was also very neat (if you like regex) after it was explained. However, it does not handle positive numbers with the + prefix, but it can easily be fixed as below:

[[ $var =~ ^[-+]?[0-9]+$ ]]


回答3:

Latecomer to the party here. I'm extremely surprised none of the answers mention the simplest, fastest, most portable solution; the case statement.

case ${variable#[-+]} in
  *[!0-9]* | '') echo Not a number ;;
  * ) echo Valid number ;;
esac

The trimming of any sign before the comparison feels like a bit of a hack, but that makes the expression for the case statement so much simpler.



回答4:

For portability to pre-Bash 3.1 (when the =~ test was introduced), use expr.

if expr "$string" : '-\?[0-9]\+$' >/dev/null
then
  echo "String is a valid integer."
else
  echo "String is not a valid integer."
fi

expr STRING : REGEX searches for REGEX anchored at the start of STRING, echoing the first group (or length of match, if none) and returning success/failure. This is old regex syntax, hence the excess \. -\? means "maybe -", [0-9]\+ means "one or more digits", and $ means "end of string".

Bash also supports extended globs, though I don't recall from which version onwards.

shopt -s extglob
case "$string" of
    @(-|)[0-9]*([0-9]))
        echo "String is a valid integer." ;;
    *)
        echo "String is not a valid integer." ;;
esac

# equivalently, [[ $string = @(-|)[0-9]*([0-9])) ]]

@(-|) means "- or nothing", [0-9] means "digit", and *([0-9]) means "zero or more digits".



回答5:

I like the solution using the -eq test, because it's basically a one-liner.

My own solution was to use parameter expansion to throw away all the numerals and see if there was anything left. (I'm still using 3.0, haven't used [[ or expr before, but glad to meet them.)

if [ "${INPUT_STRING//[0-9]}" = "" ]; then
  # yes, natural number
else
  # no, has non-numeral chars
fi


回答6:

Here's yet another take on it (only using the test builtin command and its return code):

function is_int() { return $(test "$@" -eq "$@" > /dev/null 2>&1); } 

input="-123"

if $(is_int "${input}");
then
   echo "Input: ${input}"
   echo "Integer: $[${input}]"
else
   echo "Not an integer: ${input}"
fi


回答7:

You can strip non-digits and do a comparison. Here's a demo script:

for num in "44" "-44" "44-" "4-4" "a4" "4a" ".4" "4.4" "-4.4" "09"
do
    match=${num//[^[:digit:]]}    # strip non-digits
    match=${match#0*}             # strip leading zeros
    echo -en "$num\t$match\t"
    case $num in
        $match|-$match)    echo "Integer";;
                     *)    echo "Not integer";;
    esac
done

This is what the test output looks like:

44      44      Integer
-44     44      Integer
44-     44      Not integer
4-4     44      Not integer
a4      4       Not integer
4a      4       Not integer
.4      4       Not integer
4.4     44      Not integer
-4.4    44      Not integer
09      9       Not integer


回答8:

For me, the simplest solution was to use the variable inside a (()) expression, as so:

if ((VAR > 0))
then
  echo "$VAR is a positive integer."
fi

Of course, this solution is only valid if a value of zero doesn't make sense for your application. That happened to be true in my case, and this is much simpler than the other solutions.

As pointed out in the comments, this can make you subject to a code execution attack: The (( )) operator evaluates VAR, as stated in the Arithmetic Evaluation section of the bash(1) man page. Therefore, you should not use this technique when the source of the contents of VAR is uncertain (nor should you use ANY other form of variable expansion, of course).



回答9:

or with sed:

   test -z $(echo "2000" | sed s/[0-9]//g) && echo "integer" || echo "no integer"
   # integer

   test -z $(echo "ab12" | sed s/[0-9]//g) && echo "integer" || echo "no integer"
   # no integer


回答10:

Adding to the answer from Ignacio Vazquez-Abrams. This will allow for the + sign to precede the integer, and it will allow any number of zeros as decimal points. For example, this will allow +45.00000000 to be considered an integer.
However, $1 must be formatted to contain a decimal point. 45 is not considered an integer here, but 45.0 is.

if [[ $1 =~ ^-?[0-9]+.?[0]+$ ]]; then
    echo "yes, this is an integer"
elif [[ $1 =~ ^\+?[0-9]+.?[0]+$ ]]; then
    echo "yes, this is an integer"
else
    echo "no, this is not an integer"
fi


回答11:

For laughs I roughly just quickly worked out a set of functions to do this (is_string, is_int, is_float, is alpha string, or other) but there are more efficient (less code) ways to do this:

#!/bin/bash

function strindex() {
    x="${1%%$2*}"
    if [[ "$x" = "$1" ]] ;then
        true
    else
        if [ "${#x}" -gt 0 ] ;then
            false
        else
            true
        fi
    fi
}

function is_int() {
    if is_empty "${1}" ;then
        false
        return
    fi
    tmp=$(echo "${1}" | sed 's/[^0-9]*//g')
    if [[ $tmp == "${1}" ]] || [[ "-${tmp}" == "${1}" ]] ; then
        #echo "INT (${1}) tmp=$tmp"
        true
    else
        #echo "NOT INT (${1}) tmp=$tmp"
        false
    fi
}

function is_float() {
    if is_empty "${1}" ;then
        false
        return
    fi
    if ! strindex "${1}" "-" ; then
        false
        return
    fi
    tmp=$(echo "${1}" | sed 's/[^a-z. ]*//g')
    if [[ $tmp =~ "." ]] ; then
        #echo "FLOAT  (${1}) tmp=$tmp"
        true
    else
        #echo "NOT FLOAT  (${1}) tmp=$tmp"
        false
    fi
}

function is_strict_string() {
    if is_empty "${1}" ;then
        false
        return
    fi
    if [[ "${1}" =~ ^[A-Za-z]+$ ]]; then
        #echo "STRICT STRING (${1})"
        true
    else
        #echo "NOT STRICT STRING (${1})"
        false
    fi
}

function is_string() {
    if is_empty "${1}" || is_int "${1}" || is_float "${1}" || is_strict_string "${1}" ;then
        false
        return
    fi
    if [ ! -z "${1}" ] ;then
        true
        return
    fi
    false
}
function is_empty() {
    if [ -z "${1// }" ] ;then
        true
    else
        false
    fi
}

Run through some tests here, I defined that -44 is an int but 44- isn't etc.. :

for num in "44" "-44" "44-" "4-4" "a4" "4a" ".4" "4.4" "-4.4" "09" "hello" "h3llo!" "!!" " " "" ; do
    if is_int "$num" ;then
        echo "INT = $num"

    elif is_float "$num" ;then
        echo "FLOAT = $num"

    elif is_string "$num" ; then
        echo "STRING = $num"

    elif is_strict_string "$num" ; then
        echo "STRICT STRING = $num"
    else
        echo "OTHER = $num"
    fi
done

Output:

INT = 44
INT = -44
STRING = 44-
STRING = 4-4
STRING = a4
STRING = 4a
FLOAT = .4
FLOAT = 4.4
FLOAT = -4.4
INT = 09
STRICT STRING = hello
STRING = h3llo!
STRING = !!
OTHER =  
OTHER = 

NOTE: Leading 0's could infer something else when adding numbers such as octal so it would be better to strip them if you intend on treating '09' as an int (which I'm doing) (eg expr 09 + 0 or strip with sed)



标签: