I'm trying to do something common enough: Parse user input in a shell script. If the user provided a valid integer, the script does one thing, and if not valid, it does something else. Trouble is, I haven't found an easy (and reasonably elegant) way of doing this - I don't want to have to pick it apart char by char.
I know this must be easy but I don't know how. I could do it in a dozen languages, but not BASH!
In my research I found this:
Regular expression to test whether a string consists of a valid real number in base 10
And there's an answer therein that talks about regex, but so far as I know, that's a function available in C (among others). Still, it had what looked like a great answer so I tried it with grep, but grep didn't know what to do with it. I tried -P which on my box means to treat it as a PERL regexp - nada. Dash E (-E) didn't work either. And neither did -F.
Just to be clear, I'm trying something like this, looking for any output - from there, I'll hack up the script to take advantage of whatever I get. (IOW, I was expecting that a non-conforming input returns nothing while a valid line gets repeated.)
snafu=$(echo "$2" | grep -E "/^[-+]?(?:\.[0-9]+|(?:0|[1-9][0-9]*)(?:\.[0-9]*)?)$/")
if [ -z "$snafu" ] ;
then
echo "Not an integer - nothing back from the grep"
else
echo "Integer."
fi
Would someone please illustrate how this is most easily done?
Frankly, this is a short-coming of TEST, in my opinion. It should have a flag like this
if [ -I "string" ] ;
then
echo "String is a valid integer."
else
echo "String is not a valid integer."
fi
For portability to pre-Bash 3.1 (when the `=~` test was introduced), use `expr`.

`expr STRING : REGEX` searches for REGEX anchored at the start of STRING, echoing the first group (or length of match, if none) and returning success/failure. This is old regex syntax, hence the excess `\`. `-\?` means "maybe `-`", `[0-9]\+` means "one or more digits", and `$` means "end of string".

Bash also supports extended globs, though I don't recall from which version onwards.
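Both approaches can be sketched as small Bash functions. This is only a sketch: the function names `is_int_expr` and `is_int_extglob` are made up for illustration, the `\?` and `\+` operators in the `expr` pattern assume a GNU-style `expr`, and the `X` prefix is an extra precaution added here so that `expr` does not misread inputs such as `length` as keywords.

```shell
#!/usr/bin/env bash

# expr variant: succeeds when the whole string is an optional "-" plus digits.
# expr prints the match length, which is discarded; only the exit status is used.
is_int_expr() {
  expr "X$1" : 'X-\?[0-9]\+$' >/dev/null
}

# Extended globs must be enabled before the function below is parsed.
shopt -s extglob

# extglob variant: "-" or nothing, one digit, then zero or more digits.
is_int_extglob() {
  case $1 in
    @(-|)[0-9]*([0-9])) return 0 ;;
    *)                  return 1 ;;
  esac
}
```

Either function can then be used directly in a condition, for example `if is_int_extglob "$2"; then echo "Integer."; fi`.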
`@(-|)` means "`-` or nothing", `[0-9]` means "digit", and `*([0-9])` means "zero or more digits".

For me, the simplest solution was to use the variable inside a `(())` expression.

Of course, this solution is only valid if a value of zero doesn't make sense for your application. That happened to be true in my case, and this is much simpler than the other solutions.
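A minimal sketch of that idea, assuming the check is wrapped in a function (the name `is_nonzero_int` is hypothetical). Note that `(( ))` evaluates the contents of the variable as an arithmetic expression, so this is only reasonable for input you trust.

```shell
#!/usr/bin/env bash

# Succeeds when the contents of VAR evaluate to a nonzero number.
# Zero, an empty string, and anything that fails to evaluate all count as failure.
# Caveat: VAR is evaluated as an expression, so "1+1" would also pass.
is_nonzero_int() {
  local VAR=$1
  (( VAR )) 2>/dev/null
}

if is_nonzero_int "42"; then
  echo "42 is a nonzero integer"
fi
```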
As pointed out in the comments, this can make you subject to a code execution attack: the `(( ))` operator evaluates `VAR`, as stated in the Arithmetic Evaluation section of the bash(1) man page. Therefore, you should not use this technique when the source of the contents of `VAR` is uncertain (nor should you use ANY other form of variable expansion, of course).

or with sed:
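One way a sed-based check might look, as a sketch rather than the exact command from this answer: delete the line if it is an optionally signed run of digits, and treat any leftover text as "not an integer". The function name `is_int_sed` is an assumption, and multi-line input is rejected up front because sed works line by line.

```shell
#!/usr/bin/env bash

is_int_sed() {
  # Reject input containing a newline; sed would otherwise test each line separately.
  case $1 in
    *$'\n'*) return 1 ;;
  esac
  # Non-empty input that sed reduces to nothing is an integer.
  [ -n "$1" ] && [ -z "$(printf '%s\n' "$1" | sed 's/^[-+]\{0,1\}[0-9][0-9]*$//')" ]
}

if is_int_sed "$1"; then
  echo "Integer."
else
  echo "Not an integer."
fi
```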
Adding to the answer from Ignacio Vazquez-Abrams: this allows a + sign to precede the integer, and it allows any number of zeros after the decimal point. For example, +45.00000000 is considered an integer.
However, $1 must be formatted to contain a decimal point: 45 is not considered an integer here, but 45.0 is.
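A pattern with this behavior could look like the sketch below. The regex and the function name `is_int_with_zero_fraction` are assumptions reconstructed from the description above, not necessarily the original pattern.

```shell
#!/usr/bin/env bash

# Optional sign, one or more digits, a mandatory decimal point, then only zeros.
is_int_with_zero_fraction() {
  local re='^[-+]?[0-9]+\.0+$'
  [[ $1 =~ $re ]]
}

for candidate in +45.00000000 45.0 45 45.5; do
  if is_int_with_zero_fraction "$candidate"; then
    echo "$candidate: integer"
  else
    echo "$candidate: not an integer"
  fi
done
```

Keeping the regex in a variable avoids quoting surprises on the right-hand side of `=~`.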
For laughs, I quickly worked out a rough set of functions to do this (is_string, is_int, is_float, is alpha string, or other), but there are more efficient (less code) ways to do this:
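A compact sketch of what such helpers might look like, using Bash's `=~` operator. These are hypothetical stand-ins, not the functions from this answer; the names, the regexes, and the `classify` wrapper are all assumptions.

```shell
#!/usr/bin/env bash

# Each helper succeeds when the whole argument matches its pattern.
is_int()   { local re='^[-+]?[0-9]+$';         [[ $1 =~ $re ]]; }
is_float() { local re='^[-+]?[0-9]*\.[0-9]+$'; [[ $1 =~ $re ]]; }
is_alpha() { local re='^[[:alpha:]]+$';        [[ $1 =~ $re ]]; }

# Print a one-word label for the argument, trying the stricter checks first.
classify() {
  if   is_int "$1";   then echo int
  elif is_float "$1"; then echo float
  elif is_alpha "$1"; then echo alpha
  else                     echo other
  fi
}

classify -44   # prints "int"
classify 44-   # prints "other"
```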
Run through some tests here; I defined that -44 is an int but 44- isn't, etc.:
Output:
NOTE: Leading 0's could imply something else when adding numbers, such as octal, so it would be better to strip them if you intend on treating '09' as an int (which I'm doing), e.g. with `expr 09 + 0`, or strip them with sed.
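Stripping leading zeros with sed could be sketched as follows. The function name `strip_leading_zeros` is hypothetical, and the sketch assumes the input has already been validated as an integer. Bash arithmetic treats a leading 0 as octal, so `$((09 + 0))` fails, while the stripped value works.

```shell
#!/usr/bin/env bash

# Remove leading zeros from an already validated integer, keeping a "-" sign
# and dropping a "+" sign.
strip_leading_zeros() {
  local s=$1 sign=
  case $s in
    -*) sign=-; s=${s#-} ;;
    +*) s=${s#+} ;;
  esac
  s=$(printf '%s\n' "$s" | sed 's/^0*//')
  # If every digit was a zero, fall back to a single 0.
  printf '%s%s\n' "$sign" "${s:-0}"
}

echo $(( $(strip_leading_zeros 09) + 1 ))   # prints 10
```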