How to get bc to handle numbers in scientific (aka

Published 2020-02-03 04:54

爷的心禁止访问

Question:

bc doesn't like numbers expressed in scientific notation (aka exponential notation).

$ echo "3.1e1*2" | bc -l
(standard_in) 1: parse error

but I need to use it to handle a few records that are expressed in this notation. Is there a way to get bc to understand exponential notation? If not, what can I do to translate them into a format that bc will understand?

Answer 1:

Unfortunately, bc doesn't support scientific notation.

However, it can be translated into a format that bc can handle, using extended regex as per POSIX in sed:

sed -E 's/([+-]?[0-9.]+)[eE]\+?(-?)([0-9]+)/(\1*10^\2\3)/g' <<<"$value"

This replaces the "e" (or "e+", if the exponent is positive) with "*10^", which bc understands, and wraps each converted number in parentheses. It works even if the exponent is negative or if the number is subsequently multiplied by another power, and it preserves the significant digits.

If you need to stick to basic regex (BRE), then this should be used:

sed 's/\([+-]\{0,1\}[0-9]*\.\{0,1\}[0-9]\{1,\}\)[eE]+\{0,1\}\(-\{0,1\}\)\([0-9]\{1,\}\)/(\1*10^\2\3)/g' <<<"$value"
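To reuse the conversion, the ERE version can be wrapped in a small shell function. This is only a minimal sketch: the name sci2bc is hypothetical (it is not part of bc or sed), and a pipe is used instead of a here-string so that it also runs in shells without <<<.

```shell
# sci2bc: hypothetical helper name, chosen for illustration only.
# Rewrites every number in scientific notation inside the expression "$1"
# as a parenthesized bc expression, e.g. 3.1e1 -> (3.1*10^1).
sci2bc() {
  printf '%s\n' "$1" |
    sed -E 's/([+-]?[0-9.]+)[eE]\+?(-?)([0-9]+)/(\1*10^\2\3)/g'
}

sci2bc '3.1e1*2'             # -> (3.1*10^1)*2
sci2bc '7.11e-2 + 323e+34'   # -> (7.11*10^-2) + (323*10^34)
```

The output can then be piped to bc, for example sci2bc '3.1e1*2' | bc -l.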

From Comments:

A simple bash pattern match cannot work (thanks @mklement0), as there is no way to match an e+ and keep the - from an e- at the same time.

A correctly working perl solution (thanks @mklement0):

$ perl -pe 's/([-\d.]+)e(?:\+|(-))?(\d+)/($1*10^$2$3)/gi' <<<"$value"

Thanks to @jwpat7 and @Paul Tomblin for clarifying aspects of sed's syntax, as well as @isaac and @mklement0 for improving the answer.

Edit:

The answer changed quite a bit over the years. The answer above is the latest iteration as of 17th May 2018. Previous attempts reported here were a solution in pure bash (by @ormaaj) and one in sed (by @me), that fail in at least some cases. I'll keep them here just to make sense of the comments, which contain much nicer explanations of the intricacies of all this than this answer does.

value=${value/[eE]+*/*10^}  ------> Can not work.
value=`echo ${value} | sed -e 's/[eE]+*/\\*10\\^/'` ------> Fail in some conditions

Answer 2:

Let me try to summarize the existing answers, with comments on each below:

(a) If you indeed need to use bc for arbitrary-precision calculations - as the OP does - use the OP's own clever approach, which textually reformats the scientific notation to an equivalent expression that bc understands.
If potentially losing precision is not a concern,
- (b) consider using awk or perl as bc alternatives; both natively understand scientific notation, as demonstrated in jwpat7's answer for awk.
- (c) consider using printf '%.<precision>f' to simply textually convert to regular floating point representation (decimal fractions, without the e/E) (a solution proposed in a since-deleted post by ormaaj).

(a) Reformatting scientific notation to an equivalent `bc` expression

The advantage of this solution is that precision is preserved: the textual representation is transformed into an equivalent textual representation that bc can understand, and bc itself is capable of arbitrary-precision calculations.

See the OP's own answer, whose updated form is now capable of transforming an entire expression containing multiple numbers in exponential notation into an equivalent bc expression.

(b) Using `awk` or `perl` instead of `bc` as the calculator

Note: The following approaches assume use of the built-in support for double-precision floating-point values in awk and perl. As is inherent in floating-point arithmetic,
"given any fixed number of bits, most calculations with real numbers will produce quantities that cannot be exactly represented using that many bits. Therefore the result of a floating-point calculation must often be rounded in order to fit back into its finite representation. This rounding error is the characteristic feature of floating-point computation." (http://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html)

That said,

- GNU awk offers the option to be built with support for arbitrary-precision arithmetic - see https://www.gnu.org/software/gawk/manual/html_node/Gawk-and-MPFR.html; however, distributions may or may not include that support - verify support by checking the output from gawk --version for GNU MPFR and GNU MP. If support is available, you must activate it with -M (--bignum) in a given invocation.
- Perl offers optional arbitrary-precision decimal support via the Math::BigFloat package - see https://metacpan.org/pod/Math::BigFloat
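The gawk check described above could be scripted roughly as follows. This is a sketch under assumptions: gawk_has_mpfr is a hypothetical helper name, the 200-bit PREC value is an arbitrary choice for illustration, and the output depends on how your gawk was built.

```shell
# gawk_has_mpfr: hypothetical helper; succeeds only if gawk is installed
# and its version banner mentions GNU MPFR.
gawk_has_mpfr() {
  gawk --version 2>/dev/null | grep -q 'GNU MPFR'
}

if gawk_has_mpfr; then
  # -M enables arbitrary-precision arithmetic; PREC sets the working
  # precision in bits (200 is an arbitrary value for this sketch).
  gawk -M -v PREC=200 'BEGIN { printf "%.40f\n", 1 / 3 }'
else
  echo "gawk with MPFR support not found"
fi
```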

awk

awk natively understands decimal exponential (scientific) notation.
(You should generally only use decimal representation, because awk implementations differ with respect to whether they support number literals with other bases.)

awk 'BEGIN { print 3.1e1 * 2 }'  # -> 62

If you use the default print function, the OFMT variable controls the output format by way of a printf format string; the (POSIX-mandated) default is %.6g, meaning 6 significant digits, which notably includes the digits in the integer part.

Note that if the number in scientific notation is supplied as input (as opposed to a literal part of the awk program), you must add +0 to force it to the default output format, if used by itself with print:

Depending on your locale and the awk implementation you use, you may have to replace the decimal point (.) with the locale-appropriate radix character, such as a comma (,) in a German locale; this applies to BSD awk, mawk, and to GNU awk with the --posix option.

awk '{ print $1+0 }' <<<'3.1e1' # -> 31; without `+0`, output would be the same as input

Modifying variable OFMT changes the default output format (for numbers with fractional parts; (effective) integers are always output as such).
Alternatively, use the printf function with an explicit output format:

awk 'BEGIN { printf "%.4f", 3.1e1 * 2.1234 }' # -> 65.8254
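The effect of OFMT on print can be seen in a short sketch; the value 1.0e1 / 3 is just an arbitrary non-integer chosen for illustration.

```shell
# Default OFMT (%.6g): 6 significant digits in total, integer part included.
awk 'BEGIN { print 1.0e1 / 3 }'                    # -> 3.33333

# A wider OFMT changes how print shows non-integer values.
awk 'BEGIN { OFMT = "%.10g"; print 1.0e1 / 3 }'    # -> 3.333333333

# Integer results are printed as integers regardless of OFMT.
awk 'BEGIN { OFMT = "%.10g"; print 3.1e1 * 2 }'    # -> 62
```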

Perl

perl too natively understands decimal exponential (scientific) notation.

Note: Perl, unlike awk, isn't available on all POSIX-like platforms by default; furthermore, it's not as lightweight as awk.
However, it offers more features than awk, such as natively understanding hexadecimal and octal integers.

perl -le 'print 3.1e1 * 2'  # -> 62

I'm unclear on what Perl's default output format is, but it appears to be %.15g. As with awk, you can use printf to choose the desired output format:

perl -e 'printf "%.4f\n", 3.1e1 * 2.1234' # -> 65.8254

(c) Using `printf` to convert scientific notation to decimal fractions

If you simply want to convert scientific notation (e.g., 1.2e-2) into a decimal fraction (e.g., 0.012), printf '%f' can do that for you. Note that you'll convert one textual representation into another via floating-point arithmetic, which is subject to the same rounding errors as the awk and perl approaches.

printf '%.4f' '1.2e-2' # -> '0.0120'; `.4` specifies 4 decimal digits.
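One thing to watch with this approach is that the chosen precision must be large enough to cover the exponent, otherwise the value is rounded away. A small sketch with an arbitrary sample value:

```shell
# With only 4 decimals, 1.2e-5 is rounded to zero.
printf '%.4f\n' 1.2e-5   # -> 0.0000

# With 6 decimals the significant digits survive.
printf '%.6f\n' 1.2e-5   # -> 0.000012
```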

Answer 3:

One can use awk for this; for example,

awk '{ print +$1, +$2, +$3 }' <<< '12345678e-6 0.0314159e2 54321e+13'

produces (via awk's default format %.6g) output like
12.3457 3.14159 543210000000000000
while commands like the following two produce the output shown after each, given that file edata contains data as shown later.

$ awk '{for(i=1;i<=NF;++i)printf"%.13g ",+$i; printf"\n"}' < edata
31 0.0312 314.15 0 
123000 3.1415965 7 0.04343 0 0.1 
1234567890000 -56.789 -30 

$ awk '{for(i=1;i<=NF;++i)printf"%9.13g ",+$i; printf"\n"}' < edata
       31    0.0312    314.15         0 
   123000 3.1415965         7   0.04343         0       0.1 
1234567890000   -56.789       -30 


$ cat edata 
3.1e1 3.12e-2 3.1415e+2 xyz
123e3 0.031415965e2 7 .4343e-1 0e+0 1e-1
.123456789e13 -56789e-3 -30

Also, regarding solutions using sed, it is probably better to delete the plus sign in forms like 45e+3 at the same time as the e, via regex [eE]+*, rather than in a separate sed expression. For example, on my Linux machine with GNU sed version 4.2.1 and bash version 4.2.24, commands
sed 's/[eE]+*/*10^/g' <<< '7.11e-2 + 323e+34'
sed 's/[eE]+*/*10^/g' <<< '7.11e-2 + 323e+34' | bc -l
produce output
7.11*10^-2 + 323*10^34
3230000000000000000000000000000000000.07110000000000000000
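One caveat with the plain [eE]+* substitution is operator precedence: without parentheses, a converted number that follows a division is split apart. The sketch below compares it with the parenthesizing sed -E expression from the first answer; the variable names and the sample expression 2/4e2 are chosen only for illustration.

```shell
expr='2/4e2'   # mathematically 2 / 400 = 0.005

# Plain substitution: bc would read this as (2/4)*10^2, i.e. 50.
naive=$(printf '%s\n' "$expr" | sed 's/[eE]+*/*10^/g')

# Parenthesized substitution keeps the number together: 2/(4*10^2).
grouped=$(printf '%s\n' "$expr" |
  sed -E 's/([+-]?[0-9.]+)[eE]\+?(-?)([0-9]+)/(\1*10^\2\3)/g')

echo "$naive"     # -> 2/4*10^2
echo "$grouped"   # -> 2/(4*10^2)
```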

Answer 4:

You can also define a bash function which calls awk (a good name would be the equal sign "="):

= ()
{
    local in="$(echo "$@" | sed -e 's/\[/(/g' -e 's/\]/)/g')";
    awk 'BEGIN {print '"$in"'}' < /dev/null
}

Then you can use all types of floating-point math in the shell. Note that square brackets are used here instead of round brackets, since the latter would have to be protected from bash by quotes.

> = 1+sin[3.14159] + log[1.5] - atan2[1,2] - 1e5 + 3e-10
-99999.1

Or in a script to assign the result

a=$(= 1+sin[4])
echo $a   # 0.243198

Answer 5:

Luckily there is printf, which does the formatting job:

Applied to the example from the question:

printf "%.12f * 2\n" 3.1e1 | bc -l

Or a float comparison:

n=8.1457413437133669e-02
m=8.1456839223809765e-02

n2=$(printf "%.12f" "$n")
m2=$(printf "%.12f" "$m")

if [ "$(echo "$n2 > $m2" | bc -l)" = 1 ]; then
   echo "n is bigger"
else
   echo "m is bigger"
fi
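If double precision is enough for the comparison, awk can compare the two values directly, since it parses scientific notation itself. A minimal sketch, where float_gt is a hypothetical helper name:

```shell
# float_gt A B: hypothetical helper; succeeds (exit status 0) if A > B.
# Adding 0 forces awk to treat both values as numbers.
float_gt() {
  awk -v a="$1" -v b="$2" 'BEGIN { exit !(a + 0 > b + 0) }'
}

n=8.1457413437133669e-02
m=8.1456839223809765e-02

if float_gt "$n" "$m"; then
  echo "n is bigger"
else
  echo "n is not bigger"
fi
```

Unlike the bc version, this is limited to double precision, so values that differ only beyond about 15 significant digits may compare as equal.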

Answer 6:

Piping version of the OP's accepted answer:

$ echo 3.82955e-5 | sed 's/[eE]+*/\*10\^/'
3.82955*10^-5

Piping the input to the OP's accepted sed command gave extra backslashes, like this:

$ echo 3.82955e-5 | sed 's/[eE]+*/\\*10\\^/'
3.82955\*10\^-5

Answer 7:

Try this (found in an example of CFD input data for processing with m4):

T0=4e-5
deltaT=2e-6
m4 <<< "esyscmd(perl -e 'printf (${T0} + ${deltaT})')"

Answer 8:

Try this (using bash):

printf "scale=20\n0.17879D-13\n" | sed -e 's/D/*10^/' | bc

or this:

 num="0.17879D-13"; convert="`printf \"scale=20\n$num\n\" | sed -e 's/D/*10^/' | bc`" ; echo $convert
.00000000000001787900
num="1230.17879"; convert="`printf \"scale=20\n$num\n\" | sed -e 's/D/*10^/' | bc`" ; echo $convert
1230.17879

If you have positive exponents you should use this:

num="0.17879D+13"; convert="`printf \"scale=20\n$num\n\" | sed -e 's/D+/*10^/' -e 's/D/*10^/' | bc`" ; echo $convert
1787900000000.00000

That last one handles both positive and negative exponents. You can adapt the sed expressions if your numbers use 'e' or 'E' for the exponent.

You get to choose the scale you want.
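The D, d, e and E exponent markers can also be handled by a single expression. The sketch below adapts the sed -E pattern from the first answer; fortran2bc is a hypothetical helper name, not an existing tool.

```shell
# fortran2bc: hypothetical helper; accepts D/d as well as e/E as the
# exponent marker and drops an optional + in front of the exponent.
fortran2bc() {
  printf '%s\n' "$1" |
    sed -E 's/([+-]?[0-9.]+)[dDeE]\+?(-?)([0-9]+)/(\1*10^\2\3)/g'
}

fortran2bc '0.17879D-13'   # -> (0.17879*10^-13)
fortran2bc '0.17879D+13'   # -> (0.17879*10^13)
fortran2bc '1230.17879'    # -> 1230.17879 (no exponent, left unchanged)
```

As before, prefix the result with a scale setting and pipe it to bc to evaluate it.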

Answer 9:

I managed to do it with a little hack. You can do something like this -

scientific='4.8844221e+002'
base=$(echo $scientific | cut -d 'e' -f1)
exp=$(($(echo $scientific | cut -d 'e' -f2)*1))
converted=$(bc -l <<< "$base*(10^$exp)")
echo $converted 
>> 488.4422100
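Note that shell arithmetic reads numbers with a leading zero as octal, so the $(( ... *1)) step above fails for exponents such as 008 or 009. A variant that avoids shell arithmetic might look like the sketch below; to_bc_expr is a hypothetical helper name, and the input is assumed to be a single number that does contain an e or E.

```shell
# to_bc_expr: hypothetical helper; prints a bc expression for one number
# in scientific notation (assumes the argument contains an e or E).
to_bc_expr() {
  mantissa=${1%%[eE]*}      # text before the exponent marker
  exponent=${1##*[eE]}      # text after the exponent marker
  exponent=${exponent#+}    # drop a leading +, as the sed answers above do
  printf '%s*(10^%s)\n' "$mantissa" "$exponent"
}

to_bc_expr '4.8844221e+002'   # -> 4.8844221*(10^002)
to_bc_expr '1.5E-008'         # -> 1.5*(10^-008)
```

The leading zeros are left in place because bc reads them as plain decimal digits; pipe the result to bc -l to evaluate it.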