Several shell one-liners have been proposed for summing a column of integers;
on closer inspection, however, each of the usual choices has inherent limitations:
- `awk` ones would choke at arbitrary precision and integer size (it behaves C-like, after all)
- `bc` ones would rather be unhappy with arbitrarily long inputs: `(sed 's/$/+\\/g';echo 0)|bc`
Given that portability across platforms may be a further problem (see [1] [2]), which is undesirable, is there a generic solution that wins on both practicality and brevity?
Hint: SunOS & MacOSX are examples where portability would be an issue.
For instance, could the `dc` command handle arbitrarily large (2^n) inputs, integer or otherwise?
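One way to probe this is sketched below, under the assumption that every input line holds exactly one non-negative integer (`dc` writes negative numbers with `_` rather than `-`). The function name `sum_dc` is hypothetical. The idea is to hand `dc` a running program of the form `0`, `N1 +`, `N2 +`, ..., `p`, so that each addition should run as soon as its line is read and the stack never grows beyond one entry.

```shell
#!/bin/sh
# sum_dc (hypothetical helper): add one non-negative integer per line.
# The stream handed to dc looks like:
#   0
#   N1 +
#   N2 +
#   ...
#   p
sum_dc() {
    { echo 0; sed 's/$/ +/'; echo p; } | dc
}

# Usage: 2^64 + 2^64 + 1
printf '%s\n' 18446744073709551616 18446744073709551616 1 | sum_dc
# prints 36893488147419103233
```

Very long results may be wrapped across several output lines by `dc`, so the output may need rejoining when the total has more than about 70 digits.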
[1] `awk`: https://stackoverflow.com/a/450821/1574494 or https://stackoverflow.com/a/25245025/1574494 or Printing long integers in awk
An optimal solution for `dc(1)` sums the inputs as they are read:

The one I usually use is `paste -sd+|bc`:

(For strict Posix compliance, `paste` needs to be provided with an explicit argument: `paste -sd+ -|bc`. Apparently that is necessary with the BSD `paste` implementation installed by default on OS X.)

However, that will fail for larger inputs, because `bc` buffers an entire expression in memory before evaluating it. On my system, `bc` ran out of memory trying to add 100 million numbers, although it was able to do 70 million. But other systems may have smaller capacities.

Since
`bc` has variables, you could avoid long lines by repetitively adding to a variable instead of constructing a single long expression. This is (as far as I know) 100% Posix compliant, but there is a 3x time penalty:

Another way to handle the case where the input size exceeds `bc`'s buffering capacity would be to use the standard `xargs` tool to add the numbers in groups:

The number of input lines used by each `xargs` evaluation will vary from system to system, but it will normally be in the hundreds and it might be much more. Obviously, the `xargs | bc` invocations could be chained arbitrarily to increase capacity.

It might be necessary to limit the size of the
`xargs` expansion using the `-s` switch, on systems where `ARG_MAX` exceeds the capacity of the `bc` command. Aside from performing an experiment to establish the `bc` buffer limit, there is no portable way to establish what that limit might be, but it certainly should be no less than `LINE_MAX`, which is guaranteed to be at least 2048. Even with 100-digit addends, that will allow a reduction by a factor of 20, so a chain of 10 `xargs|bc` pipes would handle over 10^13 addends, assuming you were prepared to wait a couple of months for that to complete.

As an alternative to constructing a large fixed-length pipeline, you could use a function to recursively pipe the output from
`xargs|bc` until only one value is produced:

If you use a very conservative value for `MAXLINE`, the above is quite slow, but with plausible larger values it is not much slower than the simple `paste|bc` solution:

As well as the `bc` solutions, I timed some other possibilities. As shown above, with an input of 20 million numbers, `paste|bc` took 10 seconds. That's almost identical to the time used by adding 20 million numbers with

Programming languages such as
`python` and `perl` proved to be faster:

I was unable to test `dc -f - -e '[+z1<r]srz1<rp'` on large inputs, since its performance appears to be quadratic (or worse); it summed 25 thousand numbers in 3 seconds, but it took 19 seconds to sum 50 thousand and 90 seconds to do 100 thousand.

Although
`bc` is not the fastest and memory limitations require awkward workarounds, it has the advantage of working out of the box on Posix-compliant systems without the necessity to install enhanced versions of any standard utility (`awk`) or programming languages not required by Posix (`perl` and `python`).

    $ seq 1000 | (sum=0; while read -r num; do sum=$(echo "$sum+$num" | bc -l); done; echo "$sum")
    500500
Also, this one will not win a top-speed prize; however, it IS:
With a stretch, you may call it elegant too ;-) Come on guys, show a better way to do this!
You can use `gawk` with the -M flag:

Or Perl with bignum enabled:

It seems that the following does the trick:

but, is it the optimal solution?
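For the `gawk -M` and Perl suggestions above, a minimal sketch might look like the following. The function names are hypothetical. The `gawk` variant assumes a build with arbitrary-precision (MPFR/GMP) support. The Perl variant uses an explicit `Math::BigInt` accumulator rather than the `bignum` pragma, a substitution made here so that the additions are unambiguously carried out by the big-integer class.

```shell
#!/bin/sh
# Hypothetical helpers; each reads one integer per line on stdin.

# gawk with -M (--bignum): arbitrary-precision arithmetic on numbers.
# Requires a gawk that was built with MPFR/GMP support.
sum_gawk() {
    gawk -M '{ s += $1 } END { print s }'
}

# Perl: start from a Math::BigInt zero so that "+=" is done by the
# big-integer class. -l strips the newline from each input line and
# adds one back when printing.
sum_perl() {
    perl -MMath::BigInt -lne '
        BEGIN { $s = Math::BigInt->new(0) }
        $s += $_;
        END { print $s }
    '
}
```

Either function is used like the other one-liners, for example `seq 1 1000 | sum_perl`. Neither tool is required by Posix, so both trade portability for speed.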