I'm trying to write a program which calculates the Taylor series of exp(-x) and exp(x) up to 200 iterations, for large x. (exp(x)=1+x+x^2/2+...).
My program is really simple, and it seems like it should work perfectly. However it diverges for exp(-x), but converges just fine for exp(+x). Here is my code so far:
long double x = 100.0, sum = 1.0, last = 1.0;
for(int i = 1; i < 200; i++) {
last *= x / i; //multiply the last term in the series from the previous term by x/n
sum += last; //add this new last term to the sum
}
cout << "exp(+x) = " << sum << endl;
x = -100.0; //redo but now letting x<0
sum = 1.0;
last = 1.0;
for(int i = 1; i < 200; i++) {
last *= x / i;
sum += last;
}
cout << "exp(-x) = " << sum << endl;
And when I run it, I get the following output:
exp(+x) = 2.68811691354e+43
exp(-x) = -8.42078025179e+24
When the actual values are:
exp(+x) = 2.68811714182e+43
exp(-x) = 3.72007597602e-44
As you can see, it works just fine for the positive calculation, but not negative. Does anyone have any ideas on why rounding errors could go so wrong by just adding a negative on every other term? Also, is there anything I could implement to fix this issue?
Thanks in advance!!
I don't think this actually has anything to do with floating point approximation error, I think there is another more significant source of error.
As you say yourself, your method is really simple. You are taking a taylor series approximation at x=0
to this function, and then evaluating it at x=-100
.
How accurate did you actually expect this method to be and why?
At a high level, you should only expect your method to be accurate in a narrow region near x=0
. Taylor's approximation theorem tells you that e.g. if you take N
terms of the series around x=0
, your approximation will be accurate to O(|x|)^(N+1)
at least. So if you take 200 terms, you should be accurate to e.g. within 10^(-60)
or so in the range [-0.5, 0.5]
. But at x=100
Taylor's theorem only gives you a pretty terrible bound.
Conceptually, you know that e^{-x}
tends to zero as x
goes to minus infinity. But your approximation function is a polynomial of a fixed degree, and any non-constant polynomial tends to either plus infinity or minus infinity asymptotically. So the relative error must be unbounded if you consider the whole range of possible values of x
.
So in short I think you should rethink your approach. One thing you might consider is, only use the Taylor series method for x
values satisfying -0.5f <= x <= 0.5f
. And for any x
greater than 0.5f
, you try dividing x
by two and calling the function recursively, then squaring the result. Or something like this.
For best results you should probably use an established method.
Edit:
I decided to write some code to see how well my idea actually works. It appears to match perfectly with the C library implementation across the range x = -10000
to x = 10000
at least to as many bits of precision as I am displaying. :)
Note also that my method is accurate even for values of x
greater than 100, where the Taylor Series method actually loses accuracy on the positive end as well.
#include <cmath>
#include <iostream>
long double taylor_series(long double x)
{
long double sum = 1.0, last = 1.0;
for(int i = 1; i < 200; i++) {
last *= x / i; //multiply the last term in the series from the previous term by x/n
sum += last; //add this new last term to the sum
}
return sum;
}
long double hybrid(long double x)
{
long double temp;
if (-0.5 <= x && x <= 0.5) {
return taylor_series(x);
} else {
temp = hybrid(x / 2);
return (temp * temp);
}
}
long double true_value(long double x) {
return expl(x);
}
void output_samples(long double x) {
std::cout << "x = " << x << std::endl;
std::cout << "\ttaylor series = " << taylor_series(x) << std::endl;
std::cout << "\thybrid method = " << hybrid(x) << std::endl;
std::cout << "\tlibrary = " << true_value(x) << std::endl;
}
int main() {
output_samples(-10000);
output_samples(-1000);
output_samples(-100);
output_samples(-10);
output_samples(-1);
output_samples(-0.1);
output_samples(0);
output_samples(0.1);
output_samples(1);
output_samples(10);
output_samples(100);
output_samples(1000);
output_samples(10000);
}
Output:
$ ./main
x = -10000
taylor series = -2.48647e+423
hybrid method = 1.13548e-4343
library = 1.13548e-4343
x = -1000
taylor series = -2.11476e+224
hybrid method = 5.07596e-435
library = 5.07596e-435
x = -100
taylor series = -8.49406e+24
hybrid method = 3.72008e-44
library = 3.72008e-44
x = -10
taylor series = 4.53999e-05
hybrid method = 4.53999e-05
library = 4.53999e-05
x = -1
taylor series = 0.367879
hybrid method = 0.367879
library = 0.367879
x = -0.1
taylor series = 0.904837
hybrid method = 0.904837
library = 0.904837
x = 0
taylor series = 1
hybrid method = 1
library = 1
x = 0.1
taylor series = 1.10517
hybrid method = 1.10517
library = 1.10517
x = 1
taylor series = 2.71828
hybrid method = 2.71828
library = 2.71828
x = 10
taylor series = 22026.5
hybrid method = 22026.5
library = 22026.5
x = 100
taylor series = 2.68812e+43
hybrid method = 2.68812e+43
library = 2.68812e+43
x = 1000
taylor series = 3.16501e+224
hybrid method = 1.97007e+434
library = 1.97007e+434
x = 10000
taylor series = 2.58744e+423
hybrid method = 8.80682e+4342
library = 8.80682e+4342
Edit:
For who is interested:
There was some question raised in the comments about how significant the floating point errors are in the original program. My original assumption was that they were negligible -- I did a test to see if that's true or not. It turns out that's not true, and there is significant floating point error going on, but that even without the floating point errors there are significant errors being introduced just by the taylor series alone. The true value of the taylor series at 200 terms at x=-100
seems to be near to -10^{24}
, not 10^{-44}
. I checked this using boost::multiprecision::cpp_rational
, which is an arbitrary precision rational type built on their arbitrary precision integer type.
Output:
x = -100
taylor series (double) = -8.49406e+24
(rational) = -18893676108550916857809762858135399218622904499152741157985438973568808515840901824148153378967545615159911801257288730703818783811465589393308637433853828075746484162303774416145637877964256819225743503057927703756503421797985867950089388433370741907279634245166982027749118060939789786116368342096247737/2232616279628214542925453719111453368125414939204152540389632950466163724817295723266374721466940218188641069650613086131881282494641669993119717482562506576264729344137595063634080983904636687834775755173984034571100264999493261311453647876869630211032375288916556801211263293563
= -8.46257e+24
library = 3.72008e-44
x = 100
taylor series (double) = 2.68812e+43
(rational) = 36451035284924577938246208798747009164319474757880246359883694555113407009453436064573518999387789077985197279221655719227002367495061633272603038249747260895707250896595889294145309676586627989388740458641362406969609459453916777341749316070359589697827702813520519796940239276744754778199440304584107317957027129587503199/1356006206645357299077422810994072904566969809700681604285727988319939931024001696953196916719184549697395496290863162742676361760549235149195411231740418104602504325580502523311497039304043141691060121240640609954226541318710631103275528465092597490136227936213123455950399178299
= 2.68812e+43
library = 2.68812e+43
Code:
#include <cmath>
#include <iostream>
#include <boost/multiprecision/cpp_int.hpp>
typedef unsigned int uint;
typedef boost::multiprecision::cpp_rational rational;
// Taylor series of exp
template <typename T>
T taylor_series(const T x) {
T sum = 1, last = 1;
for (uint i = 1; i < 200; i++) {
last = last * (x / i);
sum = sum + last;
}
return sum;
}
void sample(const int x) {
std::cout << "x = " << x << std::endl;
long double e1 = taylor_series(static_cast<long double>(x));
std::cout << "\ttaylor series (double) = " << e1 << std::endl;
rational e2 = taylor_series(static_cast<rational>(x));
std::cout << "\t (rational) = " << e2 << std::endl;
std::cout << "\t = " << static_cast<long double>(e2) << std::endl;
std::cout << "\tlibrary = " << expl(static_cast<long double>(x)) << std::endl;
}
int main() {
sample(-100);
sample(100);
}
One of the most common operations that produce round-off error is the subtraction of nearly equal numbers. When computing exp(-100), the higher order terms are nearly equal, so the round off error can quickly accumulate. This is compounded by the fact, that the later iterations are operating out of the range of the long double type. According to microsoft, the range of the long double is from 1.7E +/- 308 (15 digits), while 200! is 7.89 x 10^374.
If you are simply looking for an exp implementation, the math.h library has an implementation.
Using Taylor polynomials probably isn't a great idea for this; see http://www.embeddedrelated.com/showarticle/152.php for a great article on using Chebyshev polynomials for function approximation.
rickandross pointed out the source of the error in this case, namely that the Taylor expansion for exp(-100) involves differences of large values.
There's a simple modification to the Taylor attempt that get reasonable answers for the few test cases I tried, namely using the fact that exp(-x) = 1/exp(x). This programme:
#include <iostream>
#include <cmath>
double texp(double x)
{
double last=1.0;
double sum=1.0;
if(x<0)
return 1/texp(-x);
for(int i = 1; i < 200; i++) {
last *= x / i;
sum += last;
}
return sum;
}
void test_texp(double x)
{
double te=texp(x);
double e=std::exp(x);
double err=te-e;
double rerr=(te-e)/e;
std::cout << "x=" << x
<< "\ttexp(x)=" << te
<< "\texp(x)=" << e
<< "\terr=" << err
<< "\trerr=" << rerr
<< "\n";
}
int main()
{
test_texp(0);
test_texp(1);
test_texp(-1);
test_texp(100);
test_texp(-100);
}
gives this output (note that double precision is around 2e-16):
x=0 texp(x)=1 exp(x)=1 err=0 rerr=0
x=1 texp(x)=2.71828 exp(x)=2.71828 err=4.44089e-16 rerr=1.63371e-16
x=-1 texp(x)=0.367879 exp(x)=0.367879 err=-5.55112e-17 rerr=-1.50895e-16
x=100 texp(x)=2.68812e+43 exp(x)=2.68812e+43 err=1.48553e+28 rerr=5.52628e-16
x=-100 texp(x)=3.72008e-44 exp(x)=3.72008e-44 err=-2.48921e-59 rerr=-6.69128e-16