Integrate () in R gives a horribly wrong answer

Question

Integrate () in R gives a horribly wrong answer

I was trying to integrate the following function from -infinity to infinity. The answer should be 0.2, but R gives a ridiculously small number. What's wrong?

 >f=function(x){exp(-10*abs(x-25))}
 >integrate(f,-Inf,Inf)
 5.329164e-15 with absolute error < 1e-14

+3

r

Patrick 24 Sep 14 at 18:27

source to share

2 answers

Rich scriven · Answer 1 · 2014-09-24T18:39:28+0000

I need a little longer to fully explain this and hopefully other users will add this wiki.

The off ?integrate

argument abs.tol

is defined as

absolute precision requested.

And further down the following note:

When integrating over infinite intervals, do this explicitly, rather than just using a large number as the end point. This increases the likelihood of the correct answer - any function whose integral is finite over an infinite interval should be almost zero for most of this interval.

So, if you want absolute precision, not relative precision (which is defined as the result from .Machine$double.eps^0.25

), you can do

> integrate(f, Inf, -Inf, abs.tol = 0L)
0.2 with absolute error < 8.4e-06

The default argument for abs.tol

is passed from rel.tol

, which is.Machine$double.eps^0.25

Let's see what happens inside "a little".

ifoo<-integrate(f,-Inf,Inf,abs.tol=1e-20)
5.275825e-21 with absolute error < 9.8e-21
str(ifoo)
List of 5
 $ value       : num 5.28e-21
 $ abs.error   : num 9.81e-21
 $ subdivisions: int 3
 $ message     : chr "OK"
 $ call        : language integrate(f = f, lower = -Inf, upper = Inf, abs.tol = 1e-20)
 - attr(*, "class")= chr "integrate"

ifoo<-integrate(f,-Inf,Inf,abs.tol=1e-40)
0.2 with absolute error < 8.4e-06
str(ifoo)
List of 5
 $ value       : num 0.2
 $ abs.error   : num 8.36e-06
 $ subdivisions: int 21
 $ message     : chr "OK"
 $ call        : language integrate(f = f, lower = -Inf, upper = Inf, abs.tol = 1e-40)
 - attr(*, "class")= chr "integrate"

Note the sudden jump in the number of divisions. In general, more subsections means better precision, which is, after all, a calculus point: reduce the width of the split to nothing to get the exact answer. I assume that with a large (ish) abs.tol

, only a few subsections are required to compute the calculated value for the calculated value to agree with some "estimated tolerance error", but when the required tolerance becomes small enough, more subsections are added. "

Edit: Thanks to Hong Ooi who actually looked at the integrand in question. :-). Since this function has a return point at x==25

, that is, a gap in the derivative, the optimization algorithm is most likely "wrong" about convergence. Oddly enough, taking advantage of the fact that this integrand very quickly approaches zero, the result is better if not integrated into +/-Inf

. Actually:

Rgames> integrate(f,20,30)
0.2 with absolute error < 1.9e-06
Rgames> integrate(f,22,27)
0.2 with absolute error < 8.3e-07
Rgames> integrate(f,0,50)
0.2 with absolute error < 7.8e-05

Hong ooi · Answer 2 · 2014-09-25T03:36:01+0000

In general, the advice is to ?integrate

explicitly indicate +/- Inf

, since the limits are valid, in special cases this may not be correct. This is one of them.

> integrate(f, 20, 30)
0.2 with absolute error < 1.9e-06

The main problem is that your function is not smooth because its derivative breaks at x = 25. This can trick the algorithm, in particular its use of the Wynn epsilon method to speed up convergence. Basically, there is no real alternative to understanding what your function is and how its behavior can cause problems. As pointed out in the answers here , R is not a symbolic math solver, so you need to be very careful when trying to get numerical results.

Integrate () in R gives a horribly wrong answer

More articles: