Integrate () in R gives a horribly wrong answer
I need a little longer to fully explain this and hopefully other users will add this wiki.
The off ?integrate
argument abs.tol
is defined as
absolute precision requested.
And further down the following note:
When integrating over infinite intervals, do this explicitly, rather than just using a large number as the end point. This increases the likelihood of the correct answer - any function whose integral is finite over an infinite interval should be almost zero for most of this interval.
So, if you want absolute precision, not relative precision (which is defined as the result from .Machine$double.eps^0.25
), you can do
> integrate(f, Inf, -Inf, abs.tol = 0L)
0.2 with absolute error < 8.4e-06
The default argument for abs.tol
is passed from rel.tol
, which is.Machine$double.eps^0.25
Let's see what happens inside "a little".
ifoo<-integrate(f,-Inf,Inf,abs.tol=1e-20)
5.275825e-21 with absolute error < 9.8e-21
str(ifoo)
List of 5
$ value : num 5.28e-21
$ abs.error : num 9.81e-21
$ subdivisions: int 3
$ message : chr "OK"
$ call : language integrate(f = f, lower = -Inf, upper = Inf, abs.tol = 1e-20)
- attr(*, "class")= chr "integrate"
ifoo<-integrate(f,-Inf,Inf,abs.tol=1e-40)
0.2 with absolute error < 8.4e-06
str(ifoo)
List of 5
$ value : num 0.2
$ abs.error : num 8.36e-06
$ subdivisions: int 21
$ message : chr "OK"
$ call : language integrate(f = f, lower = -Inf, upper = Inf, abs.tol = 1e-40)
- attr(*, "class")= chr "integrate"
Note the sudden jump in the number of divisions. In general, more subsections means better precision, which is, after all, a calculus point: reduce the width of the split to nothing to get the exact answer. I assume that with a large (ish) abs.tol
, only a few subsections are required to compute the calculated value for the calculated value to agree with some "estimated tolerance error", but when the required tolerance becomes small enough, more subsections are added. "
Edit: Thanks to Hong Ooi who actually looked at the integrand in question. :-). Since this function has a return point at x==25
, that is, a gap in the derivative, the optimization algorithm is most likely "wrong" about convergence. Oddly enough, taking advantage of the fact that this integrand very quickly approaches zero, the result is better if not integrated into +/-Inf
. Actually:
Rgames> integrate(f,20,30)
0.2 with absolute error < 1.9e-06
Rgames> integrate(f,22,27)
0.2 with absolute error < 8.3e-07
Rgames> integrate(f,0,50)
0.2 with absolute error < 7.8e-05
source to share
In general, the advice is to ?integrate
explicitly indicate +/- Inf
, since the limits are valid, in special cases this may not be correct. This is one of them.
> integrate(f, 20, 30)
0.2 with absolute error < 1.9e-06
The main problem is that your function is not smooth because its derivative breaks at x = 25. This can trick the algorithm, in particular its use of the Wynn epsilon method to speed up convergence. Basically, there is no real alternative to understanding what your function is and how its behavior can cause problems. As pointed out in the answers here , R is not a symbolic math solver, so you need to be very careful when trying to get numerical results.
source to share