Strange algorithm error
I wrote this function which calculates the MLE from the Cauchy distribution numerically based on the Newton-Raphson algorithm:
mlec <- function(x,theta0=median(x),numstp=100,eps=0.01){
numfin <- numstp
ic <- 0
istop <- 0
while(istop==0){
ic <- ic+1
ltheta <- -2*sum((x-theta0)/(1+(x-theta0)^2))
lprimetheta <- -2*(sum(2*(x-theta0)^2/
(1+(x-theta0)^2)^2-1/(1+(x-theta0)^2)^2))
theta1 <- theta0-(ltheta/lprimetheta)
check <- abs((theta1-theta0)/theta1)
if(check < eps ) { istop <- 1 }
theta0 <- theta1
}
list(theta1=theta1,check=check,realnumstps=ic)
}
The goal is to generate observations from a Cauchy distribution with a scale parameter of 2 and see how the MLE works. The problem is that while MLE works fine for some samples for others, I am getting a strange error
Error in if (check < eps) { : missing value where TRUE/FALSE needed
What's going on here? I defined what a "check" is so that this doesn't happen.
Thank.
source to share
I added some instrumentation (see the instruction cat()
in the middle) and fixed the second derivative expression ( fixed
: see below):
mlec <- function(x,theta0=median(x),numstp=100,eps=0.01,
debug=TRUE,fixed=FALSE){
numfin <- numstp
ic <- 0
istop <- 0
while(istop==0){
ic <- ic+1
ltheta <- -2*sum((x-theta0)/(1+(x-theta0)^2))
lprimetheta <- -2*(sum(2*(x-theta0)^2/
(1+(x-theta0)^2)^2-1/(1+(x-theta0)^2)^2))
if (!fixed) {
theta1 <- theta0-(ltheta/lprimetheta)
} else theta1 <- theta0-ltheta/ff(theta0)
check <- abs((theta1-theta0)/theta1)
if (debug) cat(ic,ltheta,lprimetheta,theta0,theta1,check,"\n")
if(check < eps ) { istop <- 1 }
theta0 <- theta1
}
list(theta1=theta1,check=check,realnumstps=ic)
}
set.seed(1)
x <- rcauchy(100,2)
mlec(x)
Here's the end of the exit tail:
## ic ltheta lprimetheta theta0 theta1 check
## 427 -4.48838e-75 -2.014555e-151 -4.455951e+76 -6.683926e+76 0.3333333
## 428 -2.992253e-75 -8.953579e-152 -6.683926e+76 -1.002589e+77 0.3333333
## 429 -1.994835e-75 -3.979368e-152 -1.002589e+77 -1.503883e+77 0.3333333
## 430 -1.32989e-75 0 -1.503883e+77 -Inf NaN
Now why is this happening? Either you have a bug somewhere or the algorithm is unstable. tl; dr turns out that the answer is actually "both"; your expression of the second derivative seems to be wrong, but even that was correct, the NR algorithm is extremely unstable for this problem.
Here are your derivatives and second derivative functions (I wrap them with Vectorize()
for convenience, so I can evaluate them at multiple values theta
):
lthetafun <- Vectorize(function(theta) {
-2*sum((x-theta)/(1+(x-theta)^2))
})
lprimethetafun <- Vectorize(function(theta) {
-2*(sum(2*(x-theta)^2/
(1+(x-theta)^2)^2-1/(1+(x-theta)^2)^2))
})
Negative log likelihood function based on built-in function dcauchy
:
thetafun <- Vectorize(function(theta) -sum(dcauchy(x,theta,log=TRUE)))
Check differentiation (at an arbitrary point):
library("numDeriv")
all.equal(grad(thetafun,2),lthetafun(2)) ## TRUE
Check the second derivative:
hessian(thetafun,2) ## 36.13297
lprimethetafun(2) ## 8.609859: ???
I think your second derived expression is wrong.
The following alternative second derivative function is based on lazy trickery with Wolfram Alpha by differentiating your gradient function (which is the same as the finite-difference approximation):
ff <- Vectorize(function(theta)
2*sum(((x-theta)^2-1)/((x-theta)^2+1)^2))
ff(2) ## matches hessian() above.
But it looks like you may have additional problems.
The negative likelihood surface looks fine:
curve(thetafun, from=-10,to=10,n=501)
But the problem is on the horizon:
curve(lthetafun, from=-10,to=10, n=501)
It looks irregular and, going up one level into the second derivative, shows that it is:
curve(ff, from=-10, to=10, n=501)
Here's the NR curve being updated:
ff2 <- function(x) x-lthetafun(x)/ff(x)
curve(ff2, from=-10, to=10, n=501,ylim=c(-100,100))
Clap! This points to why the Newton-Raphson method might go wrong if you don't get close to the minimum (anytime the likelihood surface has an inflection point, the NR update curve will have a pole ...). Further analysis of the problem will probably tell you why the second Cauchy derivative is so scary.
If you just want to find MLE, you can do it with a more robust one-dimensional method:
library("bbmle")
mle2(x~dcauchy(location=m),
data=data.frame(x),
start=list(m=median(x)),
method="Brent",
lower=-100,upper=100)
##
## Call:
## mle2(minuslogl = x ~ dcauchy(location = m), start = list(m = median(x)),
## method = "Brent", data = data.frame(x), lower = -100, upper = 100)
##
## Coefficients:
## m
## 1.90179
##
## Log-likelihood: -262.96
##
If you start close enough, NR works fine:
mlec(x,1.85,debug=FALSE,fixed=TRUE,eps=0.0001)
## $theta1
## [1] 1.901592
##
## $check
## [1] 5.214763e-05
##
## $realnumstps
## [1] 37079
source to share