Python ARMA MLE (implementing algorithms from the literature)
Overview
I am trying to implement Autoregressive Moving Average (ARMA) parameter optimization using maximum likelihood estimation (MLE) via a Kalman filter. I know that I can fit ARMA models with the statsmodels package in Python, but I want to write my own ARMA likelihood and subsequent optimization as a prototype for a future C/C++ implementation. Also, when I look through the statsmodels documentation, I find that its Kalman filter log-likelihood implements a slightly different expression than the one I found in the literature.
Algorithms
To calculate the ARMA log-likelihood, I follow Pearlman's 1980 paper:
Pearlman, J. G. "An algorithm for the exact likelihood of a high-order autoregressive-moving average process." Biometrika 67.1 (1980): 232-233. Available from JSTOR.
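For reference, the concentrated log-likelihood that my code below evaluates from Pearlman's recursion is, up to additive constants,

loglik = -0.5 * sum_t log(r_t) - (n/2) * log( sum_t a_t**2 / r_t )

where a_t are the one-step prediction errors and r_t the scaled prediction-error variances (the arrays a and R in the code). This is my own reading of Pearlman's concentrated form, so it may also be where my expression diverges from the one statsmodels uses.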
To compute the initial matrix P, I follow the algorithm in
Gardner, G., Andrew C. Harvey, and Garry D. A. Phillips. "Algorithm AS 154: An algorithm for exact maximum likelihood estimation of autoregressive-moving average models by means of Kalman filtering." Journal of the Royal Statistical Society, Series C (Applied Statistics) 29.3 (1980): 311-322. Available from JSTOR.
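As a sanity check on the initial P (this is only my own cross-check, not part of the algorithm; it assumes SciPy's scipy.linalg.solve_discrete_lyapunov is available), note that P = TPT' + RR' is a discrete Lyapunov equation, so the Kronecker-product solve can be compared against SciPy:

import numpy as np
import scipy.linalg

def initial_P_check(T, R):
    # P = T P T' + R R' is a discrete Lyapunov equation, which SciPy
    # solves directly.
    Q = np.outer(R, R)
    P_scipy = scipy.linalg.solve_discrete_lyapunov(T, Q)
    # Kronecker-product solve, as in Gardner et al.:
    dim = T.shape[0]
    S = np.identity(dim**2) - np.kron(T, T)
    P_kron = np.linalg.solve(S, Q.ravel('F')).reshape(dim, dim, order='F')
    return np.max(np.abs(P_scipy - P_kron))

For stationary AR parameters the two solutions should agree to machine precision.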
For the initial parameter values, I am currently using the internal method that statsmodels uses to compute an initial guess for the ARMA parameters. In the future I plan to switch to my own implementation, but for now I am using _fit_start_params while I debug my MLE.
For MLE optimization, I just use the L-BFGS solver in Scipy.
Code
import numpy as np
import statsmodels.api as sm
import statsmodels.tsa.arima_model
import scipy.optimize


class ARMA(object):
    def __init__(self, endo, nar, nma):
        np.random.seed(0)
        # endogenous variables
        self.endo = endo
        # Number of AR terms
        self.nar = nar
        # Number of MA terms
        self.nma = nma
        # "Dimension" of the ARMA fit
        self.dim = max(nar, nma+1)
        # Current ARMA parameters
        self.params = np.zeros(self.nar+self.nma, dtype='float')

    def __g(self, ma_params):
        '''
        Build MA parameter vector
        '''
        g = np.zeros(self.dim, dtype='float')
        g[0] = 1.0
        if self.nma > 0:
            g[1:self.nma+1] = ma_params
        return g

    def __F(self, ar_params):
        '''
        Build AR parameter matrix
        '''
        F = np.zeros((self.dim, self.dim), dtype='float')
        F[:self.nar, 0] = ar_params
        for i in range(1, self.dim):
            F[i-1, i] = 1.0
        return F

    def __initial_P(self, R, T):
        '''
        Solve for the initial P matrix

        Solves P = TPT' + RR' as the linear system
        (I - T kron T) vec(P) = vec(RR')
        '''
        S = np.identity(self.dim**2, dtype='float') - np.kron(T, T)
        V = np.outer(R, R).ravel('F')
        Pmat = np.linalg.solve(S, V).reshape(self.dim, self.dim, order='F')
        return Pmat

    def __likelihood(self, params):
        '''
        Compute the log-likelihood for a parameter vector

        Implements the Pearlman 1980 algorithm
        '''
        # these checks are pilfered from statsmodels
        if self.nar > 0 and not np.all(np.abs(np.roots(np.r_[1, -params[:self.nar]])) < 1):
            print('AR coefficients are not stationary')
        if self.nma > 0 and not np.all(np.abs(np.roots(np.r_[1, params[-self.nma:]])) < 1):
            print('MA coefficients are not invertible')
        ar_params = params[:self.nar]
        ma_params = params[-self.nma:]
        g = self.__g(ma_params)
        F = self.__F(ar_params)
        w = self.endo
        P = self.__initial_P(g, F)
        n = len(w)
        z = np.zeros(self.dim, dtype='float')
        R = np.zeros(n, dtype='float')
        a = np.zeros(n, dtype='float')
        K = np.dot(F, P[:, 0])
        L = K.copy()
        R[0] = P[0, 0]
        for i in range(1, n):
            a[i-1] = w[i-1] - z[0]
            z = np.dot(F, z) + K*(a[i-1]/R[i-1])
            Kupdate = -(L[0]/R[i-1])*np.dot(F, L)
            Rupdate = -L[0]*L[0]/R[i-1]
            P -= np.outer(L, L)/R[i-1]
            L = np.dot(F, L) - (L[0]/R[i-1])*K
            K += Kupdate
            R[i] = R[i-1] + Rupdate
            if np.abs(R[i] - 1.0) < 1e-9:
                # steady state reached: R has converged to one
                R[i:] = 1.0
                break
        # finish the remaining innovations with the converged gain
        for j in range(i, n):
            a[j] = w[j] - z[0]
            z = np.dot(F, z) + K*(a[j]/R[j])
        likelihood = 0.0
        for i in range(n):
            likelihood += np.log(R[i])
        likelihood *= -0.5
        ssum = 0.0
        for i in range(n):
            ssum += a[i]*a[i]/R[i]
        likelihood += -0.5*n*np.log(ssum)
        return likelihood

    def fit(self):
        '''
        Fit the ARMA model by maximizing the log-likelihood
        (i.e. minimizing its negative)

        Uses scipy.optimize.minimize
        '''
        sm_arma = statsmodels.tsa.arima_model.ARMA(endog=self.endo, order=(self.nar, self.nma, 0))
        params = statsmodels.tsa.arima_model.ARMA._fit_start_params_hr(sm_arma, order=(self.nar, self.nma, 0))
        neg_likelihood = lambda p: -self.__likelihood(p)
        opt = scipy.optimize.minimize(fun=neg_likelihood, x0=params, method='L-BFGS-B')
        print(opt)


# Test the code on the statsmodels sunspots data
nar = 2
nma = 1
endo = sm.datasets.sunspots.load_pandas().data['SUNACTIVITY'].tolist()
arma = ARMA(endo=endo, nar=nar, nma=nma)
arma.fit()
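For reference, the same model can also be fit directly with statsmodels to get numbers to compare against (a sketch using the old statsmodels.tsa.arima_model.ARMA API that is already imported above):

# Compare against statsmodels' own ARMA(2, 1) fit on the same data
sm_model = statsmodels.tsa.arima_model.ARMA(endog=endo, order=(nar, nma))
sm_results = sm_model.fit(trend='nc', disp=0)
print(sm_results.params)  # fitted AR and MA coefficients
print(sm_results.llf)     # log-likelihood at the optimum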
Problem
I believe the above example does not converge. On the third call to ARMA.__likelihood, the code generates the following warning:
RuntimeWarning: invalid value encountered in log
likelihood += np.log(R[i])
which happens because ARMA.__initial_P solves for a matrix whose P[0][0] entry is 0.0. At that point, the current estimate of the AR parameters is non-stationary, and all subsequent iterations then warn that the AR and MA parameters are not stationary / not invertible.
Questions
- Is this implementation correct? I have verified that the initial matrix P satisfies the equation it is supposed to satisfy. While computing the likelihood, I see several behaviors that I expect based on Pearlman's paper (see the short diagnostic sketch after this list):
  - R tends to one. For a pure AR process with p AR parameters, it reaches this limit in p steps; in other words, the break statement in the likelihood function takes effect after p iterations of Pearlman's recursion.
  - L tends to the zero vector.
  - K tends to Fg. I check this by looking at abs(K - Fg) while computing the likelihood.
  - After the warning about the negative logarithm, the above limits are no longer approached.
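The checks above look roughly like the following sketch (illustrative only; pearlman_diagnostics is a helper of my own that I would call inside the filter loop of __likelihood with its local variables):

import numpy as np

def pearlman_diagnostics(i, F, g, K, L, R):
    # Illustrative check of the limits expected from Pearlman's paper:
    # R[i] -> 1, L -> 0 and K -> Fg as the recursion proceeds.
    Fg = np.dot(F, g)
    print('step %d: R = %g, max|L| = %g, max|K - Fg| = %g'
          % (i, R[i], np.max(np.abs(L)), np.max(np.abs(K - Fg))))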
I have also tried implementing the ARMA parameter transformation to prevent overflow/underflow, as recommended in
Jones, Richard H. "Maximum likelihood fitting of ARMA models to time series with missing observations." Technometrics 22.3 (1980): 389-395. Available from JSTOR.
This transformation did not seem to affect the errors I encountered.
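For what it is worth, my attempt at that transformation looks roughly like the sketch below (my own reading of Jones's reparameterization, not a vetted implementation; the tanh squashing and the /2.0 scaling are choices I made): unconstrained optimizer variables are mapped to partial autocorrelations in (-1, 1) and then to AR coefficients by a Levinson-Durbin-type recursion, with the MA side handled analogously.

import numpy as np

def transform_ar(unconstrained):
    # Map unconstrained reals to partial autocorrelations in (-1, 1),
    # then to coefficients of a stationary AR polynomial.
    pacf = np.tanh(np.asarray(unconstrained, dtype=float) / 2.0)
    coeffs = pacf.copy()
    for k in range(1, len(pacf)):
        prev = coeffs[:k].copy()
        coeffs[:k] = prev - pacf[k] * prev[::-1]
    return coeffs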
- If the implementation is correct, how should I handle negative R values? The problem arises when scipy.optimize passes a parameter vector for which the resulting matrix P has a negative leading diagonal element. Should the optimization routine be constrained to prevent negative R values? I have also tried using a complex logarithm for negative values, and changing all of the numpy dtype parameters to 'complex'. For example:
def complex_log(val):
    '''
    Complex logarithm for negative values
    Returns log(val) + I*pi
    '''
    if val < 0.0:
        return complex(np.log(np.abs(val)), np.pi)
    return np.log(val)
However, scipy.optimize cannot handle complex-valued objective functions, so this attempted fix has not worked. Any advice on how to prevent or handle these behaviors?
Thanks for reading this far. Any help is greatly appreciated.