Python ARMA MLE (implementing algorithms from the literature)

Overview

I am trying to implement Autoregressive Moving Average (ARMA) parameter optimization using maximum likelihood estimation (MLE) via a Kalman filter. I know I can fit ARMA models with the statsmodels package in Python, but I want to write my own ARMA likelihood and the subsequent optimization as a prototype for a future C/C++ implementation. Also, when I look through the statsmodels documentation, I find that the Kalman filter log likelihood that statsmodels implements is a slightly different expression than the one I found in the literature.

Algorithms

To calculate the ARMA log likelihood, I follow Pearlman's 1980 paper:

Pearlman, J. G. "An algorithm for the exact likelihood of a high-order autoregressive-moving average process." Biometrika 67.1 (1980): 232-233. Available from JSTOR.
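For reference, here are the recursions from that paper as I have transcribed them into the code below (z is the state, a_t the innovation, and L_t[0] denotes the first component of L_t):

    a_t     = w_t - z_t[0]
    z_{t+1} = F z_t + (a_t / R_t) K_t
    K_{t+1} = K_t - (L_t[0] / R_t) F L_t
    L_{t+1} = F L_t - (L_t[0] / R_t) K_t
    R_{t+1} = R_t - L_t[0]^2 / R_t
    P_{t+1} = P_t - (1 / R_t) L_t L_t'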

To compute the initial matrix P, I follow the algorithm in

Gardner, G., Andrew C. Harvey, and Garry D. A. Phillips. "Algorithm AS 154: An algorithm for exact maximum likelihood estimation of autoregressive-moving average models by means of Kalman filtering." Journal of the Royal Statistical Society, Series C (Applied Statistics) 29.3 (1980): 311-322. Available from JSTOR.
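The initial P here solves the discrete Lyapunov equation P = FPF' + gg', which gives a convenient cross-check against SciPy's built-in solver. A minimal sketch, with made-up example values for F and g:

    import numpy as np
    from scipy import linalg

    # Made-up transition matrix and MA vector in the companion form
    # used by the code below.
    F = np.array([[0.5, 1.0],
                  [0.2, 0.0]])
    g = np.array([1.0, 0.3])

    # solve_discrete_lyapunov solves X = A X A' + Q
    P = linalg.solve_discrete_lyapunov(F, np.outer(g, g))
    assert np.allclose(np.dot(np.dot(F, P), F.T) + np.outer(g, g), P)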

For the initial parameter values, I am currently using an internal method that statsmodels uses to compute an initial guess for ARMA parameters. In the future I plan to move to my own implementation, but I am using _fit_start_params_hr while I debug my MLE.

For the MLE optimization, I simply use the L-BFGS-B solver in SciPy.
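One detail worth noting as a generic illustration (this is not my actual objective): scipy.optimize.minimize minimizes, so the function handed to it is conventionally the negative log likelihood.

    import numpy as np
    import scipy.optimize

    def log_likelihood(params):
        # Stand-in concave objective for illustration only.
        return -np.sum((params - 1.0)**2)

    opt = scipy.optimize.minimize(fun=lambda p: -log_likelihood(p),
                                  x0=np.zeros(2), method='L-BFGS-B')
    print(opt.x)  # approximately [1., 1.]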

Code

import numpy as np
import statsmodels.api as sm
import statsmodels.tsa.arima_model
import scipy.optimize

class ARMA(object):
    def __init__(self, endo, nar, nma):
        np.random.seed(0)
        # endogenous variables
        self.endo = endo
        # Number of AR terms
        self.nar = nar
        # Number of MA terms
        self.nma = nma
        # "Dimension" of the ARMA fit
        self.dim = max(nar, nma+1)
        # Current ARMA parameters
        self.params = np.zeros(self.nar+self.nma, dtype='float')

    def __g(self, ma_params):
        '''
        Build MA parameter vector
        '''
        g = np.zeros(self.dim, dtype='float')
        g[0] = 1.0
        if self.nma > 0:
            g[1:self.nma+1] = ma_params
        return g

    def __F(self, ar_params):
        '''
        Build AR parameter matrix
        '''
        F = np.zeros((self.dim, self.dim), dtype='float')
        F[:self.nar, 0] =  ar_params
        for i in range(1, self.dim):
            F[i-1, i] = 1.0
        return F

    def __initial_P(self, R, T):
        '''
        Solve for initial P matrix

        Solves P = TPT' + RR'
        '''
        # vec trick: P = T P T' + R R'  <=>  (I - kron(T, T)) vec(P) = vec(R R')
        S = np.identity(self.dim**2, dtype='float') - np.kron(T, T)
        V = np.outer(R, R).ravel('F')
        Pmat = np.linalg.solve(S,V).reshape(self.dim, self.dim, order='F')
        return Pmat

    def __likelihood(self, params):
        '''
        Compute the log likelihood for a parameter vector

        Implements the Pearlman 1980 algorithm
        '''
        # these checks are pilfered from statsmodels
        if self.nar > 0 and not np.all(np.abs(np.roots(np.r_[1, -params[:self.nar]])) < 1):
            print('AR coefficients are not stationary')
        if self.nma > 0 and not np.all(np.abs(np.roots(np.r_[1, -params[-self.nma:]])) < 1):
            print('MA coefficients are not invertible')
        ar_params = params[:self.nar]
        ma_params = params[self.nar:]
        g = self.__g(ma_params)
        F = self.__F(ar_params)
        w = self.endo
        P = self.__initial_P(g, F)
        n = len(w)
        z = np.zeros(self.dim, dtype='float')
        R = np.zeros(n, dtype='float')
        a = np.zeros(n, dtype='float')
        K = np.dot(F, P[:, 0])
        L = K.copy()
        R[0] = P[0, 0]
        for i in range(1, n):
            a[i-1] = w[i-1] - z[0]
            z = np.dot(F, z) + K*(a[i-1]/R[i-1])
            Kupdate = -(L[0]/R[i-1])*np.dot(F, L)
            Rupdate = -L[0]*L[0]/R[i-1]
            P -= np.outer(L, L)/R[i-1]
            L = np.dot(F, L) - (L[0]/R[i-1])*K
            K += Kupdate
            R[i] = R[i-1] + Rupdate
            # Pearlman's recursion reaches a steady state: R -> 1, L -> 0
            if np.abs(R[i] - 1.0) < 1e-9:
                R[i:] = 1.0
                break
        # propagate the remaining innovations with the final K and R
        for j in range(i, n):
            a[j] = w[j] - z[0]
            z = np.dot(F, z) + K*(a[j]/R[j])
        # concentrated log likelihood:
        # -0.5 * sum(log R_i) - 0.5 * n * log(sum(a_i^2 / R_i))
        likelihood = 0.0
        for i in range(n):
            likelihood += np.log(R[i])
        likelihood *= -0.5
        ssum = 0.0
        for i in range(n):
            ssum += a[i]*a[i]/R[i]
        likelihood += -0.5*n*np.log(ssum)
        return likelihood

    def fit(self):
        '''
        Fit the ARMA model by minimizing the log likelihood

        Uses scipy.optimize.minimize
        '''
        sm_arma = statsmodels.tsa.arima_model.ARMA(endog=self.endo, order=(self.nar, self.nma, 0))
        params = statsmodels.tsa.arima_model.ARMA._fit_start_params_hr(sm_arma, order=(self.nar, self.nma, 0))
        opt = scipy.optimize.minimize(fun=self.__likelihood, x0=params, method='L-BFGS-B')
        print(opt)

# Test the code on statsmodels sunspots data
nar = 2
nma = 1

endo = sm.datasets.sunspots.load_pandas().data['SUNACTIVITY'].tolist()
arma = ARMA(endo=endo, nar=nar, nma=nma)
arma.fit()

Problem

I believe the above example does not converge. On the third call to ARMA.__likelihood, the code generates the following warning:

RuntimeWarning: invalid value encountered in log
  likelihood += np.log(R[i])

which happens because ARMA.__initial_P solves for a matrix whose P[0][0] entry is 0.0; since R[0] = P[0, 0], the log of R[0] is then ill-defined. At this stage, the current estimates of the AR parameters become nonstationary. All subsequent iterations then warn that the AR and MA parameters are not stationary and not invertible.

Questions

  1. Is this implementation correct? I have verified that the initial matrix P satisfies the equation it is supposed to satisfy. For the likelihood calculation, I see several behaviors that I expect based on Pearlman's paper:

    • R tends to one. For a pure AR process with p AR parameters, it reaches this limit in p steps; that is, the break statement in the likelihood function takes effect after p iterations of Pearlman's algorithm.
    • L tends to the zero vector.
    • K tends to Fg. I check this by monitoring abs(K - Fg) while computing the likelihood (see the helper sketched just after this list).
    • Once the warning about the negative logarithm appears, the above limits are no longer met.
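For reference, that last set of checks can be packaged as a small helper (a hypothetical debugging aid that reuses the variable names from the code above):

    import numpy as np

    def pearlman_limits(F, g, K, L, R_i):
        '''
        Distance from the steady state predicted by Pearlman:
        R -> 1, L -> 0, and K -> Fg.
        '''
        return (np.abs(R_i - 1.0),
                np.max(np.abs(L)),
                np.max(np.abs(K - np.dot(F, g))))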

I have also tried implementing the ARMA parameter transformation recommended for preventing overflow/underflow in

Jones, Richard H. "Maximum likelihood fit of ARMA models for time series with missing observations." Technometrics 22.3 (1980): 389-395. Available from JSTOR.

This transformation did not seem to affect the errors I encountered. (My reading of the transformation is sketched below.)
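For concreteness, here is my understanding of that reparameterization, as a sketch under my own indexing conventions (which may differ from Jones's paper): unconstrained values are squashed into (-1, 1) with tanh and treated as partial autocorrelations, and a Levinson-Durbin-type recursion then converts them into the coefficients of a stationary AR process.

    import numpy as np

    def transform_ar(x):
        '''
        Map unconstrained values to stationary AR coefficients via
        partial autocorrelations.
        '''
        pacf = np.tanh(np.asarray(x, dtype='float'))
        phi = pacf.copy()
        for k in range(1, len(pacf)):
            phi[:k] = phi[:k] - pacf[k] * phi[:k][::-1]
        return phi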

  2. If the implementation is correct, how should I handle negative R values? The problem arises when scipy.optimize returns a parameter vector that corresponds to a matrix P whose top-left element is negative. Is the optimization routine supposed to be constrained to prevent negative R values? I have also tried using complex logarithms for negative values, and changing all of the numpy dtype parameters to 'complex'. For example:

    def complex_log(val):
        '''
        Complex logarithm for negative values
    
        Returns log(val) + I*pi
        '''
        if val < 0.0:
            return complex(np.log(np.abs(val)), np.pi)
        return np.log(val)

However, scipy.optimize cannot handle complex-valued objective functions, so this workaround does not work. Any advice on how to prevent or handle these behaviors?
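For completeness, the other option mentioned in question 2, constraining or penalizing the optimizer, might look something like the following generic sketch (not from any of the papers; loglike and is_valid are hypothetical callables):

    def guarded_objective(params, loglike, is_valid):
        '''
        Return a large penalty for invalid parameter vectors (e.g., a
        nonstationary AR part or a nonpositive R), so the optimizer backs
        away instead of taking the log of a negative number.
        '''
        if not is_valid(params):
            return 1e12
        return -loglike(params)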

Thanks for reading this far. Any help is greatly appreciated.
