Pandas expanding apply for beta regression

Hey. I am trying to calculate regression betas over an expanding window in pandas. I have the following function to calculate beta:

def beta(row, col1, col2):
    return numpy.cov(row[col1],row[col2]) / numpy.var(row[col1])

I tried the following to get an expanding beta on my DataFrame df:

pandas.expanding_apply(df, beta, col1='col1', col2='col2')
pandas.expanding_apply(df, beta, kwargs={'col1':'col1', 'col2':'col2'})
df.expanding.apply(...)

However, none of them work. I either get an error saying that no kwargs are being passed or, if I hardcode the column names in the function beta, I get:

*** IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

Thanks.

Example:

def beta(row, col1, col2):
    return numpy.cov(row[col1],row[col2]) / numpy.var(row[col1])
df = pandas.DataFrame({'a':[1,2,3,4,5],'b':[.1,5,.3,.5,6]})
pandas.expanding_apply(df, beta, col1='a', col2='b')
pandas.expanding_apply(df, beta, kwargs={'col1':'a', 'col2':'b'})

Both return errors.


1 answer


I ran into this issue while trying to compute beta for a rolling multiple regression, very similar to what you are doing (see here). The main problem is that with Expanding.apply(func, args=(), kwargs={}), the func param

Must produce a single value from an ndarray input. *args and **kwargs are passed to the function

[ source ]

Each call receives the window of a single column as a 1-D array, so func never sees both columns at once, and indexing that array with a column name is what raises the IndexError. There is really no way to accommodate a two-column calculation with expanding.apply. (Note: expanding_apply is deprecated, as mentioned.)
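To make the constraint concrete, here is a minimal sketch of what func receives, using the question's small DataFrame. The helper name inspect_window and its scale keyword are made up for illustration, and the sketch assumes a reasonably recent pandas where Expanding.apply accepts raw and kwargs.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4, 5], 'b': [.1, 5, .3, .5, 6]})

seen = []  # records the type and dimensionality of every window passed in

def inspect_window(window, scale=1.0):
    # hypothetical helper: with raw=True, `window` is a 1-D ndarray
    # taken from ONE column; column labels are not available here
    seen.append((type(window).__name__, window.ndim))
    return scale * window.sum()  # must reduce to a single scalar

# kwargs are forwarded to the function on every call
result = df.expanding(min_periods=1).apply(
    inspect_window, raw=True, kwargs={'scale': 2.0}
)
print(result)
print(set(seen))
```

Because each call only ever sees one column, a two-column statistic such as beta cannot be computed inside func, whatever is passed through kwargs.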

Below is a workaround. It is more computationally expensive (it eats up memory) but will get you the output. It creates a list of expanding-window NumPy arrays and then calculates the beta for each one.

from pandas_datareader.data import DataReader as dr
import numpy as np
import pandas as pd

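# NOTE: the 'google' data source was later dropped by pandas-datareader;
# any DataFrame with two price columns (asset, market) works here instead.
df = (dr(['GOOG', 'SPY'], 'google')['Close']
      .pct_change()
      .dropna())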

# i is the asset, m is market/index
# [0, 1] grabs cov_i,j from the covar. matrix
def beta(i, m):
    return np.cov(i, m)[0, 1] / np.var(m)

def expwins(x, min_periods):
    return [x[:i] for i in range(min_periods, x.shape[0] + 1)]

# Example:
# arr = np.arange(10).reshape(5, 2)
# print(expwins(arr, min_periods=3)[1]) # the 2nd window of the set
# array([[0, 1],
#        [2, 3],
#        [4, 5],
#        [6, 7]])

min_periods = 21
# Create "blocks" of expanding windows
wins = expwins(df.values, min_periods=min_periods)
# Calculate a beta (single scalar val.) for each
betas = [beta(win[:, 0], win[:, 1]) for win in wins]
betas = pd.Series(betas, index=df.index[min_periods - 1:])

print(betas)
Date
2010-02-03    0.77572
2010-02-04    0.74769
2010-02-05    0.76692
2010-02-08    0.74301
2010-02-09    0.74741
2010-02-10    0.74635
2010-02-11    0.74735
2010-02-12    0.74605
2010-02-16    0.78521
2010-02-17    0.77619
2010-02-18    0.79188
2010-02-19    0.78952
...
2017-06-19    0.97387
2017-06-20    0.97390
2017-06-21    0.97386
2017-06-22    0.97387
2017-06-23    0.97391
2017-06-26    0.97389
2017-06-27    0.97482
2017-06-28    0.97508
2017-06-29    0.97594
2017-06-30    0.97584
2017-07-03    0.97575
2017-07-05    0.97588
dtype: float64
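As an alternative to building the windows by hand, pandas' built-in expanding covariance and variance can be divided directly. This is only a sketch on the question's toy DataFrame, not the answerer's method; the function name expanding_beta and its parameters are made up for illustration. Note that both Expanding.cov and Expanding.var default to ddof=1, whereas the workaround above mixes np.cov (ddof=1) with np.var (ddof=0), so the two approaches differ by a factor of n/(n-1) for a window of n rows.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4, 5], 'b': [.1, 5, .3, .5, 6]})

def expanding_beta(frame, asset, market, min_periods=2):
    # hypothetical helper: beta of `asset` on `market` over an expanding window
    # cov(asset, market) over rows 0..t, sample covariance (ddof=1)
    exp_cov = frame[asset].expanding(min_periods=min_periods).cov(frame[market])
    # var(market) over rows 0..t, sample variance (ddof=1)
    exp_var = frame[market].expanding(min_periods=min_periods).var()
    return exp_cov / exp_var

betas = expanding_beta(df, asset='a', market='b')
print(betas)
```

The last value equals the ordinary least-squares slope of a regressed on b over the full sample, which is a convenient sanity check.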

      
