Apply a function on a rolling window of a pandas DataFrame

I would like to calculate the determinant of the 2x2 matrices obtained by rolling a window of size 2 over an Nx2 matrix. The determinant is just an example function. In general, I would like to apply a function to each sub-dataframe produced by sliding a window over a larger dataframe.

For example, this is a single 2x2 matrix, and I calculate the determinant like this:

import pandas as pd
import numpy as np

d = pd.DataFrame({
   "X": [1,2],
   "Y": [3,4]
   })
np.linalg.det(d)

      

Now I can form four 2x2 matrices by sliding a window of size 2 along axis=0 of the following dataframe:

df = pd.DataFrame({
    "A": [1,2,3,4,5],
    "B": [6,7,8,9,10],
  })

      

which looks like this:

    A   B
0   1   6
1   2   7
2   3   8
3   4   9
4   5   10

      

so I would expect to get [-5., -5., -5., -5.]

As far as I can see, pandas.DataFrame.rolling and Rolling.apply can only be applied to a 1D vector, not to a dataframe. How can this be done?

+3




4 answers


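# You can replace np.linalg.det with any other function.
# apply(axis=1) visits each row; x.name is that row's index label, so
# df.loc[x.name:x.name+1, 'A':'B'] selects the current row and the next one.
# This relies on the default integer index (0, 1, 2, ...). The last row has no successor, so it yields None (shown as NaN).
df.apply(lambda x: np.linalg.det(df.loc[x.name:x.name+1, 'A':'B']) if x.name < (len(df)-1) else None, axis=1)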

Out[157]: 
0   -5.0
1   -5.0
2   -5.0
3   -5.0
4    NaN
dtype: float64

      



+4




Extract the numpy array from your dataframe:

>>> array = df.values
>>> array
array([[ 1,  6],
       [ 2,  7],
       [ 3,  8],
       [ 4,  9],
       [ 5, 10]])

      

Use the NumPy function as_strided to create a sliding window:



>>> from numpy.lib.stride_tricks import as_strided

>>> rows, cols = array.shape
>>> row_stride, col_stride = array.strides
>>> windowed_array = as_strided(
...     array,
...     shape=(rows - 2 + 1, 2, cols),
...     strides=(row_stride, row_stride, col_stride))
>>> windowed_array
array([[[ 1,  6],
        [ 2,  7]],

       [[ 2,  7],
        [ 3,  8]],

       [[ 3,  8],
        [ 4,  9]],

       [[ 4,  9],
        [ 5, 10]]])

      

Now, apply your function to the resulting array:

>>> np.linalg.det(windowed_array)
array([-5., -5., -5., -5.])
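
as_strided does no bounds checking, so a wrong shape or stride can silently read memory outside the array. If your NumPy is 1.20 or newer, sliding_window_view builds the same kind of window view with the shapes checked for you. A minimal sketch, assuming the question's df (the names window, windows and dets are only for illustration):

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

df = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [6, 7, 8, 9, 10]})

window = 2  # number of consecutive rows in each matrix

# Sliding along axis 0 puts the window dimension last:
# windows[i, j, k] == df.iloc[i + k, j], shape (n_windows, n_cols, window).
windows = sliding_window_view(df.to_numpy(), window_shape=window, axis=0)

# Swap the last two axes so each window reads as (row, column),
# exactly like the corresponding slice of the dataframe.
windows = windows.transpose(0, 2, 1)

# np.linalg.det works on a stack of matrices, one determinant per window.
dets = np.linalg.det(windows)
print(dets)
```

The transpose does not change a determinant, but it matters for functions that care about row and column orientation.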

      

+4




Use a list comprehension to roll your own window:

s = pd.Series([np.linalg.det(df.iloc[i:i+2]) for i in range(df.shape[0]-1)])

      

Output:

0   -5.0
1   -5.0
2   -5.0
3   -5.0
dtype: float64
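
If you need this for more than one window size or function, the same comprehension can be wrapped in a small helper. This is only a sketch: rolling_apply_frame is a hypothetical name, not a pandas function, and it assumes func accepts a sub-dataframe and returns a scalar.

```python
import numpy as np
import pandas as pd


def rolling_apply_frame(frame, window, func):
    """Apply func to every block of `window` consecutive rows (hypothetical helper)."""
    # Number of complete windows; zero if the frame is shorter than the window.
    n_windows = max(len(frame) - window + 1, 0)
    results = [func(frame.iloc[i:i + window]) for i in range(n_windows)]
    # Label each result with the index of the first row in its window.
    return pd.Series(results, index=frame.index[:n_windows])


df = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [6, 7, 8, 9, 10]})

# Convert each block to a plain array before handing it to NumPy.
dets = rolling_apply_frame(df, 2, lambda block: np.linalg.det(block.to_numpy()))
print(dets)
```

This still loops in Python, so it is convenient rather than fast. For large frames a strided NumPy view will usually be quicker.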

      

+2




This question has been asked before. In your specific case, though, there is a simple shortcut that avoids windowing altogether:

df['A'] * df['B'].shift(-1) - df['A'].shift(-1) * df['B']

      

Output:

0   -5.0
1   -5.0
2   -5.0
3   -5.0
4    NaN
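
The shortcut works because the determinant of the 2x2 window made of rows i and i+1 is A[i]*B[i+1] - A[i+1]*B[i], and shift(-1) lines up each row with its successor. A quick check of that claim against the explicit determinants, assuming the question's df (shortcut, explicit and matches are illustrative names):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"A": [1, 2, 3, 4, 5], "B": [6, 7, 8, 9, 10]})

# det([[A_i, B_i], [A_{i+1}, B_{i+1}]]) = A_i * B_{i+1} - A_{i+1} * B_i
shortcut = df["A"] * df["B"].shift(-1) - df["A"].shift(-1) * df["B"]

# Explicit determinant of every pair of consecutive rows.
explicit = [np.linalg.det(df.iloc[i:i + 2].to_numpy()) for i in range(len(df) - 1)]

# The last row has no successor, so the shortcut ends in NaN; drop it before comparing.
matches = np.allclose(shortcut.dropna().to_numpy(), explicit)
print(matches)
```

Note that this only covers the 2x2 determinant. For an arbitrary function you still need one of the windowing approaches.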

      

+1

