pct_change method not working on a pandas DataFrame

Using the code below, I am trying to check the percentage change of numeric columns:

import pandas as pd

df = pd.read_csv('./data.txt')
df.pct_change(1)

      

data.txt:

,AAPL,MSFT,^GSPC
2000-01-03,3.625643,39.33463,1455.219971
2000-01-04,3.319964,38.0059,1399.420044
2000-01-05,3.3685480000000005,38.406628000000005,1402.109985
2000-01-06,3.077039,37.12008,1403.449951

      

But the above code returns an error:

/opt/conda/lib/python3.5/site-packages/pandas/core/ops.py in na_op(x, y)
   1187                 if np.prod(xrav.shape) and np.prod(yrav.shape):
   1188                     with np.errstate(all='ignore'):
-> 1189                         result[mask] = op(xrav, yrav)
   1190             elif hasattr(x, 'size'):
   1191                 result = np.empty(x.size, dtype=x.dtype)

TypeError: unsupported operand type(s) for /: 'str' and 'str'

      

How do I use the pct_change method? Should I remove the non-numeric column (in this case the date column), rerun pct_change, and then re-merge the date column?



1 answer


The first column holds the dates as strings. df.pct_change(1) raises a TypeError when it tries to divide those strings.
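A quick way to confirm this diagnosis is to look at the column dtypes before calling pct_change. The sketch below is only an illustration: it inlines the sample rows from the question through io.StringIO instead of reading ./data.txt, and the names CSV_TEXT and non_numeric are made up for the example.

```python
import io

import pandas as pd

# Sample rows copied from the question; inlined so the snippet runs on its own
# (assumption: the real file ./data.txt has the same layout).
CSV_TEXT = """,AAPL,MSFT,^GSPC
2000-01-03,3.625643,39.33463,1455.219971
2000-01-04,3.319964,38.0059,1399.420044
2000-01-05,3.3685480000000005,38.406628000000005,1402.109985
2000-01-06,3.077039,37.12008,1403.449951
"""

df = pd.read_csv(io.StringIO(CSV_TEXT))

# The header of the first column is empty, so pandas names it "Unnamed: 0".
print(df.dtypes)

# List the columns that are not numeric; these are the ones pct_change chokes on.
non_numeric = df.select_dtypes(exclude="number").columns.tolist()
print(non_numeric)
```

Here the only non-numeric column is the unnamed date column, which matches the 'str' and 'str' operands in the traceback.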

One way to avoid the error is to make the date column the index when parsing the CSV, so that only the numeric columns remain as data:

import pandas as pd

df = pd.read_csv('./data.txt', index_col=[0])
print(df.pct_change(1))

      

gives

                AAPL      MSFT     ^GSPC
2000-01-03       NaN       NaN       NaN
2000-01-04 -0.084310 -0.033780 -0.038345
2000-01-05  0.014634  0.010544  0.001922
2000-01-06 -0.086538 -0.033498  0.000956
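If you would rather keep the dates as an ordinary column, as the question suggests, one possible approach is to run pct_change on the numeric columns only and then put the date column back next to the result. This is a sketch under assumptions, not the only way to do it: the data is inlined via io.StringIO, and date_col, changes and result are illustrative names.

```python
import io

import pandas as pd

# Sample rows copied from the question (assumed to match ./data.txt).
CSV_TEXT = """,AAPL,MSFT,^GSPC
2000-01-03,3.625643,39.33463,1455.219971
2000-01-04,3.319964,38.0059,1399.420044
2000-01-05,3.3685480000000005,38.406628000000005,1402.109985
2000-01-06,3.077039,37.12008,1403.449951
"""

df = pd.read_csv(io.StringIO(CSV_TEXT))

# The date column is the first one (pandas names it "Unnamed: 0").
date_col = df.columns[0]

# Compute the percentage change on the numeric columns only.
changes = df.select_dtypes(include="number").pct_change()

# Put the untouched date column back in front; rows align on the default index.
result = pd.concat([df[[date_col]], changes], axis=1)
print(result)
```

Setting the index with index_col is usually less work, but this variant leaves the original column layout intact.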

      




You can parse date strings as dates as well:

df = pd.read_csv('./data.txt', index_col=[0], parse_dates=[0])

      

Then the index will be a DatetimeIndex instead of a plain Index of strings. This lets you do datetime arithmetic on the index and interpolate values based on time.
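As a small illustration of what the DatetimeIndex makes possible, the sketch below subtracts two index entries and fills an artificially blanked value with time-based interpolation. The gap at 2000-01-04 is made up for the example and is not part of the original data; CSV_TEXT, step, aapl and filled are illustrative names.

```python
import io

import pandas as pd

# Sample rows copied from the question (assumed to match ./data.txt).
CSV_TEXT = """,AAPL,MSFT,^GSPC
2000-01-03,3.625643,39.33463,1455.219971
2000-01-04,3.319964,38.0059,1399.420044
2000-01-05,3.3685480000000005,38.406628000000005,1402.109985
2000-01-06,3.077039,37.12008,1403.449951
"""

df = pd.read_csv(io.StringIO(CSV_TEXT), index_col=[0], parse_dates=[0])

# Datetime arithmetic on the index: the difference of two entries is a Timedelta.
step = df.index[1] - df.index[0]
print(step)

# Blank out one value on purpose (hypothetical gap), then fill it by
# interpolating linearly in time between the neighbouring dates.
aapl = df["AAPL"].copy()
aapl.iloc[1] = float("nan")
filled = aapl.interpolate(method="time")
print(filled)
```

With a plain string index, interpolate(method="time") would not be available, which is the practical reason to pass parse_dates.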
