Inserting data from dataframe into numpy array
I am inserting data from a dataframe df with 55 rows into a numpy array matrix_of_coupons_and_facevalues with shape (55,60), using the code below. However, I am getting the error IndexError: index 55 is out of bounds for axis 0 with size 55. The column months_to_maturity contains the numbers [6:6:330].
for (i,row) in df.iterrows():
    matrix_of_coupons_and_facevalues[i,0:(row.months_to_maturity/ 6)-1] = 1/2
    matrix_of_coupons_and_facevalues[i,(row.months_to_maturity/6)-1] = 3/2
Thanks.
For future visitors, here's what happened:
The DataFrame index identifies each row uniquely, so when you delete a row, its index label is removed and a gap is left in the index. This is great when the index is meaningful, but it is not what you want when the index is only there to number your rows. In this case, df contained 55 rows, but the index had holes, so the highest index label was 55 or greater, causing an IndexError on the matrix. As an example:
In [1]: import pandas as pd
In [2]: df = pd.DataFrame([[1,2],[3,4],[5,6]], columns=['x','y'])
In [3]: df
Out[3]:
   x  y
0  1  2
1  3  4
2  5  6
In [4]: df = df.drop(1)
In [5]: df
Out[5]:
   x  y
0  1  2
2  5  6
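To connect this back to the error in the question, here is a minimal sketch that reproduces it with the small frame above. The array name matrix and the variable error_message are illustrative assumptions, not names from the original post.

```python
import numpy as np
import pandas as pd

# Same toy frame as above: dropping label 1 leaves index labels 0 and 2.
df = pd.DataFrame([[1, 2], [3, 4], [5, 6]], columns=["x", "y"]).drop(1)

# The array has one row per DataFrame row, so only positions 0 and 1 exist.
matrix = np.zeros((len(df), 2))

error_message = None
try:
    # iterrows() yields index *labels*, not row positions.
    for label, row in df.iterrows():
        matrix[label, 0] = row["x"]  # label 2 is past the last valid position
except IndexError as exc:
    error_message = str(exc)

print(error_message)
```

The loop fails on the second row because its label (2) is used as if it were a position in a two-row array.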
To fix this situation, you can simply reassign the index as a list containing the correct range of numbers:
In [6]: df.index = list(range(len(df.index)))
In [7]: df
Out[7]:
   x  y
0  1  2
1  5  6
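If you would rather leave the index untouched, another option is to make the loop independent of it. The sketch below is an assumption about what the question's data looks like (maturities 6, 12, ..., 330 months, with a deliberately gappy index); it uses enumerate for the row position and integer division so the column index is an int.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the question's data: 55 maturities in months,
# with a non-contiguous index to mimic rows having been dropped earlier.
months = np.arange(6, 331, 6)  # 6, 12, ..., 330
df = pd.DataFrame({"months_to_maturity": months}, index=months)

matrix_of_coupons_and_facevalues = np.zeros((len(df), 60))

# enumerate() gives the row position (0..54) regardless of the index labels.
for pos, (_, row) in enumerate(df.iterrows()):
    # Integer division keeps the column index an int (plain / yields a float).
    last = int(row["months_to_maturity"]) // 6 - 1
    matrix_of_coupons_and_facevalues[pos, :last] = 1 / 2  # coupon payments
    matrix_of_coupons_and_facevalues[pos, last] = 3 / 2   # final coupon plus face value
```

This keeps the original labels available for anything else that relies on them, at the cost of a slightly noisier loop header.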
Alternatively, you can use pandas' reset_index, which by default keeps the old index as a new column named index:
In [18]: df.drop(1).reset_index()
Out[18]:
   index  x  y
0      0  1  2
1      2  5  6
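If the old labels are not needed, reset_index accepts drop=True, which discards them instead of adding an index column. A short sketch on the same toy frame (the name clean is just illustrative):

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [3, 4], [5, 6]], columns=["x", "y"])

# drop=True throws away the old labels rather than storing them in a column.
clean = df.drop(1).reset_index(drop=True)
print(clean)
```

The result has a contiguous 0..n-1 index and only the original x and y columns, so index labels and row positions coincide again.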