Python (Pandas): fill blank cells with the previous value
I am using Python (Pandas) to manage high-frequency data. Basically, I need to fill in empty cells.
If a cell is empty, it should be filled with the most recent previous observation in the same column.
An example of my original data:
Time bid ask
15:00 . .
15:00 . .
15:02 76 .
15:02 . 77
15:03 . .
15:03 78 .
15:04 . .
15:05 . 80
15:05 . .
15:05 . .
This needs to be converted to:
Time bid ask
15:00 . .
15:00 . .
15:02 76 .
15:02 76 77
15:03 76 77
15:03 78 77
15:04 78 77
15:05 78 80
15:05 78 80
15:05 78 80
This is my code:
# Import
tan = pd.read_csv('sample.csv')
# From here fill the blank cells
first_line = True
mydata = []
with open(tan, 'rb') as f:
    reader = csv.reader(f)
    # loop through each row...
    for row in reader:
        this_row = row
        # now do the blank-cell checking...
        if first_line:
            for colnos in range(len(this_row)):
                if this_row[colnos] == '':
                    this_row[colnos] = 0
            first_line = False
        else:
            for colnos in range(len(this_row)):
                if this_row[colnos] == '':
                    this_row[colnos] = prev_row[colnos]
        mydata.append( [this_row] )
        prev_row = this_row
However, the code doesn't work. Python reports:
TypeError: coercing to Unicode: need string or buffer, DataFrame found
I would be very grateful if you could help me solve this problem. Thanks.
2 answers
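The TypeError occurs because open() expects a file path, but tan is already a DataFrame returned by pd.read_csv(), so there is no need to reopen the file with the csv module.
Use the fillna() method instead. You can specify forward fill as the method, as follows: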
import pandas as pd
data = pd.read_csv('sample.csv')
# Forward fill all columns: each NaN takes the last valid value above it.
# In recent pandas versions, data.ffill() is the preferred equivalent.
data = data.fillna(method='ffill')
# You can also apply it to specific columns only, as below
# data[['bid','ask']] = data[['bid','ask']].fillna(method='ffill')
print(data)
Time bid ask
0 15:00 NaN NaN
1 15:00 NaN NaN
2 15:02 76 NaN
3 15:02 76 77
4 15:03 76 77
5 15:03 78 77
6 15:04 78 77
7 15:05 78 80
8 15:05 78 80
9 15:05 78 80
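For anyone who wants to try this without the original file, here is a minimal self-contained sketch. It assumes the CSV is comma-separated and that missing values are written as "." as in the question's sample; the SAMPLE string and the forward_fill_quotes helper are hypothetical names introduced only for illustration, not part of the original code. It uses DataFrame.ffill() on the bid and ask columns only, so Time is left untouched.

```python
import io

import pandas as pd

# Hypothetical stand-in for sample.csv (assumed comma-separated).
# "." marks a missing value, as in the question's example.
SAMPLE = """Time,bid,ask
15:00,.,.
15:00,.,.
15:02,76,.
15:02,.,77
15:03,.,.
15:03,78,.
15:04,.,.
15:05,.,80
15:05,.,.
15:05,.,.
"""


def forward_fill_quotes(csv_text):
    """Read quote data and forward fill the bid/ask columns."""
    # Treat "." as missing so bid/ask are parsed as numeric columns with NaN.
    df = pd.read_csv(io.StringIO(csv_text), na_values=["."])
    # Forward fill only the quote columns; rows before the first
    # observation stay NaN because there is nothing to carry forward.
    df[["bid", "ask"]] = df[["bid", "ask"]].ffill()
    return df


filled = forward_fill_quotes(SAMPLE)
print(filled)
```

If the real file uses empty cells rather than ".", the na_values argument can be dropped, since pandas already reads empty fields as NaN.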