How to speed up a simple Pandas for / if loop?
I have a fairly simple loop that works, but it takes much longer than I think it should (about 5 minutes).
for i in range(len(df)):
    if pd.isnull(df['Date'][i]):
        df['Date'][i] = df['Date'][i-1]
The goal is to capture the date and time in a data file structured so that the first line for each day has the date as text, while all the other lines for that day are blank. I just want to check whether each value is null and, if so, set it to the previous value.
Is there a more Pandas-y, more efficient way to do this?
Thanks, Ben
1 answer
Use forward fill (ffill):
df['Date'] = df['Date'].ffill()
Demo
import pandas as pd

df = pd.DataFrame(dict(
    Date=['Wed', None, None, 'Thr', None, None],
    Time=[1, 2, 3, 4, 5, 6]
))
df
   Date  Time
0   Wed     1
1  None     2
2  None     3
3   Thr     4
4  None     5
5  None     6
Then:
df['Date'] = df['Date'].ffill()
df
  Date  Time
0  Wed     1
1  Wed     2
2  Wed     3
3  Thr     4
4  Thr     5
5  Thr     6
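As a sanity check, here is a minimal sketch that runs both the loop from the question and the forward fill on the demo frame and compares the results. The names looped, filled, and same are made up for illustration. The loop is written with .loc and starts at row 1, on the assumption that the first row always holds a date, as described in the question.

```python
import pandas as pd

df = pd.DataFrame({
    "Date": ["Wed", None, None, "Thr", None, None],
    "Time": [1, 2, 3, 4, 5, 6],
})

# Loop version of the question's logic, using .loc instead of chained indexing.
# Starts at 1 because row 0 has no previous row to copy from.
looped = df.copy()
for i in range(1, len(looped)):
    if pd.isnull(looped.loc[i, "Date"]):
        looped.loc[i, "Date"] = looped.loc[i - 1, "Date"]

# Vectorized version: forward fill and assign the result back to the column.
filled = df.copy()
filled["Date"] = filled["Date"].ffill()

# Both approaches should produce the same column.
same = looped["Date"].tolist() == filled["Date"].tolist()
print(same)
```

Assigning the result back to the column, rather than calling ffill with inplace=True on df.Date, avoids depending on whether the column access returns a view or a copy.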