Duration of value in Pandas DataFrame
I have the following DataFrame:
            f_1    f_2    f_3
00:00:00  False  False  False
00:05:22   True  False  False
00:06:40   True  False  False
00:06:41  False  False  False
00:06:42  False  False  False
00:06:43  False  False  False
00:06:44  False  False  False
00:06:46  False  False  False
00:06:58  False  False  False
and I want to calculate the total duration for which the series were True. In this example, the only series that was True for any length of time is f_1. I am currently using the following code:
import pandas

result = pandas.Timedelta(0)
# items() replaces iteritems(), which was removed in pandas 2.0
for _, series in falsePositives.items():
    previousTime = None
    previousValue = None
    for currentTime, currentValue in series.items():
        # If the previous row was True, count the time elapsed since it
        if previousValue:
            result += (currentTime - previousTime)
        previousTime = currentTime
        previousValue = currentValue
print(result.total_seconds())
Is there a better solution? I believe there is already a method in Pandas that does this or something similar.
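I think you can create a Series from the index using to_series, take the difference between consecutive timestamps with diff, move each difference up one row with shift(-1) so that every row holds the time until the next row, and convert the result to seconds with dt.total_seconds. Then multiply the boolean DataFrame by this Series with mul, and finally get the totals with sum: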
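import pandas as pd

# If necessary, convert the index to Timedelta
df.index = pd.to_timedelta(df.index)
# Seconds from each row to the next one (NaN for the last row)
s = df.index.to_series().diff().shift(-1).dt.total_seconds()
# Row-wise multiplication: True keeps the interval length, False gives 0
df1 = df.mul(s, 0)
print(df1)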
           f_1  f_2  f_3
00:00:00   0.0  0.0  0.0
00:05:22  78.0  0.0  0.0
00:06:40   1.0  0.0  0.0
00:06:41   0.0  0.0  0.0
00:06:42   0.0  0.0  0.0
00:06:43   0.0  0.0  0.0
00:06:44   0.0  0.0  0.0
00:06:46   0.0  0.0  0.0
00:06:58   NaN  NaN  NaN
print(df1.sum())
f_1    79.0
f_2     0.0
f_3     0.0
dtype: float64
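For anyone who wants to run this end to end, here is a self-contained sketch that rebuilds the example frame from the question and applies the same diff/shift/mul/sum steps. The variable names (step_seconds, true_seconds, per_column, total) are only illustrative, and it assumes the index holds elapsed times that pd.to_timedelta can parse. Summing per_column once more gives a single total over all columns, which is what the loop in the question accumulates.

```python
import pandas as pd

# Rebuild the example frame from the question; the index holds elapsed times.
times = ["00:00:00", "00:05:22", "00:06:40", "00:06:41", "00:06:42",
         "00:06:43", "00:06:44", "00:06:46", "00:06:58"]
df = pd.DataFrame(
    {
        "f_1": [False, True, True] + [False] * 6,
        "f_2": [False] * 9,
        "f_3": [False] * 9,
    },
    index=pd.to_timedelta(times),
)

# Seconds from each row to the next one; the last row has no successor (NaN).
step_seconds = df.index.to_series().diff().shift(-1).dt.total_seconds()

# Row-wise multiplication: True keeps the interval length, False gives 0.
true_seconds = df.mul(step_seconds, axis=0)

# sum() skips the NaN in the last row by default.
per_column = true_seconds.sum()
total = per_column.sum()

print(per_column)
print(total)
```

Note that the last row never contributes anything, because there is no later timestamp to measure against; the loop in the question behaves the same way.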