Duration of value in Pandas DataFrame

I have the following DataFrame:

            f_1    f_2    f_3
00:00:00  False  False  False
00:05:22   True  False  False
00:06:40   True  False  False
00:06:41  False  False  False
00:06:42  False  False  False
00:06:43  False  False  False
00:06:44  False  False  False
00:06:46  False  False  False
00:06:58  False  False  False

      

and I want to calculate the total duration for which each series was True. In this example, the only column that was True for any length of time is f_1. I am currently using the following code:

import pandas

result = pandas.Timedelta(0)

# falsePositives is the DataFrame shown above
for _, series in falsePositives.items():
    previousTime = None
    previousValue = None
    for currentTime, currentValue in series.items():
        # a True value is counted until the next timestamp
        if previousValue:
            result += (currentTime - previousTime)
        previousTime = currentTime
        previousValue = currentValue

print(result.total_seconds())

      

Is there a better solution? I believe there is already a method in Pandas that does this or something similar.

1 answer


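I think you can create a Series from the index with to_series, take the difference between consecutive timestamps with diff, shift the result up one row with shift(-1) so that each row holds the time until the next row, and convert it to seconds with dt.total_seconds.

Then multiply the boolean DataFrame by this Series with mul, and finally get the sum: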



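import pandas as pd

#if necessary convert index to Timedelta
df.index = pd.to_timedelta(df.index)

#seconds from each row to the next one (NaN for the last row)
s = df.index.to_series().diff().shift(-1).dt.total_seconds()
#multiply row-wise: True keeps the duration, False gives 0
df1 = df.mul(s, axis=0)
print(df1)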
           f_1  f_2  f_3
00:00:00   0.0  0.0  0.0
00:05:22  78.0  0.0  0.0
00:06:40   1.0  0.0  0.0
00:06:41   0.0  0.0  0.0
00:06:42   0.0  0.0  0.0
00:06:43   0.0  0.0  0.0
00:06:44   0.0  0.0  0.0
00:06:46   0.0  0.0  0.0
00:06:58   NaN  NaN  NaN

print(df1.sum())
f_1    79.0
f_2     0.0
f_3     0.0
dtype: float64
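
For reference, here is a self-contained sketch that rebuilds the sample data from the question and checks that the vectorized approach agrees with the original loop. The variable names (times, gaps, per_column, loop_total) are my own, and it assumes a recent pandas version where items() is available. Summing the per-column result gives the single total that the loop in the question prints.

```python
import pandas as pd

# Sample data copied from the question; the index is converted to Timedelta
times = ["00:00:00", "00:05:22", "00:06:40", "00:06:41", "00:06:42",
         "00:06:43", "00:06:44", "00:06:46", "00:06:58"]
df = pd.DataFrame(
    {
        "f_1": [False, True, True, False, False, False, False, False, False],
        "f_2": [False] * 9,
        "f_3": [False] * 9,
    },
    index=pd.to_timedelta(times),
)

# Seconds from each row to the next; the last row has no successor, so NaN
gaps = df.index.to_series().diff().shift(-1).dt.total_seconds()

# Row-wise multiply, then sum per column (sum() skips the trailing NaN)
per_column = df.mul(gaps, axis=0).sum()
total_seconds = per_column.sum()

# Cross-check against the loop from the question
loop_total = pd.Timedelta(0)
for _, series in df.items():
    previous_time = None
    previous_value = None
    for current_time, current_value in series.items():
        if previous_value:
            loop_total += current_time - previous_time
        previous_time = current_time
        previous_value = current_value

print(per_column)
print(total_seconds, loop_total.total_seconds())
```

Note that a True value in the very last row contributes nothing in either version, because there is no later timestamp to measure against.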

      
