Pandas time comparison

Question

I want to get the average difference between the overlapping parts of two time series. However, they cover different time ranges and are sampled at different intervals. What's the best way to handle both problems?

Sample data:

Series 1:
2014-08-05 05:03:00    25.194      
2014-08-05 05:08:00    25.196      
2014-08-05 05:13:00    25.197      
2014-08-05 05:18:00    25.199      
2014-08-05 05:23:00    25.192      

Series 2:
2014-08-05 05:12:00    25.000000
2014-08-05 05:13:00    25.000000
2014-08-05 05:14:00    25.000000

pandas time-series resampling

Mixel 15 Aug '14 at 7:49

1 answer

joris · Accepted Answer · 2014-08-15T09:17:09+0000

Is this what you are looking for?

First, you can align both series so that they share the same index (it is also possible to reindex only one of them to the index of the other with reindex):

In [85]: s1, s2 = s1.align(s2)

In [86]: s1
Out[86]: 
2014-08-05 05:03:00    25.194
2014-08-05 05:08:00    25.196
2014-08-05 05:12:00       NaN
2014-08-05 05:13:00    25.197
2014-08-05 05:14:00       NaN
2014-08-05 05:18:00    25.199
2014-08-05 05:23:00    25.192
dtype: float64

In [87]: s2
Out[87]: 
2014-08-05 05:03:00   NaN
2014-08-05 05:08:00   NaN
2014-08-05 05:12:00    25
2014-08-05 05:13:00    25
2014-08-05 05:14:00    25
2014-08-05 05:18:00   NaN
2014-08-05 05:23:00   NaN
dtype: float64
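
For readers who want to reproduce the session, here is a minimal sketch that rebuilds the two sample series from the question and aligns them. The variable names a1, a2, b1, b2 and union are only illustrative. It also shows the manual variant with reindex, on the assumption that the default outer join of align is what is wanted here:

```python
import pandas as pd

# Sample data copied from the question
s1 = pd.Series(
    [25.194, 25.196, 25.197, 25.199, 25.192],
    index=pd.to_datetime([
        "2014-08-05 05:03", "2014-08-05 05:08", "2014-08-05 05:13",
        "2014-08-05 05:18", "2014-08-05 05:23",
    ]),
)
s2 = pd.Series(
    [25.0, 25.0, 25.0],
    index=pd.to_datetime([
        "2014-08-05 05:12", "2014-08-05 05:13", "2014-08-05 05:14",
    ]),
)

# align() uses an outer join by default, so both results
# are indexed by the union of the two original indexes
a1, a2 = s1.align(s2)

# The same result by hand: reindex each series to the union index
union = s1.index.union(s2.index)
b1 = s1.reindex(union)
b2 = s2.reindex(union)
```

Timestamps that exist in only one of the series show up as NaN in the other, which is what the interpolation step then fills in.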

Then you can interpolate the missing values (e.g. linear interpolation based on the time index):

In [88]: s1.interpolate(method='time')
Out[88]: 
2014-08-05 05:03:00    25.1940
2014-08-05 05:08:00    25.1960
2014-08-05 05:12:00    25.1968
2014-08-05 05:13:00    25.1970
2014-08-05 05:14:00    25.1974
2014-08-05 05:18:00    25.1990
2014-08-05 05:23:00    25.1920
dtype: float64

And then just subtract the two interpolated series to get the difference:

In [91]: s = s1.interpolate(method='time') - s2.interpolate(method='time')

In [92]: s
Out[92]: 
2014-08-05 05:03:00       NaN
2014-08-05 05:08:00       NaN
2014-08-05 05:12:00    0.1968
2014-08-05 05:13:00    0.1970
2014-08-05 05:14:00    0.1974
2014-08-05 05:18:00    0.1990
2014-08-05 05:23:00    0.1920
dtype: float64

In [93]: s.mean()
Out[93]: 0.19643999999999906
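
Note that in the output above the rows at 05:18 and 05:23 have values even though Series 2 ends at 05:14: by default interpolate also fills NaNs after the last valid observation by carrying that value forward. If the mean should cover only the truly overlapping range, one option is the limit_area='inside' argument of interpolate, which needs a reasonably recent pandas (it was added in 0.23). The following is a sketch under that assumption, not part of the original answer; the names i1, i2, diff and overlap_mean are illustrative:

```python
import pandas as pd

# Sample data copied from the question
s1 = pd.Series(
    [25.194, 25.196, 25.197, 25.199, 25.192],
    index=pd.to_datetime([
        "2014-08-05 05:03", "2014-08-05 05:08", "2014-08-05 05:13",
        "2014-08-05 05:18", "2014-08-05 05:23",
    ]),
)
s2 = pd.Series(
    [25.0, 25.0, 25.0],
    index=pd.to_datetime([
        "2014-08-05 05:12", "2014-08-05 05:13", "2014-08-05 05:14",
    ]),
)

a1, a2 = s1.align(s2)

# limit_area="inside" fills only NaNs that lie between two valid values,
# so nothing is extrapolated beyond either series' own range
i1 = a1.interpolate(method="time", limit_area="inside")
i2 = a2.interpolate(method="time", limit_area="inside")

# Rows outside the overlap are NaN in at least one series and drop out
diff = (i1 - i2).dropna()
overlap_mean = diff.mean()
```

With the sample data only the three timestamps from 05:12 to 05:14 remain in diff, and the mean comes out at roughly 0.1971 instead of 0.1964.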
