Count if: the task is in a certain time interval

Question

I have a dataframe df1 that contains three columns:

No.     Start Time          End Time
1       07/28/15 08:03 AM   07/28/15 08:09 AM
2       07/28/15 08:06 AM   07/28/15 08:12 AM

The start and end times mark the beginning and end of a specific task. I want to build a new dataframe that counts the number of active tasks at each point in time on a given day. Like this:

Hours   Number of tasks
0:00    
0:01    
..  
..  
23:59

This dataframe should show, for every minute of the day, how many tasks are active. A task that starts at 8:03 AM and ends at 8:09 AM should count towards the following minutes (it ends at 8:09 AM, so it is no longer active at 8:09 AM):

8:03
8:04
8:05
8:06
8:07
8:08

How can I do this in an easy way?

python pandas

F1990, Jul 29 '15 at 7:58

1 answer

Cyrbil · Accepted Answer · 2015-07-29T08:13:26+0000

Not a pandas solution, but you can loop and filter.
A quick approximation, counted per hour rather than per minute:

import datetime

import pandas as pd

# (start, end) pairs; the year is written in full (2015, not 15)
jobs = [
    (datetime.datetime(2015, 7, 28, 8, 3), datetime.datetime(2015, 7, 28, 8, 9)),
    (datetime.datetime(2015, 7, 28, 8, 3), datetime.datetime(2015, 7, 28, 8, 58)),
    (datetime.datetime(2015, 7, 28, 8, 3), datetime.datetime(2015, 7, 28, 10, 3)),
    (datetime.datetime(2015, 7, 28, 8, 3), datetime.datetime(2015, 7, 28, 9, 3)),
    # end precedes start here, so this job is only counted in its start hour
    (datetime.datetime(2015, 7, 28, 10, 3), datetime.datetime(2015, 7, 28, 8, 3)),
]
data = {'hours': [], 'active_jobs': []}
for hour in range(24):
    current_active_jobs = 0
    for job in jobs:
        if job[0].hour == hour:
            # job starts during this hour
            current_active_jobs += 1
        elif job[0].hour < hour and job[1].hour >= hour:
            # job started in an earlier hour and ends in this hour or later
            current_active_jobs += 1
    data['hours'].append(hour)
    data['active_jobs'].append(current_active_jobs)

print(pd.DataFrame(data))
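
For the per-minute counts the question actually asks for, a pandas version could look roughly like the sketch below. It is not part of the original answer. It assumes the column names "Start Time" and "End Time" and the `MM/DD/YY hh:mm AM/PM` format shown in the question, and it treats a task as active from its start minute up to, but not including, its end minute. The helper name `count_active_per_minute` is made up for illustration.

```python
import pandas as pd


def count_active_per_minute(df, day, fmt="%m/%d/%y %I:%M %p"):
    """Hypothetical helper: count tasks active at each minute of `day`."""
    start = pd.to_datetime(df["Start Time"], format=fmt)
    end = pd.to_datetime(df["End Time"], format=fmt)

    # 1440 timestamps, one per minute from 00:00 to 23:59.
    minutes = pd.date_range(pd.Timestamp(day), periods=24 * 60, freq="min")

    # Active at minute t when start <= t < end, so the end minute is excluded.
    counts = [int(((start <= t) & (t < end)).sum()) for t in minutes]

    return pd.DataFrame({
        "Hours": minutes.strftime("%H:%M"),
        "Number of tasks": counts,
    })


# Sample rows copied from the question.
df1 = pd.DataFrame({
    "No.": [1, 2],
    "Start Time": ["07/28/15 08:03 AM", "07/28/15 08:06 AM"],
    "End Time": ["07/28/15 08:09 AM", "07/28/15 08:12 AM"],
})

result = count_active_per_minute(df1, "2015-07-28")
print(result[result["Number of tasks"] > 0])
```

The loop does one vectorised comparison per minute, which should be fine for a few thousand tasks. For much larger frames, a running sum over +1/-1 events at start and end times would likely scale better.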
