Formatting the timedelta64 string output

Question


In a similar vein to this question, I have a numpy.timedelta64 column in a pandas DataFrame. According to this answer to the above question, there is a function pandas.tslib.repr_timedelta64 that nicely displays a timedelta in days, hours:minutes:seconds. I would like to format them only in days and hours.

So, I have the following:

def silly_format(hours):
    (days, hours) = divmod(hours, 24)
    if days > 0 and hours > 0:
        str_time = "{0:.0f} d, {1:.0f} h".format(days, hours)
    elif days > 0:
        str_time = "{0:.0f} d".format(days)
    else:
        str_time = "{0:.0f} h".format(hours)
    return str_time

df["time"].astype("timedelta64[h]").map(silly_format)

which gets me the output I want, but I was wondering whether there is a function in numpy or pandas, similar to datetime.strftime, that can format numpy.timedelta64 according to a provided format string?

I tried to adapt @Jeff's solution further, but it is slower than my answer. Here it is:

days = time_delta.astype("timedelta64[D]").astype(int)
hours = time_delta.astype("timedelta64[h]").astype(int) % 24
result = days.astype(str)
mask = (days > 0) & (hours > 0)
result[mask] = days.astype(str) + ' d, ' + hours.astype(str) + ' h'
result[(hours > 0) & ~mask] = hours.astype(str) + ' h'
result[(days > 0) & ~mask] = days.astype(str) + ' d'


python numpy pandas timedelta

Midnighter 15 Aug 14 at 11:38


3 answers

Here's how to do it in a vectorized way.

In [28]: s = pd.to_timedelta(range(5),unit='d') + pd.offsets.Hour(3)

In [29]: s
Out[29]: 
0   0 days, 03:00:00
1   1 days, 03:00:00
2   2 days, 03:00:00
3   3 days, 03:00:00
4   4 days, 03:00:00
dtype: timedelta64[ns]

In [30]: days = s.astype('timedelta64[D]').astype(int)

In [31]: hours = s.astype('timedelta64[h]').astype(int)-days*24

In [32]: days
Out[32]: 
0    0
1    1
2    2
3    3
4    4
dtype: int64

In [33]: hours
Out[33]: 
0    3
1    3
2    3
3    3
4    3
dtype: int64

In [34]: days.astype(str) + ' d, ' + hours.astype(str) + ' h'
Out[34]: 
0    0 d, 3 h
1    1 d, 3 h
2    2 d, 3 h
3    3 d, 3 h
4    4 d, 3 h
dtype: object

If you want exactly the same as OP:

In [4]: result = days.astype(str) + ' d, ' + hours.astype(str) + ' h'

In [5]: result[days==0] = hours.astype(str) + ' h'

In [6]: result
Out[6]: 
0         3 h
1    1 d, 3 h
2    2 d, 3 h
3    3 d, 3 h
4    4 d, 3 h
dtype: object
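The same days/hours split can also be sketched with the `.dt.components` accessor, which may be useful on pandas versions where casting to `timedelta64[D]` is not supported. This is a minimal sketch, not part of the answer above; the sample values (3, 27, 51, 75 and 99 hours) are chosen to reproduce the series `s`, and the variable names `parts` and `result` are illustrative.

```python
import pandas as pd

# Sample data equivalent to `s` above: 0..4 days, each plus 3 hours.
s = pd.Series(pd.to_timedelta([3, 27, 51, 75, 99], unit="h"))

# .dt.components returns a DataFrame with integer columns
# (days, hours, minutes, seconds, ...), one row per timedelta.
parts = s.dt.components
days, hours = parts["days"], parts["hours"]

# Full "d, h" label for every row.
result = days.astype(str) + " d, " + hours.astype(str) + " h"

# Keep the full label where days > 0, otherwise show hours only.
result = result.where(days > 0, hours.astype(str) + " h")
print(result)
```

Because every step operates on whole Series, the index of `s` is carried through to `result`.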


Jeff 15 Aug 14 at 12:39


I don't know how this is done in pandas, but here's my numpy-only method for your problem:

import numpy as np
t = np.array([200487900000000,180787000000000,400287000000000,188487000000000], dtype='timedelta64[ns]')

days = t.astype('timedelta64[D]').astype(np.int32) # gives: array([2, 2, 4, 2], dtype=int32)
hours = t.astype('timedelta64[h]').astype(np.int32)%24 # gives: array([ 7,  2, 15,  4], dtype=int32)

So I just convert the raw data to the desired unit (letting numpy do the work), which leaves two integer arrays that can be used however we like. To group them in pairs, just do:

>>> np.array([days, hours]).T
array([[ 2,  7],
       [ 2,  2],
       [ 4, 15],
       [ 2,  4]], dtype=int32)

For example:

for row in np.array([days, hours]).T:
    print('%dd %dh' % tuple(row))

gives:

2d 7h
2d 2h
4d 15h
2d 4h
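A variant of the same idea, sketched under the assumption of a NumPy version that supports floor division between timedelta64 values: divide by a one-hour timedelta to get whole hours as plain integers, then split them with `np.divmod`. The names `total_hours` and `labels` are illustrative.

```python
import numpy as np

t = np.array(
    [200487900000000, 180787000000000, 400287000000000, 188487000000000],
    dtype="timedelta64[ns]",
)

# Floor-dividing by a one-hour timedelta yields whole hours as int64.
total_hours = t // np.timedelta64(1, "h")

# Split the whole hours into days and the remaining hours.
days, hours = np.divmod(total_hours, 24)

labels = ["%dd %dh" % (d, h) for d, h in zip(days, hours)]
print(labels)
```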


sebix 15 Aug 14 at 12:35


Midnighter · Accepted Answer · 2014-08-15T13:35:12+0000

While the answers provided by @sebix and @Jeff show a nice way of converting the timedeltas to days and hours, and @Jeff's solution in particular preserves the Series' index, they lacked flexibility in the final formatting of the string. The solution I am using now is:

def delta_format(days, hours):
    if days > 0 and hours > 0:
        return "{0:.0f} d, {1:.0f} h".format(days, hours)
    elif days > 0:
        return "{0:.0f} d".format(days)
    else:
        return "{0:.0f} h".format(hours)

days = time_delta.astype("timedelta64[D]")
hours = time_delta.astype("timedelta64[h]") % 24
# Pair each day count with its hour count (itertools.izip on Python 2).
formatted = [delta_format(d, h) for (d, h) in zip(days, hours)]

which suits me, and I get the index back by inserting that list into the original DataFrame.
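For the strftime-like formatting asked about in the question, one possible approach is a small helper that fills a `str.format` template from the fields of `pd.Timedelta.components`. This is only a sketch: `format_timedelta` and its template field names are hypothetical, not a pandas or numpy function.

```python
import pandas as pd

def format_timedelta(td, fmt="{days} d, {hours} h"):
    """Hypothetical helper: fill a str.format template from a timedelta."""
    # components is a namedtuple of integer fields (days, hours, minutes, ...).
    c = pd.Timedelta(td).components
    return fmt.format(
        days=c.days, hours=c.hours, minutes=c.minutes, seconds=c.seconds
    )

deltas = pd.Series(pd.to_timedelta([3, 27, 48], unit="h"))

# Series.map applies the helper element-wise and keeps the index.
formatted = deltas.map(format_timedelta)
print(formatted)
```

Mapping over the Series keeps its index, so the result can be assigned straight back as a DataFrame column. A different template, such as `"{hours:02d}:{minutes:02d}"`, changes the layout without touching the helper.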
