Formatting the timedelta64 string output

In a similar vein to this question, I have a numpy.timedelta64 column in a pandas DataFrame. According to this answer to that question, there is a function pandas.tslib.repr_timedelta64 that displays a timedelta nicely in days, hours:minutes:seconds. I would like to format them in days and hours only.

So, I have the following:

def silly_format(hours):
    (days, hours) = divmod(hours, 24)
    if days > 0 and hours > 0:
        str_time = "{0:.0f} d, {1:.0f} h".format(days, hours)
    elif days > 0:
        str_time = "{0:.0f} d".format(days)
    else:
        str_time = "{0:.0f} h".format(hours)
    return str_time

df["time"].astype("timedelta64[h]").map(silly_format)

      

which gets me the output I want, but I was wondering whether there is a function in numpy or pandas, similar to datetime.strftime, that can format numpy.timedelta64 according to a provided format string?
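For readers who want to try `silly_format` on its own, here is a minimal sketch. The sample data is hypothetical, and the whole-hour count is obtained by floor-dividing by `pd.Timedelta(hours=1)` instead of `astype("timedelta64[h]")`, in case that cast is not accepted by the installed pandas version.

```python
import pandas as pd

def silly_format(hours):
    # Split a whole-hour count into days and remaining hours.
    (days, hours) = divmod(hours, 24)
    if days > 0 and hours > 0:
        str_time = "{0:.0f} d, {1:.0f} h".format(days, hours)
    elif days > 0:
        str_time = "{0:.0f} d".format(days)
    else:
        str_time = "{0:.0f} h".format(hours)
    return str_time

# Hypothetical sample data, not taken from the question.
df = pd.DataFrame(
    {"time": pd.to_timedelta(["0 days 05:30:00", "1 days 00:10:00", "2 days 07:00:00"])}
)

# Floor division by a one-hour Timedelta yields whole hours as integers.
whole_hours = df["time"] // pd.Timedelta(hours=1)
formatted = whole_hours.map(silly_format)
print(formatted.tolist())
```

Minutes and seconds are discarded by the floor division, which matches the intent of formatting in days and hours only.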


I tried to adapt @Jeff's solution further, but it is slower than my answer. Here it is:

days = time_delta.astype("timedelta64[D]").astype(int)
hours = time_delta.astype("timedelta64[h]").astype(int) % 24
result = days.astype(str)
mask = (days > 0) & (hours > 0)
result[mask] = days.astype(str) + ' d, ' + hours.astype(str) + ' h'
result[(hours > 0) & ~mask] = hours.astype(str) + ' h'
result[(days > 0) & ~mask] = days.astype(str) + ' d'

      



3 answers


While the answers provided by @sebix and @Jeff show a good way to convert timedeltas to days and hours, and @Jeff's solution in particular preserves the index of the Series, they lack flexibility in the final formatting of the string. I am now using the following solution:

def delta_format(days, hours):
    if days > 0 and hours > 0:
        return "{0:.0f} d, {1:.0f} h".format(days, hours)
    elif days > 0:
        return "{0:.0f} d".format(days)
    else:
        return "{0:.0f} h".format(hours)

def format_time_delta(time_delta):  # illustrative wrapper name
    days = time_delta.astype("timedelta64[D]")
    hours = time_delta.astype("timedelta64[h]") % 24
    return [delta_format(d, h) for (d, h) in zip(days, hours)]

      



which suits me well, and I get the index back by inserting that list into the original DataFrame.
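The step of inserting the list back into the DataFrame can be sketched as follows. This is only an illustration under assumptions: the sample data and the `time_str` column name are hypothetical, and days and hours are read from the `.dt.components` accessor, which assumes a pandas version that provides it.

```python
import pandas as pd

def delta_format(days, hours):
    if days > 0 and hours > 0:
        return "{0:.0f} d, {1:.0f} h".format(days, hours)
    elif days > 0:
        return "{0:.0f} d".format(days)
    else:
        return "{0:.0f} h".format(hours)

# Hypothetical sample data with a non-default index.
df = pd.DataFrame(
    {"time": pd.to_timedelta([3, 24, 27, 51], unit="h")},
    index=["a", "b", "c", "d"],
)

# .dt.components splits each timedelta into days, hours, minutes, ...
comp = df["time"].dt.components

# Assigning a plain list is positional, so each string lands on its own row
# and the DataFrame keeps its original index.
df["time_str"] = [delta_format(d, h) for d, h in zip(comp["days"], comp["hours"])]
print(df)
```

Because the list is built in row order, no explicit index handling is needed when it is assigned as a new column.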



Here's how to do it in a vectorized way.

In [28]: s = pd.to_timedelta(range(5),unit='d') + pd.offsets.Hour(3)

In [29]: s
Out[29]: 
0   0 days, 03:00:00
1   1 days, 03:00:00
2   2 days, 03:00:00
3   3 days, 03:00:00
4   4 days, 03:00:00
dtype: timedelta64[ns]

In [30]: days = s.astype('timedelta64[D]').astype(int)

In [31]: hours = s.astype('timedelta64[h]').astype(int)-days*24

In [32]: days
Out[32]: 
0    0
1    1
2    2
3    3
4    4
dtype: int64

In [33]: hours
Out[33]: 
0    3
1    3
2    3
3    3
4    3
dtype: int64

In [34]: days.astype(str) + ' d, ' + hours.astype(str) + ' h'
Out[34]: 
0    0 d, 3 h
1    1 d, 3 h
2    2 d, 3 h
3    3 d, 3 h
4    4 d, 3 h
dtype: object

      



If you want exactly the same output as the OP:

In [4]: result = days.astype(str) + ' d, ' + hours.astype(str) + ' h'

In [5]: result[days==0] = hours.astype(str) + ' h'

In [6]: result
Out[6]: 
0         3 h
1    1 d, 3 h
2    2 d, 3 h
3    3 d, 3 h
4    4 d, 3 h
dtype: object
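The session above does not cover whole-day values, where the OP's format drops the hours part. A minimal sketch that handles all three cases while staying vectorized might look like this; the sample data is hypothetical, non-negative timedeltas are assumed, and `.dt.days` / `.dt.seconds` are used in place of the `astype` casts.

```python
import pandas as pd

# Hypothetical sample data: under a day, mixed, and exactly two days.
s = pd.Series(pd.to_timedelta([3, 27, 48], unit="h"))

days = s.dt.days              # whole days
hours = s.dt.seconds // 3600  # whole hours left over within the day

d_part = days.astype(str) + " d"
h_part = hours.astype(str) + " h"

result = d_part + ", " + h_part                            # default: both parts
result = result.where(days > 0, h_part)                    # under a day: hours only
result = result.where((hours > 0) | (days == 0), d_part)   # whole days: days only
print(result.tolist())
```

`Series.where` keeps values where the condition holds and takes the replacement elsewhere, so the index of `s` is preserved throughout.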

      



I don't know how this is done in pandas, but here's my numpy-only method for your problem:

import numpy as np
t = np.array([200487900000000,180787000000000,400287000000000,188487000000000], dtype='timedelta64[ns]')

days = t.astype('timedelta64[D]').astype(np.int32) # gives: array([2, 2, 4, 2], dtype=int32)
hours = t.astype('timedelta64[h]').astype(np.int32)%24 # gives: array([ 7,  2, 15,  4], dtype=int32)

      

So I simply cast the raw data to the desired unit (letting numpy do the conversion), which leaves two integer arrays that can be used however we like. To group them in pairs, just do:

>>> np.array([days, hours]).T
array([[ 2,  7],
       [ 2,  2],
       [ 4, 15],
       [ 2,  4]], dtype=int32)

      

For example:

d = np.array([days, hours]).T
for row in d:
    print('%dd %dh' % tuple(row))

      

gives:

2d 7h
2d 2h
4d 15h
2d 4h
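Building on the same arrays, a strftime-like helper can be sketched with `str.format` templates. The function name `format_timedelta64` and the `{days}` / `{hours}` placeholder names are hypothetical and not part of numpy; the sketch assumes non-negative timedeltas.

```python
import numpy as np

t = np.array(
    [200487900000000, 180787000000000, 400287000000000, 188487000000000],
    dtype="timedelta64[ns]",
)

def format_timedelta64(values, fmt="{days}d {hours}h"):
    """Hypothetical helper: format each timedelta64 using a str.format template."""
    # Cast to whole hours, then split into days and remaining hours in one step.
    total_hours = values.astype("timedelta64[h]").astype(np.int64)
    days, hours = np.divmod(total_hours, 24)
    return [fmt.format(days=int(d), hours=int(h)) for d, h in zip(days, hours)]

print(format_timedelta64(t))
print(format_timedelta64(t, "{days} days, {hours:02d} h"))
```

Passing the template as an argument keeps the unit conversion in numpy while leaving the final string layout to the caller.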

      
