PyTables: how do I iterate over unique values?
I have a dataset in PyTables that looks something like this:
    class myData(IsDescription):
        date = StringCol(16)
        item = Int32Col()
I have multiple items for one date, for example:
'2010-01-01', 5
'2010-01-01', 6
'2010-01-02', 7
'2010-01-02', 8
Is there a way to iterate over the unique dates, and then over the items for each date? I mean something like this (pseudocode):
    for date in DATE:
        print date
        for item in ITEM:
            print item
I am not familiar with the inner workings of PyTables (so this may not be quite what you are looking for), but the groupby function in the itertools module is very useful in situations like this. Note the sorting step below: it matters here because groupby only groups consecutive items, so the data must be sorted by date for all items with the same date to land in one group. See the itertools documentation for more information.
In [1]: from itertools import groupby
In [2]: from operator import attrgetter
In [3]: class myData(object):
   ...:     def __init__(self, date, item):
   ...:         self.date = date
   ...:         self.item = item
   ...:
In [4]: l = [myData('2012-01-01', 'thing'), myData('2012-01-01', 'another thing'), myData('2013-01-01', 'and another')]
In [5]: l_sorted = sorted(l, key=attrgetter('date'))
In [6]: for date, my_objects in groupby(l_sorted, key=attrgetter('date')):
   ...:     print(date)
   ...:     for obj in my_objects:
   ...:         print(obj.item)
   ...:
2012-01-01
thing
another thing
2013-01-01
and another
The general pattern here is to get a list or other container holding the objects you want to group. You then sort that list by the attribute you will later group on (in this case date). Next, you pass the sorted list to groupby, which emits two values on each iteration: a key, which is the value the group shares (here, the date of each group), and a group, which contains all of the objects that have that key. You can then iterate over the group, pulling the item attribute from each object.
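Applied to data shaped like the question's, the same pattern might look like the sketch below. It assumes the rows have already been pulled out of the table into plain (date, item) tuples; the rows list and the group_items_by_date helper are hypothetical names used only for illustration, and nothing here calls PyTables itself.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical stand-in for rows read from the table: (date, item) pairs,
# deliberately out of order to show why the sort is needed.
rows = [
    ('2010-01-02', 7),
    ('2010-01-01', 5),
    ('2010-01-02', 8),
    ('2010-01-01', 6),
]


def group_items_by_date(rows):
    """Return a list of (date, [items]) pairs, one per unique date."""
    # groupby only merges adjacent rows, so sort by date first.
    # sorted() is stable, so items keep their relative order within a date.
    rows_sorted = sorted(rows, key=itemgetter(0))
    return [
        (date, [item for _, item in group])
        for date, group in groupby(rows_sorted, key=itemgetter(0))
    ]


grouped = group_items_by_date(rows)
for date, items in grouped:
    print(date)
    for item in items:
        print(item)
```

Sorting costs O(n log n) and holds every row in memory, so for a very large table it may be worth checking whether the rows can be read in date order to begin with.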