Python pandas: Reading specific values from HDF5 files using read_hdf and HDFStore.select

So, I created an hdf5 file with a simple dataset that looks like

>>> pd.read_hdf('STORAGE2.h5', 'table')
   A  B
0  0  0
1  1  1
2  2  2
3  3  3
4  4  4

      

Using this script

import pandas as pd

df_tl = pd.DataFrame(dict(A=list(range(5)), B=list(range(5))))

# append=True writes the frame in the queryable "table" format
df_tl.to_hdf('STORAGE2.h5', key='table', append=True)

# open the store after writing; it is used for store.select() below
store = pd.HDFStore('STORAGE2.h5')

      

I know that I can select columns using

x = pd.read_hdf('STORAGE2.h5', 'table',  columns=['A'])

      

or

x = store.select('table', where = 'columns=A')

      

How would I select all rows where column "A" equals 3, or where it holds a specific string such as "foo"? In a pandas DataFrame, I would use df[df["A"]==3] or df[df["A"]=='foo'].

Also, does it make a difference in efficiency whether I use read_hdf() or store.select()?



1 answer


You need to specify data_columns= when writing (you can use True to make all columns searchable).

(FYI, mode='w' starts the file over and is just for my example.)



In [50]: df_tl.to_hdf('STORAGE2.h5', key='table', append=True, mode='w', data_columns=['A'])

In [51]: pd.read_hdf('STORAGE2.h5','table',where='A>2')
Out[51]: 
   A  B
3  3  3
4  4  4
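
The same where= mechanism should also cover the equality cases from the question. Below is a minimal sketch rather than part of the original session: the file name example.h5 and the string column B are made up for illustration, and it assumes PyTables is installed. It writes a small frame with data_columns=True, then queries it by number and by string, once through read_hdf and once through an open store.

```python
import os
import tempfile

import pandas as pd

# Hypothetical file and data, chosen only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "example.h5")
df = pd.DataFrame({"A": [0, 1, 2, 3, 4],
                   "B": ["foo", "bar", "foo", "baz", "foo"]})

# Table format plus data_columns=True makes every column queryable on disk.
df.to_hdf(path, key="table", mode="w", format="table", data_columns=True)

# Numeric equality: the on-disk counterpart of df[df["A"] == 3].
eq_num = pd.read_hdf(path, key="table", where="A == 3")

# String equality: the literal is quoted inside the where expression.
eq_str = pd.read_hdf(path, key="table", where='B == "foo"')

# The same kind of query through an open store, also restricting the columns.
with pd.HDFStore(path, mode="r") as store:
    eq_store = store.select("table", where="A == 3", columns=["B"])

print(eq_num)
print(eq_str)
print(eq_store)
```

On efficiency: as far as I know, read_hdf opens the file, calls HDFStore.select, and closes the file again, so a single query should cost about the same either way. Keeping a store open mainly saves the reopen when you run many queries against the same file.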

      
