Pandas: mixed-type DataFrame cannot be serialized to HDF5?

In Pandas, it seems that I cannot store a mixed-type dataframe:

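from pandas import DataFrame, HDFStore

store = HDFStore('play.h5')
df = DataFrame([{'a': 1, 'b': 'hello'}, {'a': 5, 'b': 'world'}])  # column 'b' holds strings
store.put('df', df, table=True, compression='zlib')  # raises the exception quoted below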

This gives Exception: Cannot currently store mixed-type DataFrame objects in Table format

Is this due to some inherent limitation of Pandas, or is it simply not implemented yet? It looks like HDFStore won't be very useful with this limitation, since a lot of data is of mixed types.



1 answer


The table format stores all data in record form, i.e. all values are stored in a single column. There is an alternative table layout that could be used (one column per DataFrame column), but I haven't implemented it yet. Basically, the table format is designed to support queries.
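To illustrate what "designed to support queries" means, here is a minimal sketch of an on-disk query against a table-format store. It assumes a recent pandas with PyTables installed, where the keyword is format='table' rather than the table=True used in the question. The file name query.h5 and the all-numeric sample frame are made up for the example.

```python
import os
import tempfile

import pandas as pd

# Homogeneous (all-numeric) frame; the sample data is hypothetical.
df = pd.DataFrame({'a': [1, 5, 9], 'b': [0.5, 1.5, 2.5]})

# Write into a throwaway directory so the example leaves nothing behind.
path = os.path.join(tempfile.mkdtemp(), 'query.h5')
with pd.HDFStore(path) as store:
    # Assumption: format='table' is the current spelling of table=True.
    # data_columns=True makes every column usable in a where clause.
    store.put('df', df, format='table', data_columns=True)
    # Only the rows matching the condition are read back.
    subset = store.select('df', where='a > 2')

print(subset)
```

The fixed (non-table) format cannot be queried this way; it is read back as a whole.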



A mixed-type DataFrame can be stored if you use table=False. More work on this functionality would be welcome.
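A minimal sketch of that workaround, using the DataFrame from the question. It assumes a recent pandas with PyTables installed, where the non-table layout is selected with format='fixed' instead of table=False. The temporary path is only there to keep the example self-contained.

```python
import os
import tempfile

import pandas as pd

# Same mixed-type data as in the question: integers in 'a', strings in 'b'.
df = pd.DataFrame([{'a': 1, 'b': 'hello'}, {'a': 5, 'b': 'world'}])

# Write into a throwaway directory so the example leaves nothing behind.
path = os.path.join(tempfile.mkdtemp(), 'play.h5')
with pd.HDFStore(path) as store:
    # Assumption: format='fixed' is the current equivalent of table=False.
    store.put('df', df, format='fixed')
    # Read the whole frame back; the fixed format does not support queries.
    restored = store['df']

print(restored)
```

The trade-off is the one described above: the fixed layout accepts mixed columns but has to be read back in full.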
