Pandas SparseDataFrame from a list of dicts

I'm trying to convert a Python list of dicts to a Pandas DataFrame. Since each dict has different keys, the result takes up too much memory. Since most values are NaN, a SparseDataFrame should be useful in this case.

import pandas

df = pandas.DataFrame(keyword_data).to_sparse(fill_value=.0)

      

This works, but it takes up a lot of memory because the regular DataFrame is created first, and sometimes it fails with a MemoryError.

Is it possible to create a SparseDataFrame from this data without this intermediate step? The Pandas documentation is of little help in this case. Doing this:

pandas.SparseDataFrame(keyword_data, default_fill_value=.0)

      

Raises:

TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''


The data looks something like this:

[{'a': 0.672366,
  'b': 0.667276,
  # ...
 },
 {'c': 0.507752,
  'd': 0.532593,
  'e': 0.507793
  # ...
 },
 # ...
]

      

Keys are always strings, and different dicts have different keys; values are floating-point numbers.

Is there a way to create a SparseDataFrame directly from this data without going through a regular DataFrame?
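To make concrete what "directly" could mean here, below is a minimal sketch using only the standard library. It walks the dicts once and collects coordinate-style (row, column, value) triplets, so that only the values actually present are stored. The function name `dicts_to_triplets` and the `sample` list are hypothetical and only for illustration. It is an assumption, not something confirmed here, that these triplets can then be handed to a sparse container (for example, `scipy.sparse.coo_matrix` accepts data in `(values, (rows, cols))` form) and wrapped by pandas without creating a dense frame.

```python
def dicts_to_triplets(records):
    """Turn a list of dicts into COO-style (rows, cols, values) lists.

    Only the entries actually present in each dict are stored, so no
    dense rows-by-columns table is ever built.
    """
    column_index = {}  # key -> column number, in first-seen order
    rows, cols, values = [], [], []
    for row_number, record in enumerate(records):
        for key, value in record.items():
            # Assign the next free column number the first time a key appears.
            col_number = column_index.setdefault(key, len(column_index))
            rows.append(row_number)
            cols.append(col_number)
            values.append(value)
    columns = list(column_index)  # column labels, ordered by column number
    return rows, cols, values, columns


# Hypothetical sample with the same shape as keyword_data
sample = [
    {'a': 0.672366, 'b': 0.667276},
    {'c': 0.507752, 'd': 0.532593, 'e': 0.507793},
]
rows, cols, values, columns = dicts_to_triplets(sample)
```

Memory use in this sketch grows with the number of stored values rather than with rows times columns, which is the property I am after.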
