Using OneHotEncoder with sklearn_pandas DataFrameMapper

I am trying to use the sklearn_pandas DataFrameMapper. It takes column names together with the preprocessing function required for each column. Thus,

mapper = sklearn_pandas.DataFrameMapper([


season is an int64 col in my pandas DataFrame.

This gives me the following error: Too many values to unpack. I understand that OneHotEncoder accepts a 2-D sample and not a 1-D sample.

How can I use OneHotEncoder with sklearn_pandas, or is it not possible?



1 answer

The official version of sklearn-pandas has some problems when working with one-dimensional arrays and transformations. Try the following fork:

However, I think you can accomplish what you want using LabelBinarizer (as in the sklearn_pandas examples) instead of OneHotEncoder.
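As a rough sketch of why LabelBinarizer sidesteps the problem: it accepts a one-dimensional array of labels directly. The snippet below uses plain scikit-learn and pandas rather than the mapper itself, and the sample values in the season column are made up for illustration (the question only says the column is int64).

```python
import pandas as pd
from sklearn.preprocessing import LabelBinarizer

# Hypothetical data: only the column name and dtype come from the question.
df = pd.DataFrame({"season": [1, 2, 3, 4, 1, 3]})

# A 1-D array of labels, similar to what a plain string selector passes on.
season_1d = df["season"].values

# LabelBinarizer works on 1-D input and returns one indicator column per class.
binarizer = LabelBinarizer()
season_onehot = binarizer.fit_transform(season_1d)

print(binarizer.classes_)
print(season_onehot.shape)
```

Note that with only two distinct values LabelBinarizer returns a single 0/1 column instead of two, so the output is not always identical to what OneHotEncoder would give.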


UPDATE 2015-11-28

As of sklearn-pandas>=0.0.12, you can solve the problem by doing the following:

mapper = sklearn_pandas.DataFrameMapper([


From the docs:

The difference between specifying a column selector as 'column' (as a simple string) and ['column'] (as a list with one element) is the shape of the array that is passed to the transformer. In the first case, a one-dimensional array will be passed, while in the second case it will be a two-dimensional array with one column, i.e. a column vector.
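To make the shape difference concrete, here is a minimal sketch that imitates the two selectors with ordinary pandas indexing. It does not call DataFrameMapper, so treating df["season"] and df[["season"]] as stand-ins for the two selector forms is an assumption, and the sample values are made up.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Hypothetical data: only the column name and dtype come from the question.
df = pd.DataFrame({"season": [1, 2, 3, 4, 1, 3]})

# Stand-in for the 'season' selector: a one-dimensional array.
one_dim = df["season"].values
# Stand-in for the ['season'] selector: a two-dimensional column vector.
two_dim = df[["season"]].values

print(one_dim.shape)
print(two_dim.shape)

# OneHotEncoder expects 2-D input, so it is fitted on the column vector.
# The default output is a sparse matrix; toarray() makes it dense for display.
encoder = OneHotEncoder()
encoded = encoder.fit_transform(two_dim).toarray()
print(encoded.shape)
```

The 2-D form is what OneHotEncoder needs, which is why the list selector is the one to use with it.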


