Target column encoding for classification in TensorFlow
I've been working with TensorFlow for some time now, but one thing I can't figure out is how to encode a categorical target column for a tf.contrib.learn model.
I know that we define an input function similar to the code below:
def input_fn(joined):
    continuous_cols = {k: tf.constant(joined[k].values)
                       for k in CONTINUOUS_COLUMNS}
    categorical_cols = {k: tf.SparseTensor(
        indices=[[i, 0] for i in range(joined[k].size)],
        values=joined[k].values,
        dense_shape=[joined[k].size, 1])
        for k in CATEGORICAL_COLUMNS}
    # Merges the two dictionaries into one.
    feature_cols = dict(continuous_cols.items() | categorical_cols.items())
    target = tf.constant(joined[target_col].values)
    return feature_cols, target

def train_input_fn():
    return input_fn(train_frame)

def test_input_fn():
    return input_fn(test_frame)
This works well for binary classification, or for cases where the target variable is pre-encoded with LabelEncoder or some other method. But how do I encode this variable with TensorFlow itself so that tf.contrib.learn will accept it?
I tried changing the code for the target column as follows:
target = tf.SparseTensor(
    indices=[[i, 0] for i in range(joined[target_col].size)],
    values=joined[target_col].values,
    dense_shape=[joined[target_col].size, 1])
Since the target is a string variable, I thought a sparse tensor would work, but this gives an error:
ValueError: SparseTensor is not supported.
Can anyone help me determine what I should use in the input function for the categorical target variable when the model is a DNNClassifier?
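For reference, the pre-encoding workaround I mentioned can be sketched roughly as below. It maps each distinct string label to an integer id, which is essentially what LabelEncoder does. The helper name encode_labels and the sample labels are hypothetical and only there for illustration; this is not the TensorFlow-side encoding I am asking about.

```python
def encode_labels(labels):
    """Map each distinct label to an integer id in [0, n_classes).

    Hypothetical helper: mirrors what a label encoder does, using only
    the standard library.
    """
    # Sort the distinct labels so the mapping is deterministic.
    classes = sorted(set(labels))
    class_to_id = {label: i for i, label in enumerate(classes)}
    ids = [class_to_id[label] for label in labels]
    return ids, classes


# Made-up sample target values, standing in for joined[target_col].values.
raw_target = ["cat", "dog", "bird", "dog", "cat"]
target_ids, classes = encode_labels(raw_target)
```

As far as I understand, the integer ids could then be passed to tf.constant in the input function, with n_classes=len(classes) given to the DNNClassifier, but I would prefer not to depend on a separate pre-processing step.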