How to predict on new text in Keras when using a built-in dataset
I am looking at the Keras examples and found one that uses an LSTM to classify sentiment on the built-in IMDB dataset ( https://github.com/fchollet/keras/blob/master/examples/imdb_lstm.py ).
When inspecting the data, each review is represented as an array of numbers, which I assume are indices into a dictionary built from that dataset.
My question, however, is how can I feed a new piece of text (something I write myself) into this model to get a prediction? How do I access this vocabulary of words?
With that, I could preprocess the input text into an array of numbers and feed it in. Thanks!
2 answers
When predicting on new text, you must follow the same steps you used for training:
- Preprocess the new sentence.
- Convert the text to a vector of indices using word_index.
- Pad the vector to the same length used during training.
- Flatten the array, wrap it as a batch of one, and pass it as input to your model.
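import numpy as np
from keras.datasets import imdb
from keras.preprocessing.sequence import pad_sequences
words = clean_text(text)  # your own preprocessing; must return a list of lowercase words
word_index = imdb.get_word_index()
# imdb.load_data() shifts every index by 3 by default (index_from=3), so apply the same shift here
x_test = [[word_index[w] + 3 for w in words if w in word_index]]
x_test = pad_sequences(x_test, maxlen=maxlen)  # use the same maxlen as for the training data
vector = np.array([x_test.flatten()])  # shape (1, maxlen): a batch containing one review
# predict_classes exists only in older Keras; in newer versions use model.predict(vector) > 0.5
model.predict_classes(vector)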
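To see what the encoding and padding steps produce without downloading the dataset, here is a minimal sketch in plain Python. The helper name `encode_review` and the toy vocabulary are hypothetical. The reserved ids (0 for padding, 1 for start, 2 for out-of-vocabulary) and the offset of 3 are assumed to match the defaults of `imdb.load_data()`, and the left-padding mimics the default behaviour of `pad_sequences`. Unlike the snippet above, it prepends the start id and maps unknown words to the out-of-vocabulary id instead of dropping them, which is a design choice and not something the original example requires.

```python
import re

# Assumed to mirror imdb.load_data() defaults: 0 = padding, 1 = start, 2 = out-of-vocabulary.
START_CHAR, OOV_CHAR, INDEX_FROM = 1, 2, 3


def encode_review(text, word_index, maxlen, num_words=None):
    """Turn raw text into one fixed-length list of integer ids (hypothetical helper)."""
    # Simple stand-in for clean_text: lowercase and keep runs of letters, digits, apostrophes.
    words = re.findall(r"[a-z0-9']+", text.lower())
    ids = [START_CHAR]
    for w in words:
        idx = word_index.get(w)
        if idx is None:
            ids.append(OOV_CHAR)  # word is not in the vocabulary at all
            continue
        idx += INDEX_FROM
        # Ids at or above num_words were never seen by the model, so map them to OOV.
        ids.append(idx if num_words is None or idx < num_words else OOV_CHAR)
    ids = ids[-maxlen:]                     # keep only the last maxlen ids
    return [0] * (maxlen - len(ids)) + ids  # left-pad with zeros up to maxlen


# Toy vocabulary for illustration only; in practice use imdb.get_word_index().
toy_index = {"the": 1, "movie": 2, "was": 3, "great": 4}
encoded = encode_review("The movie was GREAT!", toy_index, maxlen=8)
print(encoded)  # [0, 0, 0, 1, 4, 5, 6, 7]
```

If you trained with a vocabulary limit (max_features in the example script), pass the same value as num_words so that rarer words do not produce ids outside the embedding layer's range.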