Poor performance of a shared LSTM in Keras
I'm new to deep learning and to Keras in particular. I am using the dataset
at https://github.com/brmson/dataset-sts/blob/master/data/sts/sick2014/SICK_train.txt and model the relatedness score as a categorical variable.
The data is very simple: two sentences and a categorical target. I also use the pre-trained GloVe embeddings (glove.6B, 100-dimensional), which you can download at http://nlp.stanford.edu/data/glove.6B.zip .
The problem is that I get ~65% accuracy, which I suspect is very low for this dataset. Another interesting thing: if I replace the pre-trained embeddings with random vectors, I still get ~65% accuracy, which tells me the embeddings don't matter. Here's the relevant code:
```python
from keras.layers import Input, Embedding, LSTM, Dense, concatenate
from keras.models import Model
from keras.preprocessing.text import Tokenizer
from keras.preprocessing.sequence import pad_sequences
from keras.utils import to_categorical

EMBED_SIZE = 100
MAX_WORDS = 1000
MAX_SEQ = 20

# Keras tokenizer, stop words were not removed!
# texts contains all sentences
tokenizer = Tokenizer(num_words=MAX_WORDS)
tokenizer.fit_on_texts(texts)
word_index = tokenizer.word_index

# q1 and q2 are the individual sentences
q1 = pad_sequences(tokenizer.texts_to_sequences(q1), maxlen=MAX_SEQ)
q2 = pad_sequences(tokenizer.texts_to_sequences(q2), maxlen=MAX_SEQ)
labels = to_categorical(labels)

# load GloVe and create the embedding matrix
embeddings_index = load_embed()
embedding_matrix = get_embedding_matrix(word_index, embeddings_index)

I1 = Input(shape=(MAX_SEQ,), dtype='int32')
I2 = Input(shape=(MAX_SEQ,), dtype='int32')
embedding_layer = Embedding(input_dim=len(word_index) + 1,
                            output_dim=EMBED_SIZE,
                            weights=[embedding_matrix],
                            input_length=MAX_SEQ,
                            trainable=False)
o1 = embedding_layer(I1)
o2 = embedding_layer(I2)

# the same LSTM processes both sentences
shared_lstm = LSTM(20, return_sequences=False)
l1 = shared_lstm(o1)
l2 = shared_lstm(o2)

merged_vector = concatenate([l1, l2])
predictions = Dense(10)(merged_vector)
predictions = Dense(len(labels), activation='softmax')(predictions)

model = Model(inputs=[I1, I2], outputs=predictions)
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit([q1, q2], labels, epochs=50, batch_size=5, validation_split=0.3)
```
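The helpers `load_embed` and `get_embedding_matrix` are not shown in the post; a minimal sketch of what they typically look like (the function names come from the code above, the GloVe file path is an assumption):

```python
import numpy as np

EMBED_SIZE = 100

def load_embed(path="glove.6B.100d.txt"):
    """Parse a GloVe text file into a {word: vector} dict."""
    embeddings_index = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            # first token is the word, the rest are the vector components
            embeddings_index[parts[0]] = np.asarray(parts[1:], dtype="float32")
    return embeddings_index

def get_embedding_matrix(word_index, embeddings_index, embed_size=EMBED_SIZE):
    """Row i holds the vector for the word with tokenizer index i.

    Words without a pre-trained vector keep an all-zero row, which is
    why index 0 (reserved for padding) also stays zero.
    """
    matrix = np.zeros((len(word_index) + 1, embed_size))
    for word, i in word_index.items():
        vector = embeddings_index.get(word)
        if vector is not None:
            matrix[i] = vector
    return matrix
```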
The idea is to pass both sentences through the same Embedding + LSTM, then send the concatenated output through the Dense layers.
Is there something wrong with the code? I tried adding more layers, but it doesn't help. I'd appreciate any advice on this, thanks!
To be sure, I checked the fraction of words from the text that are present in the embeddings: 93.38%. So this is not the problem.
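For reference, a coverage figure like the 93.38% above can be computed from the tokenizer's `word_index` and the loaded embeddings dict; a minimal sketch (names assumed from the code earlier in the post):

```python
def embedding_coverage(word_index, embeddings_index):
    """Fraction of the tokenizer vocabulary that has a pre-trained vector."""
    hits = sum(1 for word in word_index if word in embeddings_index)
    return hits / len(word_index)
```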
I just did the following experiment: for each sentence, compute the average embedding vector; take the difference between the two averages and use it as the feature vector for a simple feed-forward net. The accuracy reaches ~71%. So it seems something in my model is messed up rather than the embedding weights, even the fixed (non-trainable) ones I set.
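A rough sketch of the feature extraction for this averaging baseline, assuming tokenized sentences and the `embeddings_index` dict from above (the small classifier itself is omitted; function names are my own):

```python
import numpy as np

def sentence_vector(tokens, embeddings_index, embed_size=100):
    """Average the embeddings of the tokens that have a vector; zeros if none do."""
    vectors = [embeddings_index[t] for t in tokens if t in embeddings_index]
    if not vectors:
        return np.zeros(embed_size)
    return np.mean(vectors, axis=0)

def pair_feature(tokens1, tokens2, embeddings_index, embed_size=100):
    """Difference of the two sentence averages, used as input to a simple dense net."""
    return (sentence_vector(tokens1, embeddings_index, embed_size)
            - sentence_vector(tokens2, embeddings_index, embed_size))
```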
This paper http://www.aaai.org/ocs/index.php/AAAI/AAAI16/paper/view/12195/12023 reports results for this particular dataset; a decent accuracy should be around 0.81.