What does the TensorFlow bow_encoder do / return?

Can someone explain what the TensorFlow BoW encoder does and returns? I would expect a vector of word counts per document (as in sklearn), but it seems to do something a bit quirkier.

In this example:

https://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/learn/text_classification.py

features = encoders.bow_encoder(
    features, vocab_size=n_words, embed_dim=EMBEDDING_SIZE)

embed_dim is passed in, and I also don't understand what it does in the context of BoW encoding. The documentation is unfortunately not very helpful. I could try to work through the TensorFlow code, but I would appreciate a high-level explanation.
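For reference, this is the kind of per-document word-count output I had in mind from sklearn (a minimal CountVectorizer example):

from sklearn.feature_extraction.text import CountVectorizer

docs = ["the cat sat", "the cat sat on the mat"]
vectorizer = CountVectorizer()
counts = vectorizer.fit_transform(docs)

print(sorted(vectorizer.vocabulary_))  # ['cat', 'mat', 'on', 'sat', 'the']
print(counts.toarray())
# [[1 0 0 1 1]
#  [1 1 1 1 2]]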



1 answer


In the classical BoW model, each word is represented by an ID, i.e. a sparse one-hot vector. bow_encoder maps these sparse vectors into a dense layer whose size is specified by embed_dim. bow_encoder is used to learn a dense vector representation for a word or a text (as in the word2vec model).
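In other words, embed_dim sets the width of a learned lookup table with one row per vocabulary entry. A minimal sketch of that mapping (random weights standing in for learned ones, names are illustrative):

import numpy as np

vocab_size, embed_dim = 1000, 50    # embed_dim as passed to bow_encoder

# Learned embedding matrix: one dense row of length embed_dim per word id.
embeddings = np.random.randn(vocab_size, embed_dim).astype(np.float32)

word_id = 42                        # the "sparse" representation is just an id
dense_vector = embeddings[word_id]  # the dense representation, shape (50,)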

From the TensorFlow documentation for bow_encoder: "Maps a sequence of symbols into a vector per example by averaging embeddings."



Thus: if the input to bow_encoder is a single word, it is simply mapped to its embedding. If the input is a sentence (or longer text), each word is embedded individually and the final vector is the average of the word embeddings.
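As a sketch of that computation in TensorFlow 1.x style (illustrative shapes and variable names, not the actual bow_encoder internals):

import tensorflow as tf

vocab_size, embed_dim = 1000, 50

# Word ids per document, padded to a common length: [batch, seq_len].
word_ids = tf.placeholder(tf.int64, shape=[None, None])

# One trainable dense vector per vocabulary entry.
embeddings = tf.get_variable("embeddings", [vocab_size, embed_dim])

# Look up each word's embedding: [batch, seq_len, embed_dim].
word_vectors = tf.nn.embedding_lookup(embeddings, word_ids)

# Average over the sequence: one [batch, embed_dim] vector per document.
doc_vectors = tf.reduce_mean(word_vectors, axis=1)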
