Why doesn't gensim Word2Vec recognize the compute_loss keyword?

Question

According to the gensim.models.Word2Vec API reference, compute_loss is a valid keyword. However, I am getting an "unexpected keyword argument" error.

UPDATE

The Word2Vec class on GitHub has the compute_loss keyword, but my local library does not. The documentation and the distributed gensim library have diverged from each other. I found that the win-64/gensim-2.2.0-np113py35_0.tar.bz2 file in the conda repository has not been updated.

However, after removing gensim from conda, running pip install gensim didn't change anything; it still doesn't work.

Apparently the source on GitHub and the distributed library are different, but the tutorial seems to assume the code is the same as on GitHub.

/ END UPDATE

I downloaded and followed the Word2Vec tutorial.

In input cell [25], the first cell after the heading "Learning Loss", I get an error in the Word2Vec class's initializer.

Input:

# instantiating and training the Word2Vec model
model_with_loss = gensim.models.Word2Vec(sentences, min_count=1,
                                         compute_loss=True, hs=0, sg=1, seed=42)

# getting the training loss value
training_loss = model_with_loss.get_latest_training_loss()
print(training_loss)

Output:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-25-c2933abf4b08> in <module>()
      1 # instantiating and training the Word2Vec model
----> 2 model_with_loss = gensim.models.Word2Vec(sentences, min_count=1, compute_loss=True, hs=0, sg=1, seed=42)
      3 
      4 # getting the training loss value
      5 training_loss = model_with_loss.get_latest_training_loss()

TypeError: __init__() got an unexpected keyword argument 'compute_loss'

I have gensim 2.2.0 installed via conda, plus a fresh clone of the gensim repository (on a study laptop). I am using 64-bit Python 3.5.3 (Anaconda) on Windows 10.

I looked for others who had run into the same problem, but without success.

Do you know the reason for this and how to fix it? Apparently the source on GitHub and the distributed library are different, but the tutorial seems to assume the code is the same as on GitHub.

I also posted a question on the official mailing list earlier.

python gensim word2vec

Thomas fauskanger Jul 24 17 at 8:47

1 answer

Thomas fauskanger · Accepted Answer · 2017-07-24T10:56:37+0000

UPDATE: compute_loss was added in version 2.3.0 (July 25th). / UPDATE

The notebook mentioned in the question is in the develop branch. The master branch has a notebook that corresponds to the latest distribution.

The compute_loss parameter was added in this commit on June 19th. The latest upload to PyPI (as of today) was on June 21st, just two days later, yet compute_loss is not included in that distribution. (The last commit in v2.2.0 is this one.)
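One way to confirm that an installed distribution lacks the parameter is to inspect the initializer's signature instead of relying on the online documentation. The sketch below uses only the standard library; accepts_keyword, OldModel, and NewModel are hypothetical names introduced for illustration, with the two classes standing in for an older and a newer Word2Vec initializer.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if `func` can be called with the keyword argument `name`."""
    params = inspect.signature(func).parameters
    if name in params:
        # Positional-only parameters cannot be passed by keyword.
        return params[name].kind in (
            inspect.Parameter.POSITIONAL_OR_KEYWORD,
            inspect.Parameter.KEYWORD_ONLY,
        )
    # A **kwargs catch-all accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-ins, NOT the real gensim classes.
class OldModel:
    def __init__(self, sentences=None, min_count=5, hs=0, sg=0, seed=1):
        pass


class NewModel:
    def __init__(self, sentences=None, min_count=5, hs=0, sg=0, seed=1,
                 compute_loss=False):
        pass


print(accepts_keyword(OldModel, "compute_loss"))  # False
print(accepts_keyword(NewModel, "compute_loss"))  # True
```

With gensim installed, the same check would presumably be accepts_keyword(gensim.models.Word2Vec, "compute_loss"), and printing gensim.__version__ and gensim.__file__ shows which installation Python actually imports.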

I guess the solution is either to wait for the next release of gensim or to download the code from the repository in the meantime.

However, installing from the repository can make it hard to get the fast version of gensim working, at least on Windows. See the related question about the warning "Slow version of gensim.models.doc2vec in use".

How to install gensim from GitHub is explained in the install documentation.
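Until a release that includes compute_loss is installed, a script could guard the keyword by version so that the same code runs on 2.2.0 and on later releases. This is a minimal sketch that assumes the 2.3.0 threshold from the update above and plain dotted version strings; version_tuple, supports_compute_loss, and filter_word2vec_kwargs are hypothetical helpers, not part of gensim.

```python
def version_tuple(version):
    """Convert a dotted version string such as '2.2.0' into (2, 2, 0)."""
    parts = []
    for piece in version.split("."):
        # Keep only the leading digits, so '0rc1' contributes 0.
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    # Pad short versions such as '2.3' to three components.
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def supports_compute_loss(version):
    """Assumption: compute_loss is available from version 2.3.0 onward."""
    return version_tuple(version) >= (2, 3, 0)


def filter_word2vec_kwargs(installed_version, **kwargs):
    """Drop compute_loss when the installed version predates it."""
    if not supports_compute_loss(installed_version):
        kwargs.pop("compute_loss", None)
    return kwargs


kwargs = filter_word2vec_kwargs("2.2.0", min_count=1, compute_loss=True,
                                hs=0, sg=1, seed=42)
print(sorted(kwargs))  # ['hs', 'min_count', 'seed', 'sg']
```

In practice the version string would come from gensim.__version__, and the filtered dictionary would be passed on as gensim.models.Word2Vec(sentences, **kwargs).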
