Python NLTK Maximum Entropy classifier error

I am currently using the NLTK Naive Bayes classifier, but I also wanted to try the MaxEnt classifier. According to the documentation, it takes feature sets in the same format as Naive Bayes, but for some reason I get this error when I try:

  File "/usr/lib/python2.7/site-packages/nltk/classify/maxent.py", line 323, in train
    gaussian_prior_sigma, **cutoffs)
  File "/usr/lib/python2.7/site-packages/nltk/classify/maxent.py", line 1453, in train_maxent_classifier_with_scipy
    model.fit(algorithm=algorithm)
  File "/usr/lib64/python2.7/site-packages/scipy/maxentropy/maxentropy.py", line 1026, in fit
    return model.fit(self, self.K, algorithm)
  File "/usr/lib64/python2.7/site-packages/scipy/maxentropy/maxentropy.py", line 226, in fit
    callback=callback)
  File "/usr/lib64/python2.7/site-packages/scipy/optimize/optimize.py", line 636, in fmin_cg
    gfk = myfprime(x0)
  File "/usr/lib64/python2.7/site-packages/scipy/optimize/optimize.py", line 176, in function_wrapper
    return function(x, *args)
  File "/usr/lib64/python2.7/site-packages/scipy/maxentropy/maxentropy.py", line 420, in grad
    G = self.expectations() - self.K
ValueError: shape mismatch: objects cannot be broadcast to a single shape

      

I'm not sure what this means, but I am using exactly the same input as when I run Naive Bayes, and that works. (The training data is a list of pairs, where the first member of each pair is the feature set and the second is the label.) Any ideas?
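For readers unfamiliar with the format described above, here is a minimal sketch of such a training set, together with a small sanity check. The helper name check_trainset and the toy feature names are hypothetical, not part of NLTK, and passing this check does not by itself rule out the shape mismatch error in the traceback.

```python
def check_trainset(trainset):
    """Return a list of problems found in an NLTK-style training set.

    Hypothetical helper: it only checks that every item is a
    (featureset, label) pair whose featureset is a dict.
    """
    problems = []
    for i, item in enumerate(trainset):
        if not (isinstance(item, (tuple, list)) and len(item) == 2):
            problems.append(f"item {i}: expected a (featureset, label) pair")
            continue
        featureset, _label = item
        if not isinstance(featureset, dict):
            problems.append(f"item {i}: featureset should be a dict")
    return problems


# Toy data in the shape the question describes: a list of (featureset, label) pairs.
trainset = [
    ({"contains_free": True, "length": "short"}, "spam"),
    ({"contains_free": False, "length": "long"}, "ham"),
]

print(check_trainset(trainset))
```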

Thanks!

+3




3 answers


I also ran into this problem with NLTK. While I was not able to resolve it satisfactorily (i.e. get MaxEnt working with scipy), I was able to train the maximum entropy classifier in NLTK when I used a different algorithm. Try training with

me_classifier = nltk.MaxentClassifier.train(trainset,algorithm="iis")

      



or one of the other acceptable values for the algorithm, such as "gis" or "megam".
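As a self-contained illustration of the workaround above, the following sketch trains a classifier on toy data with a non-scipy algorithm. The function name train_toy_maxent and the toy features are made up for this example; trace=0 and max_iter are keyword arguments that, as far as I know, MaxentClassifier.train accepts, and the small iteration count is only there to keep the run short.

```python
def train_toy_maxent(algorithm="iis", max_iter=5):
    """Train an NLTK MaxentClassifier on a tiny made-up data set.

    Hypothetical helper for illustration; NLTK is imported lazily so that
    merely defining the function does not require NLTK to be installed.
    """
    import nltk

    # Same format as for NaiveBayesClassifier: (featureset, label) pairs.
    trainset = [
        ({"contains_free": True, "length": "short"}, "spam"),
        ({"contains_free": True, "length": "long"}, "spam"),
        ({"contains_free": False, "length": "short"}, "ham"),
        ({"contains_free": False, "length": "long"}, "ham"),
    ]
    # trace=0 silences the per-iteration log; max_iter caps the iterations.
    return nltk.MaxentClassifier.train(
        trainset, algorithm=algorithm, trace=0, max_iter=max_iter
    )
```

Calling train_toy_maxent("iis").classify({"contains_free": True, "length": "short"}) should return one of the two labels; swapping in "gis" exercises the other built-in algorithm.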

+3




This issue also depends on which version of scipy you are using.

NLTK uses scipy.maxentropy, which was deprecated in scipy 0.10 and removed in 0.11. See its documentation: http://docs.scipy.org/doc/scipy-0.10.0/reference/maxentropy.html#
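To find out which situation applies on a given machine, a quick check like the one below can tell whether scipy.maxentropy is importable at all. This is only a sketch using the standard library; the helper name module_available is hypothetical. Note that the traceback in the question comes from an installation where the module is present, so its presence alone does not guarantee that training will succeed.

```python
import importlib.util


def module_available(name):
    """Return True if the dotted module name can be located, else False."""
    try:
        return importlib.util.find_spec(name) is not None
    except ImportError:
        # Raised when a parent package (here: scipy itself) is missing.
        return False


HAS_SCIPY_MAXENT = module_available("scipy.maxentropy")
print("scipy.maxentropy available:", HAS_SCIPY_MAXENT)
```

If the check prints False, passing an explicit non-scipy algorithm such as "iis" or "gis" to the trainer is the safer choice.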



I opened an issue on GitHub: https://github.com/nltk/nltk/issues/307

+1




You have to install NLTK first; then you can classify. Use the code below to classify with maximum entropy in Python:

me_classifier = nltk.MaxentClassifier.train(trainset,algorithm="gis")
print(me_classifier.classify(testing))

      

0








