Accuracy drops when classifying with Naive Bayes in MLlib

Question

I used the Naive Bayes algorithm from Mahout 0.9 to classify document data. For a specific split into training data (2/3 of the data) and test data (1/3 of the data), I got an accuracy of about 86%. When I switched to Spark MLlib, the accuracy dropped to 82%. In both cases, a standard analyzer is used.
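To make the comparison concrete, the evaluation described above amounts to roughly the following. This is only a minimal pure-Python sketch, not the Mahout or MLlib code: the function names, the toy documents, and the `smoothing` parameter are hypothetical stand-ins, and it assumes a plain multinomial Naive Bayes with additive smoothing on raw term counts.

```python
import math
from collections import Counter, defaultdict


def split_train_test(docs, train_fraction=2 / 3):
    """Deterministic split: the first train_fraction of docs is the training set."""
    cut = int(round(len(docs) * train_fraction))
    return docs[:cut], docs[cut:]


def train_naive_bayes(train, smoothing=1.0):
    """Fit a multinomial Naive Bayes model; smoothing must be > 0."""
    label_counts = Counter()
    term_counts = defaultdict(Counter)
    vocab = set()
    for label, tokens in train:
        label_counts[label] += 1
        term_counts[label].update(tokens)
        vocab.update(tokens)
    total_docs = sum(label_counts.values())
    model = {}
    for label, n_docs in label_counts.items():
        # Additive smoothing: every vocabulary term gets `smoothing` extra counts.
        denom = sum(term_counts[label].values()) + smoothing * len(vocab)
        log_prior = math.log(n_docs / total_docs)
        log_likelihood = {
            t: math.log((term_counts[label][t] + smoothing) / denom) for t in vocab
        }
        model[label] = (log_prior, log_likelihood)
    return model


def predict(model, tokens):
    """Return the label with the highest log-posterior; unseen tokens are ignored."""
    def score(label):
        log_prior, log_likelihood = model[label]
        return log_prior + sum(log_likelihood[t] for t in tokens if t in log_likelihood)
    return max(sorted(model), key=score)


def accuracy(model, test):
    """Fraction of test documents whose predicted label matches the true label."""
    correct = sum(1 for label, tokens in test if predict(model, tokens) == label)
    return correct / len(test)


# Hypothetical toy corpus of (label, tokens) pairs, already tokenized.
docs = [
    ("sports", ["match", "goal", "team"]),
    ("tech", ["code", "compiler", "bug"]),
    ("sports", ["team", "win", "goal"]),
    ("tech", ["bug", "fix", "code"]),
    ("sports", ["goal", "team"]),
    ("tech", ["code", "bug"]),
]

train, test = split_train_test(docs)
model = train_naive_bayes(train, smoothing=1.0)
print("accuracy:", accuracy(model, test))
```

The point of the sketch is that the split, the tokenization, the feature weighting, and the smoothing value all feed into the final number, so each of them has to match between the two libraries before the 86% and 82% figures are directly comparable.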

MLlib link: https://spark.apache.org/docs/latest/mllib-naive-bayes.html
Mahout link: http://mahout.apache.org/users/classification/bayesian.html

Please help me with this, as I have to use Spark on a production system very soon and this is blocking me.

I also found that MLlib takes longer to classify the data than Mahout.

Can anyone help me improve the accuracy with MLlib's Naive Bayes?

machine-learning hadoop mahout bigdata apache-spark-mllib

Tinku 09 Sep 14 at 12:42
