Spark monitoring with Ganglia

I am evaluating the Apache Spark framework, and I need to keep track of some aspects of my cluster, such as network and resource usage.

Ganglia looks like a good fit for what I need. Then I found out that Spark has Ganglia support.

The Spark monitoring page says, "To install the GangliaSink you will need to perform a custom build of Spark."

I found a directory in my Spark source tree, extras/spark-ganglia-lgpl, but I don't know how to install it.

How do I install Ganglia to monitor a Spark cluster, and how do I perform this custom build?

Thanks!



2 answers


Spark's Ganglia support is one of the Maven profiles of the Spark project, named spark-ganglia-lgpl. To activate it, add the -Pspark-ganglia-lgpl option to the mvn command when building the project. For example, a build against Apache Hadoop 2.4.x with Ganglia support looks like this:

mvn -Pspark-ganglia-lgpl -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package
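
Once the build finishes, you can sanity-check that the sink actually made it into the assembly jar. The jar path below is only an example; the exact name depends on your Spark, Scala, and Hadoop versions, so adjust it to whatever your build produced:

jar tf assembly/target/scala-2.10/spark-assembly-*-hadoop2.4.0.jar | grep GangliaSink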

For full build instructions, see the "Building Spark with Maven" documentation.
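
Note that building with the profile only compiles the GangliaSink class into Spark; you still have to enable the sink before any metrics are reported. Here is a minimal sketch of conf/metrics.properties, assuming a gmond receiver at gmond-host:8649 (both placeholders for your own Ganglia setup):

# Report metrics from all Spark instances (master, worker, driver, executor) to Ganglia.
*.sink.ganglia.class=org.apache.spark.metrics.sink.GangliaSink
*.sink.ganglia.host=gmond-host
*.sink.ganglia.port=8649
*.sink.ganglia.period=10
*.sink.ganglia.unit=seconds
*.sink.ganglia.mode=multicast

Copy this file to the conf directory on every node and restart the Spark daemons so all instances pick it up.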



If you are using the HDP stack, I would recommend updating to the latest version. It includes Spark, as well as the Spark client libraries to be deployed to the machines. It also integrates with Ambari Metrics, which replaces Ganglia and Nagios.









