Monitoring Spark with Ganglia

I am evaluating the Apache Spark framework and need to monitor aspects of my cluster such as network usage and resources.

Ganglia looks like a good fit for what I need, and I found out that Spark supports Ganglia.

The Spark monitoring documentation says: "To install the GangliaSink you'll need to perform a custom build of Spark."

I found a directory in my Spark distribution, "extras/spark-ganglia-lgpl", but I don't know how to install it.

How do I install Ganglia to monitor a Spark cluster, and how do I perform this custom build?




2 answers

Spark's Ganglia support is one of the Maven profiles of the Spark project, named "spark-ganglia-lgpl". To activate the profile, add the -Pspark-ganglia-lgpl option to the mvn command when building the project. For example, building Spark against Apache Hadoop 2.4.x with Ganglia support is done with:

mvn -Pspark-ganglia-lgpl -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package


For details on building the Spark project, see the "Building Spark with Maven" documentation.
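Once Spark is built with that profile, the GangliaSink still has to be enabled in Spark's metrics configuration (conf/metrics.properties). A minimal sketch is below; the host and port are placeholders that you would replace with your own gmond endpoint, and the period/mode values are just illustrative defaults:

```properties
# conf/metrics.properties (copied from metrics.properties.template)
# Report metrics from all Spark instances to a Ganglia gmond.
*.sink.ganglia.class=org.apache.spark.metrics.sink.GangliaSink
*.sink.ganglia.host=gmond.example.com   # placeholder: your gmond host
*.sink.ganglia.port=8649                # default gmond port
*.sink.ganglia.period=10                # poll period
*.sink.ganglia.unit=seconds
*.sink.ganglia.mode=multicast           # or unicast, matching your Ganglia setup
```

With this file in place on the driver and executors, Spark metrics should start appearing in the Ganglia web UI alongside the host-level metrics Ganglia already collects.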



If you are using the HDP stack, I would recommend updating to the latest version. It includes Spark monitoring as well as Spark client libraries to be deployed to the machines. It also integrates with Ambari Metrics, which replaces Ganglia and Nagios.


