I'm testing Apache Spark and need to monitor some aspects of my cluster, such as network and resource usage.
Ganglia looks like a good option for what I need, and I found out that Spark has support for Ganglia.
On the Spark monitoring webpage there is this information: "To install the GangliaSink you’ll need to perform a custom build of Spark."
I found the directory "/extras/spark-ganglia-lgpl" in my Spark installation, but I don't know how to install it.
How can I install Ganglia to monitor a Spark cluster? How do I do this custom build?
Thanks!
Ganglia support in Spark is provided by one of the Spark project's Maven profiles, "spark-ganglia-lgpl". To activate the profile, pass the "-Pspark-ganglia-lgpl" option to the mvn command when you build the project. For example, a build for Apache Hadoop 2.4.X with Ganglia support adds this option to the usual Maven build command.
For details on building the Spark project, refer to the "Building Spark with Maven" documentation.
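As a rough sketch of such a build, the command below combines the "-Pspark-ganglia-lgpl" profile with the YARN and Hadoop 2.4 profiles. The exact Hadoop profile name and the hadoop.version value (2.4.0 here) are assumptions for illustration and depend on your Spark release and cluster, so check them against the build documentation for your version.

```shell
# Run from the root of the Spark source tree.
# -Pspark-ganglia-lgpl enables the Ganglia sink module (from the answer above).
# -Pyarn, -Phadoop-2.4 and -Dhadoop.version=2.4.0 are assumed example values;
# adjust them to match your Spark release and Hadoop installation.
mvn -Pspark-ganglia-lgpl -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -DskipTests clean package
```

The resulting build should include the GangliaSink class, which you can then reference from Spark's metrics configuration.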
If you're running the HDP stack, I would recommend updating to the latest version. It includes the Spark job tracker as well as the Spark client libraries to be deployed on machines. It also now integrates with Ambari Metrics, which is set to replace Ganglia and Nagios.