I have set up a YARN cluster using Ambari, with 3 VMs as hosts.
Where can I find the value for HADOOP_CONF_DIR?
# Run on a YARN cluster (--master can also be `yarn-client` for client mode)
export HADOOP_CONF_DIR=XXX
./bin/spark-submit \
--class org.apache.spark.examples.SparkPi \
--master yarn-cluster \
--executor-memory 20G \
--num-executors 50 \
/path/to/examples.jar \
1000
Install Hadoop as well. In my case, I installed it in /usr/local/hadoop.
Set up the Hadoop environment variables.
Then set the conf directory.
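The steps above could look roughly like the following. This is only a sketch, assuming the stock Hadoop 2.x+ tarball layout, where the client configuration files (core-site.xml, yarn-site.xml, and so on) sit under etc/hadoop inside the install directory. The /usr/local/hadoop path comes from this answer; the YARN_CONF_DIR line and the sanity check are additions for illustration.

```shell
# Assumption: Hadoop is installed under /usr/local/hadoop, as in this answer.
export HADOOP_HOME=/usr/local/hadoop
export PATH="$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin"

# Assumption: tarball layout, with the config files in etc/hadoop.
export HADOOP_CONF_DIR="$HADOOP_HOME/etc/hadoop"
export YARN_CONF_DIR="$HADOOP_CONF_DIR"

# Sanity check: spark-submit needs yarn-site.xml to find the ResourceManager.
if [ -f "$HADOOP_CONF_DIR/yarn-site.xml" ]; then
  echo "Using Hadoop config in $HADOOP_CONF_DIR"
else
  echo "No yarn-site.xml in $HADOOP_CONF_DIR; adjust the path" >&2
fi
```

On a cluster provisioned by Ambari, the client configs are often deployed to /etc/hadoop/conf on each host instead, so if the check fails, that directory is worth trying as HADOOP_CONF_DIR.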
From /etc/spark/conf/spark-env.sh: