I am trying to install Spark 1.6.1 on Windows 10, and so far I have done the following...
- Downloaded Spark 1.6.1, unpacked it to a directory, and then set SPARK_HOME
- Downloaded Scala 2.11.8, unpacked it to a directory, and then set SCALA_HOME
- Set the _JAVA_OPTION env variable
- Downloaded winutils from https://github.com/steveloughran/winutils.git by downloading the repository as a zip archive, and then set the HADOOP_HOME env variable. (Not sure if this was correct; I could not clone the repository because of a permission-denied error.) The commands I used are sketched below.
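For reference, this is roughly how I set the environment variables from a Command Prompt; the install paths and the heap value are illustrative placeholders, not necessarily the exact ones I used:

    REM Paths below are placeholders - substitute the actual unpack locations.
    setx SPARK_HOME "C:\spark\spark-1.6.1-bin-hadoop2.6"
    setx SCALA_HOME "C:\scala\scala-2.11.8"
    REM _JAVA_OPTION as named above; the heap size is an illustrative value.
    setx _JAVA_OPTION "-Xmx512M"
    REM setx values only apply to consoles opened afterwards.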
When I go to the Spark home directory and run bin\spark-shell, I get
'C:\Program' is not recognized as an internal or external command, operable program or batch file.
I must be missing something. I don't see how I could run the bash scripts from a Windows environment anyway, but hopefully I don't need to understand that just to get this working. I have been following this tutorial: https://hernandezpaul.wordpress.com/2016/01/24/apache-spark-installation-on-windows-10/. Any help would be appreciated.
You need to download the winutils executable, not the source code.
You can download it here, or if you really want the entire Hadoop distribution you can find the 2.6.0 binaries here. Then you need to set HADOOP_HOME to the directory containing winutils.exe. Also, make sure the path of the directory you place Spark in contains no whitespace; this is extremely important, otherwise it won't work.
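As a quick sanity check, assuming an illustrative layout where Hadoop lives under C:\hadoop and Spark under C:\spark (both paths are assumptions, not requirements), HADOOP_HOME must point at the folder that contains bin\winutils.exe:

    REM Illustrative layout:
    REM   C:\hadoop\bin\winutils.exe   <- so HADOOP_HOME should be C:\hadoop
    REM   C:\spark\...                 <- no spaces anywhere in this path
    setx HADOOP_HOME "C:\hadoop"
    REM Verify from a newly opened console:
    echo %HADOOP_HOME%
    dir %HADOOP_HOME%\bin\winutils.exe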
Once you've set it up, you don't start spark-shell.sh, you start spark-shell.cmd.
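A minimal sketch, assuming Spark was unpacked to C:\spark\spark-1.6.1-bin-hadoop2.6 (the path is a placeholder):

    REM Run from a fresh console so the new environment variables are picked up.
    cd /d C:\spark\spark-1.6.1-bin-hadoop2.6
    bin\spark-shell.cmd

If everything is set up correctly, the shell should start and leave you at a scala> prompt.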