Question:
I am running Spark on Windows 7. When I use Hive, I see the following error:
The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rw-rw-rw-
The permissions are set as follows:
C:\tmp>ls -la
total 20
drwxr-xr-x 1 ADMIN Administ 0 Dec 10 13:06 .
drwxr-xr-x 1 ADMIN Administ 28672 Dec 10 09:53 ..
drwxr-xr-x 2 ADMIN Administ 0 Dec 10 12:22 hive
I have set "full control" to all users from Windows->properties->security->Advanced.
But I still see the same error. Any help please?
I have checked a bunch of links, some say this is a bug on Spark 1.5. Is this true?
Thanks
Aarthi
Answer 1:
First of all, make sure you are using the correct winutils for your OS. The next step is permissions.
On Windows, you need to run the following command in cmd:
D:\winutils\bin\winutils.exe chmod 777 D:\tmp\hive
Hopefully you have already downloaded winutils and set HADOOP_HOME.
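As a quick sanity check, a minimal sketch along these lines (the object name CheckWinutils is just illustrative; it only assumes that Hadoop looks for winutils.exe under %HADOOP_HOME%\bin or the hadoop.home.dir system property) can confirm the setup before Spark is started:

import java.nio.file.{Files, Paths}

object CheckWinutils {
  def main(args: Array[String]): Unit = {
    // Hadoop resolves winutils.exe from %HADOOP_HOME%\bin (or -Dhadoop.home.dir).
    val hadoopHome = sys.env.getOrElse("HADOOP_HOME", sys.props.getOrElse("hadoop.home.dir", ""))
    val winutils = Paths.get(hadoopHome, "bin", "winutils.exe")
    println(s"HADOOP_HOME  = $hadoopHome")
    println(s"winutils.exe = $winutils (exists: ${Files.exists(winutils)})")
  }
}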
Answer 2:
First things first, check your computer's domain. Try:
c:\work\hadoop-2.2\bin\winutils.exe ls c:/tmp/hive
If this command says access denied or "FindFileOwnerAndPermission error (1789): The trust relationship between this workstation and the primary domain failed",
it means your computer's domain controller is not reachable. A possible reason is that you are not on the same VPN as your domain controller. Connect to the VPN and try again.
Then try the solution provided by Viktor or Nishu.
Answer 3:
The following solution worked for me on Windows:
- First, I defined HADOOP_HOME. It is described in detail here.
- Next, I did the same as Nishu Tayal, but with one difference:
C:\temp\hadoop\bin\winutils.exe chmod 777 \tmp\hive
Note that \tmp\hive is not a local directory.
Answer 4:
You need to set this directory's permissions on HDFS, not on your local filesystem. /tmp does not mean C:\tmp unless you set fs.defaultFS in core-site.xml to file://c:/, which is probably a bad idea.
Check it using:
hdfs dfs -ls /tmp
Set it using:
hdfs dfs -chmod 777 /tmp/hive
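If it is not obvious which filesystem /tmp resolves to on your setup, a small sketch like this (using the Hadoop FileSystem API; the object name and printed labels are illustrative) prints the effective fs.defaultFS and the current permissions of /tmp/hive:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object CheckScratchDir {
  def main(args: Array[String]): Unit = {
    val conf = new Configuration()        // picks up core-site.xml from the classpath, if present
    val fs = FileSystem.get(conf)         // the filesystem behind fs.defaultFS
    println(s"fs.defaultFS = ${conf.get("fs.defaultFS")}")

    val scratch = new Path("/tmp/hive")
    if (fs.exists(scratch))
      println(s"/tmp/hive permissions: ${fs.getFileStatus(scratch).getPermission}")
    else
      println("/tmp/hive does not exist yet on this filesystem")
  }
}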
Answer 5:
Error while starting the spark-shell on a VM running on Windows:
Error msg: The root scratch dir: /tmp/hive on HDFS should be writable. Permission denied
Solution:
/tmp/hive is a temporary directory. Only temporary files are kept in this location. There is no problem even if we delete this directory; it will be recreated with the proper permissions when required.
Step 1) In HDFS, remove the /tmp/hive directory ==> hdfs dfs -rm -r /tmp/hive
Step 2) At the OS level too, delete the directory /tmp/hive ==> rm -rf /tmp/hive
After this, I started the spark-shell and it worked fine.
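If you prefer to do the same cleanup programmatically, a sketch like the one below (object name illustrative) removes the directory both on the default filesystem and locally:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object CleanScratchDir {
  def main(args: Array[String]): Unit = {
    // Equivalent of "hdfs dfs -rm -r /tmp/hive" on whatever fs.defaultFS points to.
    val fs = FileSystem.get(new Configuration())
    fs.delete(new Path("/tmp/hive"), true)      // recursive delete; returns false if it did not exist

    // Equivalent of "rm -rf /tmp/hive" on the local filesystem.
    val local = FileSystem.getLocal(new Configuration())
    local.delete(new Path("/tmp/hive"), true)
  }
}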
Answer 6:
There is a Spark JIRA bug for this issue. It was resolved a few days ago. Here is the link:
https://issues.apache.org/jira/browse/SPARK-10528
The comments cover all the options, but there is no guaranteed solution.
Answer 7:
The issue is resolved in Spark version 2.0.2 (Nov 14, 2016). Use this version.
The 2.1.0 release (Dec 28, 2016) has the same issues.
Answer 8:
The main reason is that you started Spark in the wrong directory. Please create the folder D:\tmp\hive (give it full permissions) and start Spark from the D: drive:
D:\> spark-shell
Now it will work. :)
Answer 9:
This is a simple 4-step process (a combined sketch follows the steps):
For Spark 2.0+:
- Download Hadoop for Windows / winutils.
- Add this to your code (before SparkSession initialization):
if (getOS() == "windows") {
    System.setProperty("hadoop.home.dir", "C:/Users//winutils-master/hadoop-2.7.1");
}
- Add this to your SparkSession builder (you can change it to C:/Temp instead of Desktop):
.config("hive.exec.scratchdir", "C:/Users//Desktop/tmphive")
- Open cmd.exe and run:
"path\to\hadoop-2.7.1\bin\winutils.exe" chmod 777 C:\Users\\Desktop\tmphive
Answer 10:
I also faced this issue. It is related to the network. I installed Spark on Windows 7 joined to a particular domain.
The domain name can be checked at:
Start -> Computer -> Right click -> Properties -> Computer name, domain and workgroup settings -> click on Change -> Computer Name (tab) -> click on Change -> Domain name.
When I run the spark-shell command on that domain's network, it works fine, without any error.
On other networks I received the write permission error.
To avoid this error, run the spark command while on the domain specified in the path above.
Answer 11:
Please try giving 777 permission to the folder /tmp/hive, because I think Spark runs as an anonymous user (which falls into the "other" user category), and the permission should be recursive.
I had this same issue with version 1.5.1 of Spark for Hive, and it worked after giving 777 permission with the command below on Linux:
chmod -R 777 /tmp/hive
Answer 12:
Use the latest version of winutils.exe and try again: https://github.com/steveloughran/winutils/blob/master/hadoop-2.7.1/bin/winutils.exe
Answer 13:
Using the correct version of winutils.exe did the trick for me. The winutils should be from the version of Hadoop that Spark has been pre-built for.
Set the HADOOP_HOME environment variable so that winutils.exe sits under %HADOOP_HOME%\bin. I stored winutils.exe alongside the C:\Spark\bin files, so now my SPARK_HOME and HADOOP_HOME point to the same location, C:\Spark.
Now that winutils has been added to the path, give permissions to the hive folder using: winutils.exe chmod 777 C:\tmp\hive
Answer 14:
I just solved this issue in my Windows 7 environment. I had modified the DNS setting with an incorrect IP, which made my desktop fail to connect to the domain controller. Once I set the correct DNS IP and restarted the machine, the issue was gone, and I could use winutils to ls a directory.