I have set up a Hive environment with Kerberos security enabled on a Linux server (Red Hat), and I need to connect to Hive from a remote Windows machine using JDBC.
So I have HiveServer2 running on the Linux machine, and I have done "kinit".
Now I try to connect from the Windows side with a Java test program like this:
Class.forName("org.apache.hive.jdbc.HiveDriver");
String url = "jdbc:hive2://<host>:10000/default;principal=hive/_HOST@<YOUR-REALM.COM>";
Connection con = DriverManager.getConnection(url);
And I got the following error:
Exception due to: Could not open client transport with JDBC Uri:
jdbc:hive2://<host>:10000/;principal=hive/_HOST@<YOUR-REALM.COM>:
GSS initiate failed
What am I doing wrong here? I checked many forums but couldn't find a proper solution. Any answer will be appreciated.
Thanks
If you were running your code on Linux, I would simply point to that post -- i.e. you must use System properties to define the Kerberos and JAAS configuration, from conf files with specific formats.
And you have to switch on the debug trace flags to understand subtle configuration issues (i.e. different flavors/versions of JVMs may have different syntax requirements, which are not documented; it's a trial-and-error process).
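As a rough illustration of the System properties mentioned above, here is a minimal sketch. The file paths and the class name KerberosClientSetup are hypothetical, the choice of javax.security.auth.useSubjectCredsOnly=false is my assumption (it is commonly needed so that GSS may look up credentials outside the current JAAS Subject), and the debug flag shown is the one understood by Oracle/OpenJDK JVMs; other JVM flavors may need different names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class KerberosClientSetup {

    /**
     * Sets JVM-wide properties. Must run before the first Kerberos or JDBC call,
     * because the JVM reads these settings lazily and then caches them.
     */
    static Map<String, String> configure(String krb5ConfPath, String jaasConfPath, boolean debug) {
        Map<String, String> props = new LinkedHashMap<>();
        // Kerberos realm/KDC definitions (krb5.conf on Linux, often krb5.ini on Windows)
        props.put("java.security.krb5.conf", krb5ConfPath);
        // JAAS login configuration file
        props.put("java.security.auth.login.config", jaasConfPath);
        // Assumption: let GSS obtain credentials outside the current Subject
        props.put("javax.security.auth.useSubjectCredsOnly", "false");
        // Kerberos debug trace on Oracle/OpenJDK JVMs
        props.put("sun.security.krb5.debug", String.valueOf(debug));
        props.forEach(System::setProperty);
        return props;
    }

    public static void main(String[] args) {
        // Hypothetical paths; replace with your own conf files
        configure("C:\\kerberos\\krb5.ini", "C:\\kerberos\\jaas.conf", true)
            .forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

The same properties can be passed as -D options on the java command line, which avoids any ordering problems inside the program.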
But on Windows there are additional problems:
- the Apache Hive JDBC driver has some dependencies on Hadoop JARs, especially when Kerberos is involved (see that post for details)
- these Hadoop JARs require "native libraries" -- i.e. a Windows port of Hadoop (which you have to compile yourself!! or download from an insecure source on the web!!) -- plus System properties hadoop.home.dir and java.library.path pointing to the Hadoop home dir and its bin sub-dir respectively
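To make the last point concrete, here is a small sanity check one could run on the Windows client before trying to connect. It is only a sketch: the class name HadoopWindowsCheck and its messages are made up for illustration, and it only verifies that the two properties are consistent with each other, not that the native libraries actually load.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class HadoopWindowsCheck {

    /** Returns a list of human-readable problems; empty if the two settings agree. */
    static List<String> findProblems(String hadoopHome, String libraryPath) {
        List<String> problems = new ArrayList<>();
        if (hadoopHome == null || hadoopHome.isEmpty()) {
            problems.add("hadoop.home.dir is not set");
            return problems;
        }
        // java.library.path is expected to contain <hadoop.home.dir>/bin
        String expectedBin = new File(hadoopHome, "bin").getPath();
        boolean binOnPath = libraryPath != null
            && Arrays.asList(libraryPath.split(File.pathSeparator)).contains(expectedBin);
        if (!binOnPath) {
            problems.add("java.library.path does not contain " + expectedBin);
        }
        return problems;
    }

    public static void main(String[] args) {
        List<String> problems = findProblems(
            System.getProperty("hadoop.home.dir"),
            System.getProperty("java.library.path"));
        System.out.println(problems.isEmpty() ? "Hadoop settings look consistent" : problems);
    }
}
```

Note that the JVM reads java.library.path at startup, so it is normally passed as a -D option on the command line rather than set from inside the program.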
On top of that, the Apache Hive driver has compatibility issues -- whenever there are changes in the wire protocol, newer clients cannot connect to older servers.
So I strongly advise you to use the Cloudera JDBC driver for Hive for your Windows clients. The Cloudera site just asks for your e-mail.
After that you have an 80+ page PDF manual to read, the JARs to add to your CLASSPATH, and your JDBC URL to adapt according to the manual.
Side note: the Cloudera driver is a proper JDBC-4.x compliant driver, so there is no need for that legacy Class.forName().
...
The key for us, when we ran into the problem, was as follows:
On your server, certain Kerberos principals are listed as allowed to operate on the data.
When we tried to run a query via JDBC, we hadn't done the proper kinit on the client side.
In this case the solution is obvious:
On the Windows client: do a kinit with the proper account before connecting.
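If you want to verify from Java that the kinit actually left a usable ticket, something like the following sketch may help. It uses the JDK's own Krb5LoginModule with the ticket cache only and no password prompt. The class names and the "ticket-check" entry name are invented for this example, and whether the JVM finds the cache depends on which kinit you used (the JDK's or MIT's) and where it stores the ticket, so treat a negative result as a hint rather than proof.

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.Configuration;
import javax.security.auth.login.LoginContext;
import javax.security.auth.login.LoginException;

final class TicketCacheCheck {

    /** In-memory JAAS configuration: Kerberos login from the ticket cache, never prompting. */
    static final class TicketCacheOnly extends Configuration {
        @Override
        public AppConfigurationEntry[] getAppConfigurationEntry(String name) {
            Map<String, String> options = new HashMap<>();
            options.put("useTicketCache", "true");
            options.put("doNotPrompt", "true");
            return new AppConfigurationEntry[] {
                new AppConfigurationEntry(
                    "com.sun.security.auth.module.Krb5LoginModule",
                    AppConfigurationEntry.LoginModuleControlFlag.REQUIRED,
                    options)
            };
        }
    }

    /** Returns true if a JAAS login succeeds using only the existing ticket cache. */
    static boolean hasUsableTicket() {
        try {
            LoginContext ctx = new LoginContext("ticket-check", null, null, new TicketCacheOnly());
            ctx.login();
            ctx.logout();
            return true;
        } catch (LoginException | SecurityException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasUsableTicket()
            ? "Kerberos ticket found in cache"
            : "No usable ticket found; run kinit first");
    }
}
```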