Spark-sql: Unable to instantiate org.apache.hadoop

2019-07-16 14:36发布

After quite some time, can't figure out how to identify root cause of below error when running spark-sql binary:

15/12/08 14:48:41 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:346)
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:109)
    at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:665)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:170)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:193)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:112)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1412)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:62)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:72)
    at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:2453)
    at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:2465)
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:340)
    ... 11 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1410)
    ... 16 more
Caused by: javax.jdo.JDOFatalUserException: Class org.datanucleus.api.jdo.JDOPersistenceManagerFactory was not found.
NestedThrowables:
java.lang.ClassNotFoundException: org.datanucleus.api.jdo.JDOPersistenceManagerFactory
    at javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1175)
    at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:808)
    at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:701)
    at org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:310)
    at org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:339)
    at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:248)
    at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:223)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
    at org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:58)
    at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:67)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStore(HiveMetaStore.java:497)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:475)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:523)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:397)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.<init>(HiveMetaStore.java:356)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:54)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:59)
    at org.apache.hadoop.hive.metastore.HiveMetaStore.newHMSHandler(HiveMetaStore.java:4944)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:171)
    ... 21 more
Caused by: java.lang.ClassNotFoundException: org.datanucleus.api.jdo.JDOPersistenceManagerFactory
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:348)
    at javax.jdo.JDOHelper$18.run(JDOHelper.java:2018)
    at javax.jdo.JDOHelper$18.run(JDOHelper.java:2016)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.jdo.JDOHelper.forName(JDOHelper.java:2015)
    at javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1162)
    ... 40 more

2条回答
啃猪蹄的小仙女
2楼-- · 2019-07-16 14:42

For spark versions > 2.0, you can approach below solutions:

Make sure the $HIVE_HOME/conf/hive-site.xml configuration which is pointing to complete path of metastore.

javax.jdo.option.ConnectionURL 

jdbc:derby:;databaseName=/home/hive/metastore_db;create=true 

JDBC connect string for a JDBC metastore

Place the hive-site.xml file in $SPARK_HOME/confto point SparkR to the same metastore as Hive Hope this solves your issue.

查看更多
你好瞎i
3楼-- · 2019-07-16 15:04

You are missing datanucleus-api-jdo.jar in your class path. This jar is used by hive. Try to figure out which version of this jar you need (probably latest will do). You can download it then from here: http://repo1.maven.org/maven2/org/datanucleus/datanucleus-api-jdo/

And add it to hive permanently by adding this to your .hiverc file:

  add jar datanucleus-api-jdo.jar
查看更多
登录 后发表回答