Unable to find hadoop Configuration classes when s

Posted 2019-09-15 01:10

Question:

I'm testing Hadoop with the latest version of Sqoop2 (1.99.7), and when I run sqoop2-server, I get the following error:

Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/conf/Configuration
        at org.apache.sqoop.security.authentication.SimpleAuthenticationHandler.secureLogin(SimpleAuthenticationHandler.java:36)
        at org.apache.sqoop.security.AuthenticationManager.initialize(AuthenticationManager.java:98)
        at org.apache.sqoop.core.SqoopServer.initialize(SqoopServer.java:57)
        at org.apache.sqoop.server.SqoopJettyServer.<init>(SqoopJettyServer.java:67)
        at org.apache.sqoop.server.SqoopJettyServer.main(SqoopJettyServer.java:177)

Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.conf.Configuration
        at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
        ... 5 more

My current setup:

  • Using the latest https://hub.docker.com/r/sequenceiq/hadoop-docker/ image for Hadoop
  • Downloaded the latest Sqoop2 binaries (http://sqoop.apache.org/) and installed them into a running container at /usr/lib/sqoop/
  • Ran /usr/lib/sqoop/bin/sqoop2-server start

Best I can figure, the Hadoop classpath isn't being loaded by Sqoop, as the required JARs appear to be located at /usr/local/hadoop/shared/*.
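One way to check that guess is to look for the JAR that actually contains the missing class. This is a minimal sketch, assuming find, unzip, and grep are available in the container; find_class_jar is a hypothetical helper name, not part of Sqoop or Hadoop:

```shell
# Hypothetical helper: print every JAR under a directory that contains
# the given class file entry (path form, e.g. org/foo/Bar.class).
find_class_jar() {
  search_dir="$1"
  class_entry="$2"
  find "$search_dir" -type f -name '*.jar' 2>/dev/null | while read -r jar; do
    # List the archive contents and look for the class entry as a fixed string.
    if unzip -l "$jar" 2>/dev/null | grep -qF "$class_entry"; then
      echo "$jar"
    fi
  done
}
```

For example, running find_class_jar /usr/local/hadoop org/apache/hadoop/conf/Configuration.class should print the path of any JAR under the Hadoop install that provides the class. If it prints nothing, the JARs are somewhere else.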

Most of the documentation I've been able to find online is for pre-1.99.7, but one major change in this version is that the Sqoop server moved from Tomcat to Jetty, so all of the catalina configuration options are moot.

Can someone help me figure out how to get Sqoop server to run?

Answer 1:

Ah, figured it out for the most part.

It looks like the sqoop.sh script builds the classpath from the JARs in the directories named by a set of environment variables. In the Docker container, every variable the script looks for is set to the root of the Hadoop installation, whereas the script expects each one to point to the directory that actually contains the JAR files.

HADOOP_PREFIX=/usr/local/hadoop
HADOOP_HDFS_HOME=/usr/local/hadoop
HADOOP_COMMON_HOME=/usr/local/hadoop
HADOOP_YARN_HOME=/usr/local/hadoop
HADOOP_CONF_DIR=/usr/local/hadoop/etc/hadoop
HADOOP_MAPRED_HOME=/usr/local/hadoop

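So the script uses the installation root from these variables instead of falling back to $HADOOP_HOME and the subdirectories of it that the script also names. Pointing the variables at the directories that actually contain the JARs fixes the classpath.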
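For anyone hitting the same thing, overriding the variables before starting the server can be sketched as below. This assumes the stock Hadoop 2.x layout, with the JARs under share/hadoop/<component> inside the install root. That layout is an assumption, not something shown in the output above, so check the directories in your own image first:

```shell
# Assumption: Hadoop 2.x layout, JARs live under $HADOOP_HOME/share/hadoop/<component>.
HADOOP_HOME="${HADOOP_HOME:-/usr/local/hadoop}"
export HADOOP_HOME

# Point each variable at the directory holding that component's JARs
# rather than at the installation root.
export HADOOP_COMMON_HOME="$HADOOP_HOME/share/hadoop/common"
export HADOOP_HDFS_HOME="$HADOOP_HOME/share/hadoop/hdfs"
export HADOOP_MAPRED_HOME="$HADOOP_HOME/share/hadoop/mapreduce"
export HADOOP_YARN_HOME="$HADOOP_HOME/share/hadoop/yarn"
```

With these exported in the same shell, /usr/lib/sqoop/bin/sqoop2-server start should pick up the Hadoop JARs.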

The final step then was to edit the sqoop.properties file and ensure the mapreduce config was set to the correct directory:

org.apache.sqoop.submission.engine.mapreduce.configuration.directory=/usr/local/hadoop/etc/hadoop
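If you are building this into an image, the same edit can be scripted. A rough sketch: set_mapreduce_conf_dir is a hypothetical helper name, and the location of sqoop.properties (for example under conf/ in the Sqoop install) is an assumption to verify for your setup:

```shell
# Hypothetical helper: set the MapReduce configuration directory in a
# sqoop.properties file, replacing any existing uncommented setting.
set_mapreduce_conf_dir() {
  props_file="$1"
  conf_dir="$2"
  key="org.apache.sqoop.submission.engine.mapreduce.configuration.directory"
  tmp_file="$(mktemp)"
  # Keep every line except an existing assignment of the key.
  # grep exits non-zero when it outputs nothing, so ignore its status.
  grep -v "^${key}=" "$props_file" > "$tmp_file" || true
  # Append the new assignment.
  printf '%s=%s\n' "$key" "$conf_dir" >> "$tmp_file"
  # Write back in place so the original file's permissions are kept.
  cat "$tmp_file" > "$props_file"
  rm -f "$tmp_file"
}
```

For example: set_mapreduce_conf_dir /usr/lib/sqoop/conf/sqoop.properties /usr/local/hadoop/etc/hadoop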

Then the server started!

I'll leave this here in case anyone else runs across this...