Sqoop export error - cause:org.apache.hadoop.mapre

Posted 2019-09-08 17:58

I am developing a Java program that exports data from Hive to MySQL.

My first version runs Sqoop as an external process:

ProcessBuilder pb = new ProcessBuilder("sqoop-export", "export", 
         "--connect",               "jdbc:mysql://localhost/mydb", 
         "--hadoop-home",    "/home/yoonhok/development/hadoop-1.1.1", 
         "--table",                    "mytable", 
         "--export-dir",            "/user/hive/warehouse/tbl_2", 
         "--username",            "yoonhok", 
         "--password",            "1234");

try {
    Process p = pb.start();
    if (p.waitFor() != 0) {
        System.out.println("Error: sqoop-export failed.");
        return false;
    }
} catch (IOException e) {
    e.printStackTrace();
} catch (InterruptedException e) {
    e.printStackTrace();
}

It works perfectly.

Then I learned another way to use Sqoop from Java. Sqoop does not offer a client API yet, so I added the Sqoop library to my project and called Sqoop.runTool() directly.

My second version looks like this:

String[] str = {"export", 
     "--connect",               "jdbc:mysql://localhost/mydb", 
     "--hadoop-home",    "/home/yoonhok/development/hadoop-1.1.1", 
     "--table",                    "mytable", 
     "--export-dir",            "/user/hive/warehouse/tbl_2", 
     "--username",            "yoonhok", 
     "--password",            "1234"
};

// runTool returns the tool's exit status; any non-zero value means failure
if (Sqoop.runTool(str) != 0) {
     System.out.println("Error: sqoop-export failed.");
     return false;
}

But it does not run. I get this error:

13/02/14 16:17:09 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead. 
13/02/14 17:43:12 WARN sqoop.ConnFactory: $SQOOP_CONF_DIR has not been set in the environment. Cannot check for additional configuration.
13/02/14 16:17:09 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset. 
13/02/14 16:17:09 INFO tool.CodeGenTool: Beginning code generation 
13/02/14 16:17:09 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tbl_2` AS t LIMIT 1 
13/02/14 16:17:09 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tbl_2` AS t LIMIT 1 
13/02/14 16:17:09 INFO orm.CompilationManager: HADOOP_HOME is /home/yoonhok/development/hadoop-1.1.1 
Note: /tmp/sqoop-yoonhok/compile/45dd1a113123726796a4ed4ce10c9110/tbl_2.java uses or overrides a deprecated API. 
Note: Recompile with -Xlint:deprecation for details. 
13/02/14 16:17:10 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-yoonhok/compile/45dd1a113123726796a4ed4ce10c9110/tbl_2.jar 
13/02/14 16:17:10 INFO mapreduce.ExportJobBase: Beginning export of tbl_2 
13/02/14 16:17:10 WARN mapreduce.ExportJobBase: Input path file:/user/hive/warehouse/tbl_2 does not exist 
13/02/14 16:17:11 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
13/02/14 16:17:11 INFO mapred.JobClient: Cleaning up the staging area file:/tmp/hadoop-yoonhok/mapred/staging/yoonhok314601126/.staging/job_local_0001 
13/02/14 16:17:11 ERROR security.UserGroupInformation: PriviledgedActionException as:yoonhok cause:org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/user/hive/warehouse/tbl_2 
13/02/14 16:17:11 ERROR tool.ExportTool: Encountered IOException running export job: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/user/hive/warehouse/tbl_2

I noticed the warning that $SQOOP_CONF_DIR has not been set in the environment, so I added

SQOOP_CONF_DIR=/home/yoonhok/development/sqoop-1.4.2.bin__hadoop-1.0.0/conf

to /etc/environment and tried again. The warning is gone, but the export still fails:

13/02/14 16:17:09 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead. 
13/02/14 16:17:09 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset. 
13/02/14 16:17:09 INFO tool.CodeGenTool: Beginning code generation 
13/02/14 16:17:09 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tbl_2` AS t LIMIT 1 
13/02/14 16:17:09 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tbl_2` AS t LIMIT 1 
13/02/14 16:17:09 INFO orm.CompilationManager: HADOOP_HOME is /home/yoonhok/development/hadoop-1.1.1 
Note: /tmp/sqoop-yoonhok/compile/45dd1a113123726796a4ed4ce10c9110/tbl_2.java uses or overrides a deprecated API. 
Note: Recompile with -Xlint:deprecation for details. 
13/02/14 16:17:10 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-yoonhok/compile/45dd1a113123726796a4ed4ce10c9110/tbl_2.jar 
13/02/14 16:17:10 INFO mapreduce.ExportJobBase: Beginning export of tbl_2 
13/02/14 16:17:10 WARN mapreduce.ExportJobBase: Input path file:/user/hive/warehouse/tbl_2 does not exist 
13/02/14 16:17:11 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 
13/02/14 16:17:11 INFO mapred.JobClient: Cleaning up the staging area file:/tmp/hadoop-yoonhok/mapred/staging/yoonhok314601126/.staging/job_local_0001 
13/02/14 16:17:11 ERROR security.UserGroupInformation: PriviledgedActionException as:yoonhok cause:org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/user/hive/warehouse/tbl_2 
13/02/14 16:17:11 ERROR tool.ExportTool: Encountered IOException running export job: org.apache.hadoop.mapreduce.lib.input.InvalidInputException: Input path does not exist: file:/user/hive/warehouse/tbl_2

I think the export directory is the problem.

I use "/user/hive/warehouse/tbl_2", and when I run "hadoop fs -ls /user/hive/warehouse/", the table directory "tbl_2" exists.

The error says "Input path does not exist: file:/user/hive/warehouse/tbl_2", so the path is being looked up on the local filesystem (file:). I would expect it to be looked up on HDFS (hdfs:) instead.

But I don't know how to fix it.
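The difference between the two prefixes comes down to which default filesystem a scheme-less path is resolved against. The sketch below only illustrates that idea with java.net.URI; it is not Hadoop's own path handling. The class name DefaultFsDemo and the two base URIs (file:/// for an unconfigured local default, hdfs://localhost:9000/ as used elsewhere in this post) are assumptions made for the illustration.

```java
import java.net.URI;

public class DefaultFsDemo {
    // Resolve a scheme-less absolute path against a default filesystem URI.
    public static URI qualify(String defaultFs, String path) {
        return URI.create(defaultFs).resolve(path);
    }

    public static void main(String[] args) {
        String exportDir = "/user/hive/warehouse/tbl_2";

        // Assumed default when no cluster configuration is loaded.
        URI local = qualify("file:///", exportDir);
        // Assumed default when the cluster configuration is loaded.
        URI hdfs = qualify("hdfs://localhost:9000/", exportDir);

        System.out.println("scheme=" + local.getScheme() + " path=" + local.getPath());
        System.out.println("scheme=" + hdfs.getScheme()
                + " authority=" + hdfs.getAuthority()
                + " path=" + hdfs.getPath());
    }
}
```

The same path string ends up on a different filesystem depending on the base it is resolved against, which would be consistent with the file: prefix in the error message.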


Update: I just got a hint and changed 'export-dir' to a full HDFS URI:

--export-dir   hdfs://localhost:9000/user/hive/warehouse/tbl_2

But it still fails:

13/02/15 15:17:20 WARN tool.BaseSqoopTool: Setting your password on the command-line is insecure. Consider using -P instead.
13/02/15 15:17:20 INFO manager.MySQLManager: Preparing to use a MySQL streaming resultset.
13/02/15 15:17:20 INFO tool.CodeGenTool: Beginning code generation
13/02/15 15:17:20 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tbl_2` AS t LIMIT 1
13/02/15 15:17:20 INFO manager.SqlManager: Executing SQL statement: SELECT t.* FROM `tbl_2` AS t LIMIT 1
13/02/15 15:17:20 INFO orm.CompilationManager: HADOOP_HOME is /home/yoonhok/development/hadoop-1.1.1/libexec/..
Note: /tmp/sqoop-yoonhok/compile/697590ee9b90c022fb8518b8a6f1d86b/tbl_2.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
13/02/15 15:17:22 INFO orm.CompilationManager: Writing jar file: /tmp/sqoop-yoonhok/compile/697590ee9b90c022fb8518b8a6f1d86b/tbl_2.jar
13/02/15 15:17:22 INFO mapreduce.ExportJobBase: Beginning export of tbl_2
13/02/15 15:17:22 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
13/02/15 15:17:23 INFO input.FileInputFormat: Total input paths to process : 1
13/02/15 15:17:23 INFO input.FileInputFormat: Total input paths to process : 1
13/02/15 15:17:23 INFO mapred.JobClient: Cleaning up the staging area file:/tmp/hadoop-yoonhok/mapred/staging/yoonhok922915382/.staging/job_local_0001
13/02/15 15:17:23 ERROR security.UserGroupInformation: PriviledgedActionException as:yoonhok cause:java.io.FileNotFoundException: File /user/hive/warehouse/tbl_2/000000_0 does not exist.
13/02/15 15:17:23 ERROR tool.ExportTool: Encountered IOException running export job: java.io.FileNotFoundException: File /user/hive/warehouse/tbl_2/000000_0 does not exist.

When I check HDFS with

hadoop fs -ls /user/hive/warehouse/tbl_2

or

hadoop fs -ls hdfs://localhost:9000/user/hive/warehouse/tbl_2

the file exists:

-rw-r--r-- 1 yoonhok supergroup 14029022 2013-02-15 12:16 /user/hive/warehouse/tbl_2/000000_0

I also tried the shell command in a terminal:

sqoop-export --connect jdbc:mysql://localhost/detector --table tbl_2 --export-dir hdfs://localhost:9000/user/hive/warehouse/tbl_2 --username yoonhok --password 1234

It works.

What is the problem? Could you help me?

1 answer
Luminary・发光体
#2 · 2019-09-08 18:19

You need to load and provide your Hadoop configuration files. By default they are read from the classpath, but you might be able to override this with Configuration.addDefaultResource (no guarantees).
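To check the "read from the classpath" part, a small diagnostic along these lines may help. It is a sketch that uses only the JDK; the class name HadoopConfCheck is made up for illustration, and the file names assume a Hadoop 1.x layout (core-site.xml, hdfs-site.xml, mapred-site.xml). If the files are reported as missing when the Java program runs, Hadoop would fall back to its built-in defaults, which would be consistent with the file: paths and job_local_0001 in the logs above.

```java
import java.net.URL;
import java.util.LinkedHashMap;
import java.util.Map;

public class HadoopConfCheck {
    // For each resource name, record the classpath URL it resolves to,
    // or null when the resource is not visible on the classpath.
    public static Map<String, URL> locate(String... names) {
        ClassLoader cl = Thread.currentThread().getContextClassLoader();
        if (cl == null) {
            cl = ClassLoader.getSystemClassLoader();
        }
        Map<String, URL> found = new LinkedHashMap<>();
        for (String name : names) {
            found.put(name, cl.getResource(name));
        }
        return found;
    }

    public static void main(String[] args) {
        // Assumed Hadoop 1.x configuration file names.
        Map<String, URL> result =
                locate("core-site.xml", "hdfs-site.xml", "mapred-site.xml");
        for (Map.Entry<String, URL> e : result.entrySet()) {
            String where = (e.getValue() == null)
                    ? "NOT on classpath"
                    : e.getValue().toString();
            System.out.println(e.getKey() + ": " + where);
        }
    }
}
```

Running it with the same classpath as the failing program should show whether the Hadoop configuration directory still needs to be added to that classpath.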
