sqoop fails to store incremental state to the meta

2020-02-15 03:37发布

问题:

I get this on saving incremental import state

16/05/15 21:43:05 INFO tool.ImportTool: Saving incremental import state to the metastore
16/05/15 21:43:56 ERROR tool.ImportTool: Encountered IOException running import job: java.io.IOException: Error communicating with database
at org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.createInternal(HsqldbJobStorage.java:426)
at org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.update(HsqldbJobStorage.java:445)
at org.apache.sqoop.tool.ImportTool.saveIncrementalState(ImportTool.java:164)
at org.apache.sqoop.tool.ImportTool.importTable(ImportTool.java:518)
at org.apache.sqoop.tool.ImportTool.run(ImportTool.java:605)
at org.apache.sqoop.tool.JobTool.execJob(JobTool.java:228)
at org.apache.sqoop.tool.JobTool.run(JobTool.java:283)
at org.apache.sqoop.Sqoop.run(Sqoop.java:148)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:184)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:226)
at org.apache.sqoop.Sqoop.runTool(Sqoop.java:235)
at org.apache.sqoop.Sqoop.main(Sqoop.java:244)
Caused by: java.sql.SQLException: Lock wait timeout exceeded; try restarting transaction
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1073)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3597)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3529)
at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1990)
at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:2151)
at com.mysql.jdbc.ConnectionImpl.execSQL(ConnectionImpl.java:2625)
at com.mysql.jdbc.PreparedStatement.executeInternal(PreparedStatement.java:2119)
at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2415)
at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2333)
at com.mysql.jdbc.PreparedStatement.executeUpdate(PreparedStatement.java:2318)
at org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.setV0Property(HsqldbJobStorage.java:707)
at org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.createInternal(HsqldbJobStorage.java:391)
... 12 more 

I have changed default metastore to MySQL. Everything else has been working just fine . I understand that some of the sources say that sqoop does not support it. In that case, I also need to know what is the default HSQL db url that I need to specify in --metaconnect argument?

In sqoop site.xml, I see that it is :

jdbc:hsqldb:file:/tmp/sqoop-meta/meta.db;shutdown=true

But what do I specify in --meta-connect? I am sure this is not to be specified in there.

I need to specify meta-connect explicitly because I am running jobs through oozie and so will need to specify meta-connect.

I have checked and rechecked but nothing is running on port 16000. sqoop-metastore command can ofcourse start it but then it's not running as a service in that case.

so what I am asking is either a way to run sqoop metastore as a service or need to know why is this failing to store state in metastore when it is configured for mysql.

回答1:

It is not very difficult to start a HSQLDB instance as a Linux service somewhere. Even if you stick to the obsolete V1.8 that is packaged with Sqoop1.

=> Running Hsqldb (1.8) as a System Daemon

Then you have to backup the data periodically => connect with JDBC client, run a CHECKPOINT, backup the "script" file that contains all SQL necessary to rebuild the DB snapshot.

Might be much easier than trying to hack a MySQL JDBC connection into that source code that expects HSQLDB.

Ah, and for no additional cost, you will find here some explanations about how to use Java properties in the default conf files instead of command-line arguments.



回答2:

Ran into the same problem today with MySQL and found out why.

Sqoop locks himself out i guess by using different jdbc connections within the same process. By default, MariaDB (MySQL) creates table using the INNODB engine witch introduces transactions... I guess nobody tested Sqoop with INNODB.

All I did was to recreate the SQOOP_SESSIONS table in the metastore and used MyISAM engine.

CREATE TABLE `SQOOP_SESSIONS_n` (
  `job_name` varchar(64) NOT NULL,
  `propname` varchar(128) NOT NULL,
  `propval` varchar(1024) DEFAULT NULL,
  `propclass` varchar(32) NOT NULL,
  UNIQUE KEY `SQOOP_SESSIONS_n_unq` (`job_name`,`propname`,`propclass`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1;

drop table SQOOP_SESSIONS;

rename table SQOOP_SESSIONS_n to SQOOP_SESSIONS;

Of course you wouldn't want to loose your created jobs if you had any. Just copy them BEFORE dropping the table:

insert into SQOOP_SESSIONS_n select * from SQOOP_SESSIONS;


回答3:

It looks like you error is on the mySQL side. Caused by: java.sql.SQLException: Lock wait timeout exceeded; try restarting transaction

Can you send the complete command you used to run this.

Check the following in MySQL:- show engine innodb status\G

You should consider increasing the lock wait timeout value for InnoDB by setting the innodb_lock_wait_timeout, default is 50 sec

show variables like 'innodb_lock_wait_timeout';

you can set it to higher value in /etc/my.cnf permanently with this line

innodb_lock_wait_timeout=120 and restart mysql.

If you cannot restart mysql at this time, run this:

SET GLOBAL innodb_lock_wait_timeout = 120; You could also just set it for the duration of your session

SET innodb_lock_wait_timeout = 120;



标签: sqoop oozie