Sqoop on HDInsight does not close JDBC connection

Posted 2019-08-28 13:18

Question:

When I use Azure SQL or Azure MySQL as the metastore for Sqoop jobs, there seems to be a serious bug in Sqoop on HDInsight: it does not close the metastore connection properly for saved Sqoop jobs.

Here are the repro steps:

  1. Use Azure SQL or Azure MySQL as the Sqoop metastore, create a saved incremental-import Sqoop job, and run it. At the very end of the run you get an exception:

------ON AZURE SQL------

17/08/02 23:15:51 INFO tool.ImportTool: Updated data for job: FactOnlineSalesIncr
17/08/02 23:15:51 WARN tool.JobTool: IOException closing JobStorage: java.io.IOException: Exception committing connection
        at org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.close(HsqldbJobStorage.java:227)
        at org.apache.sqoop.tool.JobTool.run(JobTool.java:314)
        at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:225)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
        at org.apache.sqoop.Sqoop.main(Sqoop.java:243)
Caused by: com.microsoft.sqlserver.jdbc.SQLServerException: SQL Server did not return a response. The connection has been closed.
        at com.microsoft.sqlserver.jdbc.SQLServerConnection.terminate(SQLServerConnection.java:1745)
        at com.microsoft.sqlserver.jdbc.SQLServerConnection.terminate(SQLServerConnection.java:1732)
        at com.microsoft.sqlserver.jdbc.TDSReader.readPacket(IOBuffer.java:5424)
        at com.microsoft.sqlserver.jdbc.TDSCommand.startResponse(IOBuffer.java:6734)
        at com.microsoft.sqlserver.jdbc.TDSCommand.startResponse(IOBuffer.java:6686)
        at com.microsoft.sqlserver.jdbc.SQLServerConnection$1ConnectionCommand.doExecute(SQLServerConnection.java:1834)
        at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:6276)
        at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:1793)
        at com.microsoft.sqlserver.jdbc.SQLServerConnection.connectionCommand(SQLServerConnection.java:1839)
        at com.microsoft.sqlserver.jdbc.SQLServerConnection.commit(SQLServerConnection.java:2016)
        at org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.close(HsqldbJobStorage.java:225)
        ... 7 more

------ON AZURE MYSQL------

17/08/03 21:01:16 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 230.3543 seconds (0 bytes/sec)
17/08/03 21:01:16 INFO mapreduce.ImportJobBase: Retrieved 1947500 records.
17/08/03 21:01:16 INFO tool.ImportTool: Final destination exists, will run merge job.
17/08/03 21:01:16 INFO tool.ImportTool: Moving data from temporary directory _sqoop/4aa19d895a3c4bb08f22934dc4a69eb5_FactOnlineSales to final destination /ContosoRetailDW00/Incremental/Bzip2/FactOnlineSales
17/08/03 21:01:17 WARN azure.AzureFileSystemThreadPoolExecutor: Disabling threads for Rename operation as thread count 0 is <= 1
17/08/03 21:01:17 INFO azure.AzureFileSystemThreadPoolExecutor: Time taken for Rename operation is: 287 ms with threads: 0
17/08/03 21:01:17 INFO tool.ImportTool: Saving incremental import state to the metastore
17/08/03 21:01:21 INFO tool.ImportTool: Updated data for job: FactOnlineSalesIncr
17/08/03 21:01:21 WARN tool.JobTool: IOException closing JobStorage: java.io.IOException: Exception committing connection
        at org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.close(HsqldbJobStorage.java:227)
        at org.apache.sqoop.tool.JobTool.run(JobTool.java:314)
        at org.apache.sqoop.Sqoop.run(Sqoop.java:147)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
        at org.apache.sqoop.Sqoop.runSqoop(Sqoop.java:183)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:225)
        at org.apache.sqoop.Sqoop.runTool(Sqoop.java:234)
        at org.apache.sqoop.Sqoop.main(Sqoop.java:243)
Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLNonTransientConnectionException: Communications link failure during commit(). Transaction resolution unknown.
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at com.mysql.jdbc.Util.handleNewInstance(Util.java:425)
        at com.mysql.jdbc.Util.getInstance(Util.java:408)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:918)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:897)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:886)
        at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:860)
        at com.mysql.jdbc.ConnectionImpl.commit(ConnectionImpl.java:1559)
        at org.apache.sqoop.metastore.hsqldb.HsqldbJobStorage.close(HsqldbJobStorage.java:225)
        ... 7 more
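
For reference, here is a minimal sketch of the kind of commands step 1 refers to. It is not the exact job definition: the connection strings are placeholders, the column names (`UpdateDate`, `OnlineSalesKey`) and the source database are assumptions, and `lastmodified` mode is only inferred from the "will run merge job" line in the log. The `run` helper is a hypothetical dry-run wrapper that prints each command unless `RUN=1` is set.

```shell
#!/usr/bin/env bash
# Placeholder connection strings (assumptions); substitute real values.
METASTORE_URL='jdbc:sqlserver://<server>.database.windows.net:1433;database=<metastore-db>;user=<user>;password=<password>'
SOURCE_URL='jdbc:sqlserver://<server>.database.windows.net:1433;database=<source-db>;user=<user>;password=<password>'
JOB_NAME='FactOnlineSalesIncr'   # job name as it appears in the logs

# Hypothetical helper: print the command by default, execute it only when RUN=1.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Create the saved incremental import job in the external metastore.
# Column names below are assumed, not taken from the original post.
run sqoop job --meta-connect "$METASTORE_URL" --create "$JOB_NAME" \
  -- import \
  --connect "$SOURCE_URL" \
  --table FactOnlineSales \
  --incremental lastmodified \
  --check-column UpdateDate \
  --merge-key OnlineSalesKey \
  --target-dir /ContosoRetailDW00/Incremental/Bzip2/FactOnlineSales

# Execute the saved job; the warning shows up at the end of this run,
# when Sqoop commits the updated job state and closes the metastore connection.
run sqoop job --meta-connect "$METASTORE_URL" --exec "$JOB_NAME"
```

Run as-is, the script only prints the two `sqoop job` commands; set `RUN=1` on a cluster node where `sqoop` is on the PATH to actually execute them.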

Does anyone know how to work around this?