Listing MS SQL Server table in OOZIE via SQOOP Act

2019-09-17 10:12发布

I am able to execute the following SQOOP command in CLI perfectly.

sqoop list-tables
--connect 'jdbc:sqlserver://xx.xx.xx.xx\MSSQLSERVER2012:1433;username=usr;password=xxx;database=db'
--connection-manager org.apache.sqoop.manager.SQLServerManager
--driver com.microsoft.sqlserver.jdbc.SQLServerDriver 
-- --schema schma

But getting errors while trying out the same in OOZIE (HUE)

2055 [main] ERROR org.apache.sqoop.manager.CatalogQueryManager - Failed to list tables java.sql.SQLException: No suitable driver found for 'jdbc:sqlserver://xx.xx.xx.xx\MSSQLSERVER2012:1433;username=usr;password=xxx;database=db'

-

2057 [main] ERROR org.apache.sqoop.Sqoop - Got exception running Sqoop: java.lang.RuntimeException: java.sql.SQLException: No suitable driver found for 'jdbc:sqlserver://xx.xx.xx.xx\MSSQLSERVER2012:1433;username=usr;password=xxx;database=db'

How can we get it to work in oozie? (Working on Cloudera Hadoop Distribution)

1条回答
太酷不给撩
2楼-- · 2019-09-17 10:51

This worked for me using CDH 5.11 and the Hue Workflow Editor to create an Oozie>Sqoop1 workflow...but it REQUIRES you to hard code the UserName and Password arguments... Screenshots are included below.

Here is the Step-by-Step:

  1. Open the Hue > Workflow Editor
  2. Create a new workflow
  3. Drag the Sqoop 1 action into the the "drop your action here" grey box.
  4. Ignore the default Sqoop command box and instead hit the + to the right of the ARGUMENTS below the Sqoop command box to add a new argument.
  5. Add "import" without the double quote marks as the very first argument.
  6. Delete the entire content of the Sqoop command box, it needs to be empty.
  7. Add a new argument with the value of "--connect" without the double quotes.
  8. Add a new argument with the value of "jdbc:sqlserver://YourServerNameHere;database=YourDatabaseNameHere"
  9. Add a new argument with the value of "--username"
  10. Add a new argument with the value of "YourSQLServerNamedUserNameHere"
  11. Add a new argument with the value of "--password"
  12. Add a new argument with the value of "--query"
  13. Add a new argument with the value of "Select * from OptionalDBNameHere.SchemaNameHere.TableNameHere Where $CONDITIONS"
  14. Add a new argument with the value of "--delete-target-dir"
  15. Add a new argument with the value of "--target-dir"
  16. Add a new argument with the value of "hdfs://FDQServerName:PortNumber8020IsDefault/User/full/path/to/where/you/want/the/csv/file/placed/in/hdfs/NewFolderForThisTableHere" -- The last folder will be deleted and re-created each time you run the sqoop job.
  17. Add a new argument with the value of "num-mappers"
  18. Add a new argument with the value of "1"

Important:

A. The "Where $CONDITIONS" is critical to have at the end of the SQL Select statement in item 13. It will not run without it.

B. This uses a SQL Server Named User account with access to the DBServer Database and Table you want to Sqoop.

B. Entering arguments like this is required if your Named User does not have the default schema set to "dbo" or if the schema of your table is not the default schema for the database and user.

C. The SQL Server JDBC driver is placed correctly in your installation. For my particular version of Cloudera the location is: "/opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/sqoop/lib/sqljdbc41.jar" but you may also try putting it in either "/var/lib/oozie" or "/var/lib/sqoop"...not sure either of those work on their own.

D. I have not been successful at replacing the UserName and Password I hardcoded in as Arguments with values from a job.properties file. I believe it is possible but I have been unable to find anyone who can clearly show how to do it and days of brute force trial and error have been unsuccessful.

Here are screenshots showing what this looks like when done. SqoopCommandAsArguments SqoopCommandAsArgumentsSuccess

查看更多
登录 后发表回答