Sqoop import newly added column to mysql table to

2019-09-13 03:57发布

I have table test in mysql as below:

id  name  address
1  Km  sky
2  hd  heaven
3  Ab  null
4  en  null

Now I done a sqoop import as below

sqoop import--connect jdbc:mysql://XXXXXX/testing --username XXXX --password XXXX --query "select * from  testing.test where \$CONDITIONS" --null-string '' --null-non-string '' -m 1\ 
--hive-import --hive-database testing --hive-table test --create-hive-table --target-dir  /user/hive/warehouse/testing.db/test

I got the desired result.

Then we added a new colum to the mysql table with extra 2 rows

id  name  address  nation

1  Km  sky  null
2  hd  heaven  null
3  Ab  null  null
4  en  null  null
5  abc efd  USA
6  fge cde  UK

Now I want the existing hive table with the above columns and rows updated. I have done the following sqoop job

Sqoop job:

sqoop job --create sqoop_test -- import --connect jdbc:mysql:xxxxxxx/testing --username XXXXX --password XXXX --query "SELECT * from testing.test WHERE \$CONDITIONS" --incremental append\ 
--check-column id --last-value "3" --split-by 'id' --target-dir  /user/hive/warehouse/testing.db/test 

But when I query the hive table I get the result as null for the new rows and the new columns doesn't show up. Like below

id  name  address

NULL  NULL  NULL
NULL  NULL  NULL
1  Km  sky
2  hd  heaven
3  Ab  
4  en  

How can we have the new columns appended and new rows added to the existing table in hive?

Or Is the method I am using is completely wrong. Please let me know

标签: mysql hive sqoop
1条回答
对你真心纯属浪费
2楼-- · 2019-09-13 04:34

your assumption is wrong, the reason for this is because you are importing data with a different layout. The first table that you created, has 3 columns and in the second import, you are importing 4 columns, for that reason, Hive can't parse those new records and simply print null for all the columns. If you don't have a good reason to import the data in textfile format, I suggest you create the table in avro and user the schema evolution feature to add new columns.

When you import data in avro, Sqoop autogenerate the schems for you. so the only thing that you need is create a table pointing to the imported data and use the generated schema. In the case of future imports with new fields, you will need add those fields with a valid default value or make them nullables with default value as follow (for example for string column)

{ "name": "newcolumnname", "type": [ "null", "string" ], "default": "null" },

or even specify other valid default values

{ "name": "newcolumnname", "type": [ "string" ], "default": "val1" }, //default value 1
{ "name": "newcolumnname", "type": [ "string" ], "default": "" }, //default value empty
查看更多
登录 后发表回答