Importing the content from MySQL to HDFS as sequence files using below sqoop import command
sqoop import --connect "jdbc:mysql://quickstart.cloudera:3306/retail_db"
--username retail_dba --password cloudera
--table orders
--target-dir /user/cloudera/sqoop_import_seq/orders
--as-sequencefile
--lines-terminated-by '\n' --fields-terminated-by ','
Then i'm creating the hive table using the below command
create table orders_seq(order_id int,order_date string,order_customer_id int,order_status string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '|'
STORED AS SEQUENCEFILE
But when I tried to load sequence data obtained from 1st command into hive table using the below command
LOAD DATA INPATH '/user/cloudera/sqoop_import_seq/orders' INTO TABLE orders_seq;
It is giving the below error.
Loading data to table practice.orders_seq
Failed with exception java.lang.RuntimeException: java.io.IOException: WritableName can't load class: orders
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MoveTask
Where am I going wrong?
First of all, It's necessary to have the data in that format?
Let's suppose you have to have the data in that format. The load data command is not necessary. Once the sqoop finishes importing data, you will just have to create a Hive table pointing the same directory where you sqoop the data.
One side note from your scripts:
Your sqoop command says this:
--fields-terminated-by ','
but when you are creating the table you are using:FIELDS TERMINATED BY '|'
In my experience, the best approach I thing is to sqoop the data as avro, this will create automatically an avro-schema. Then you will just to have to create a Hive table using the schema previously created (AvroSerde) and using the location where you stored the data you got from sqooping process.