Hive External Table Skip First Row

Posted 2019-01-10 09:52

I am using Cloudera's version of Hive and trying to create an external table over a CSV file that contains the column names in the first row. Here is the code that I am using to do that.

CREATE EXTERNAL TABLE Test ( 
  RecordId int, 
  FirstName string, 
  LastName string 
) 
ROW FORMAT serde 'com.bizo.hive.serde.csv.CSVSerde' 
WITH SerDeProperties (  
  "separatorChar" = ","
) 
STORED AS TEXTFILE 
LOCATION '/user/File.csv'

Sample Data

RecordId,FirstName,LastName
1,"John","Doe"
2,"Jane","Doe"

Can anyone help me skip the first row, or do I need to add an intermediate step?

Tags: hive Cloudera
7 answers
孤傲高冷的网名
#2 · 2019-01-10 10:26

I also struggled with this and found no way to tell Hive to skip the first row, as is possible in, e.g., Greenplum. In the end I had to remove the header from the files, e.g. "grep -v RecordId File.csv > File_no_header.csv".
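
A minimal sketch of that header-stripping step, assuming the file sits on a local filesystem before it is copied into HDFS. It uses tail rather than grep so that only line 1 is dropped; grep -v RecordId would also drop any data row that happens to contain the text RecordId. The file names are the ones used above, and the printf line only recreates the question's sample data for illustration.

```shell
# Recreate the sample data from the question (illustration only).
printf '%s\n' \
  'RecordId,FirstName,LastName' \
  '1,"John","Doe"' \
  '2,"Jane","Doe"' > File.csv

# Copy everything from line 2 onward, i.e. drop the header row only.
tail -n +2 File.csv > File_no_header.csv

# Show the result: the two data rows, without the header.
cat File_no_header.csv
```

The header-free file can then be placed in the table's location in place of the original.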
