I have created a table in hive as
Create table(id int, Description String)
My data looks something as follows :
1|This will return corrupt data since there is a ',' in the first string. some text Change the data 2|There is prob in reading data sometext
After the data is loaded into hive since the default line terminator is \n, the description column cannot be read by hive, Hence it displays a NULL value. Can anyone suggest how to handle newline before loading into hive.
I know this question is old, but you have a couple of options. You can't control this with
FIELDS TERMINATED BY
, because that only controls what terminates the fields, not the records. Records in Hive are hard-coded to be terminated by the newline character (even though there is aLINES TERMINATED BY
clause, it is not implemented).InputFormat
that uses aRecordReader
that understands non-newline delimited records. Look at the code forLineReader
/LineRecordReader
andTextInputFormat
.By default hive takes in NEWLINE ('\N') as delimiter . You can change the delimiter using:
try adding the below property in hive-site.xml or you can just try for temporary hive session level.
hive.query.result.fileformat=SequenceFile