I am trying to extract data from a Hive table and write it to local files, with one output file per distinct value of the "Date" column. The table holds a little over two years of history, so I will need 700+ output files.
My current knowledge only lets me write one file per run. This is the code I run from the Hive command line:
INSERT OVERWRITE LOCAL DIRECTORY '/local/hive/temp'
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
select date, col1, col2, col3, col4, col5
from WH_TEMP_EXTRACT.table_temp
where date='2015-09-17';
I am not a developer, but I am currently researching all the options for performing this task. I would appreciate any help you can provide.
Extract all two years of data into the local directory with a single query. After that, you can use an awk command to split it into individual files, as below.
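A minimal sketch of that awk step, not necessarily the exact command originally posted here. It assumes the extract is comma-delimited with the date as the first field, as in the question's query, and that date values contain no slashes. The function name `split_by_date`, the `<date>.csv` file naming, and the output directory are hypothetical choices made for illustration.

```shell
# Hypothetical helper: split every file in src_dir into one file per
# distinct value of the first comma-separated field (the date).
split_by_date() {
    src_dir=$1
    out_dir=$2
    mkdir -p "$out_dir"
    awk -F',' -v outdir="$out_dir" '
    {
        out = outdir "/" $1 ".csv"
        # Close the previous file when the date changes, so that 700+
        # dates do not exhaust the open-file limit of some awk versions.
        if (out != prev) {
            if (prev != "") close(prev)
            prev = out
        }
        # Append, because a date may reappear after its file was closed.
        print >> out
    }' "$src_dir"/*
}
```

To use it, run the question's query without the `where` clause so that all dates land in `/local/hive/temp`, then call `split_by_date /local/hive/temp /local/hive/out`. Because the sketch appends, start with an empty output directory on each run.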
Let me know if this solution works for you.
EDIT 1: use gsub
EDIT 2:
EDIT 3: