I want to change an existing table, that contains text format, into orc format. I was able to do it by: (1) creating a table in orc format manually having the partitions and then, (2) using the INSERT OVERWRITE statement to populate the table.
I am trying to use CTAS (Create Table... AS Select...) statement for this. Is there any way I can include the dynamic partitioning with CTAS statement? So, if my text data set has multiple partitions (for example: year and month), can I point this in CTAS statement directly?
The format might be something like this:
CREATE TABLE TEST_TABLE
STORED AS ORC
WITH PARTITION(year, month)
LOCATION '/<my_location>'
tblproperties ("orc.compress"="SNAPPY")
AS SELECT * FROM <existing_table>;
Any idea please?
Note: The reason I'm interested for CTAS statement is as: using this statement, I don't really need to point out each and every columns name. But if I create an orc based table and then populate it manually, I had to indicate all the columns while creating the table. This is okay; but, not a good idea if my existing table contains a lot of columns.
Not supported
Two Steps:
Rest stored as ORC, location you can add as per convenience and support by hive query.
For external tables, the steps do not work. The issue comes is, data is updated in pointed directory on HDFS but we cant see changes reflected in table when we do select table. For this you may follow following steps:
Solution 1: Check tables metadata if its updated or not.
Solution 2: If solution 1 does not work, then:
Fix the table
Then you can drop source_table and rename
another_table
tosource_table_name
.