So, I have a txt file that has the below content in it:
CREATE EXTERNAL TABLE `table1`(
`tab_id bigint COMMENT 'The unique identifier of thetable')
ROW FORMAT SERDE
*
STORED AS INPUTFORMAT
*
OUTPUTFORMAT
*
LOCATION
*
TBLPROPERTIES (
'transient_lastDdlTime'='1556u3ehw27')
CREATE TABLE `table2`(
`count` bigint)
ROW FORMAT SERDE
*
STORED AS INPUTFORMAT
*
OUTPUTFORMAT
*
LOCATION
'hdfs://path/'
TBLPROPERTIES (
'transient'='15407')
As you can see, after each table's DDL, there is no ; at the end of it.I am trying to write a program that inserts a ; after each one of the tables DDL's. So the output should be like:
CREATE EXTERNAL TABLE `table1`(
`tab_id bigint COMMENT 'The unique identifier of thetable')
ROW FORMAT SERDE
*
STORED AS INPUTFORMAT
*
OUTPUTFORMAT
*
LOCATION
*
TBLPROPERTIES (
'transient_lastDdlTime'='1556u3ehw27');
CREATE TABLE `table2`(
`count` bigint)
ROW FORMAT SERDE
*
STORED AS INPUTFORMAT
*
OUTPUTFORMAT
*
LOCATION
'hdfs://path/'
TBLPROPERTIES (
'transient'='15407');
There are two ways I have tried this. (1) By adding it the the DDL creation script and a python program.
Below is my DDL creation .sh script which runs through my database's tables and generates a single file for all the tables within a database.I tried to use the cat function shown below on the last line(# cat...) to do that but kept receiving errors.
hiveDBName=my_db;
showcreate="show create table "
showpartitions="show partitions "
terminate=";"
tables=`hive -e "use $hiveDBName;show tables;"`
tab_list=`echo "${tables}"`
rm -f ${hiveDBName}_all_table_partition_DDL.sql
for list in $tab_list
do
echo "Generating table script for " ${hiveDBName}.${list}
showcreatetable=${showcreatetable}${showcreate}${hiveDBName}.${list}${terminate}
done
echo " ====== Create Tables ======= : " $showcreatetable
##Remove the file
rm -f ${hiveDBName}_extract_all_tables.txt
hive -e "use $hiveDBName; ${showcreatetable}" > /home/path/filter_ddls/aa.sql
grep -v "WARN" /home/path/filter_ddls/aa.sql >/home/path/hive_db_ddls/${hiveDBName}_extract_all_tables.sql
# cat a1.sql + ";\n\n" >> ${hiveDBName}_extract_all_tables.sql
Below is my Python program but the output of this method adds ; only after tblproperties which skips certain tables.
import re
f = open("/home/path/ddl.sql", 'rt', encoding='latin-1').read()
with open("/home/path/new_ddl.sql","w") as output:
output.write(re.sub(r'(TBLPROPERTIES \(.*?\))', r'\1;', f, flags=re.DOTALL))
Any ideas or suggestion to make this happen? Preferable the first option (the .sh script).