I am connecting to Hive via an ODBC driver from a .NET application. Is there a query to determine if a table already exists?
For example, in MSSQL you can query the INFORMATION_SCHEMA
table and in Netezza you can query the _v_table
table.
Any assistance would be appreciated.
There are two approaches by which you can check that:
1.) As @dimamah suggested, just to add one point here, for this approach you need to
1.1) start the **hiveserver** before running the query
1.2) you have to run two queries
1.2.1) USE <database_name>
1.2.2) SHOW TABLES LIKE 'table_name'
1.2.3) Then you check your result using Result set.
2.) Second approach is to use HiveMetastoreClient APIs, where you can directly use the APIs to check whether the table_name exist in a particular database or not.
For further help please go through this Hive 11
Execute the following command : show tables in DB like 'TABLENAME'
If the table exists, its name will be returned, otherwise nothing will be returned.
This is done directly from hive. for more options see this.
DB
is the database in which you want to see if the table exists.
TABLENAME
is the table name you seek,
What actually happens is that Hive queries its metastore (depends on your configuration but it can be in a standard RDBMS like MySQL) so you can optionally connect directly to the same metastore and write your own query to see if the table exists.
When programming on Hive by Spark SQL, you can use following method to check whether Hive table exists.
if (hiveContext.hql("SHOW TABLES LIKE '" + tableName + "'").count() == 1) {
println(tableName + " exists")
}
If someone is using shell script like me then my answer could be useful. Assume that your table is in the default namespace.
table=your_hive_table
validateTable=$(hive --database default -e "SHOW TABLES LIKE '$table'")
if [[ -z $validateTable ]]; then
echo "Error:: $table cannot be found"
exit 1
fi
Code similar to below one can find in many of my Spark notebooks:
stg_table_exists = sqlCtx.sql("SHOW TABLES IN "+ stg_db)
.filter("tableName='%s'" % stg_tab_name) .collect()
(made two-liner for readability)
I wish Spark would have an API call to check the same.
If you're using SparkSQL you can do the following.
if "table_name" in sqlContext.tableNames("db_name"):
...do something
http://spark.apache.org/docs/2.1.0/api/python/pyspark.sql.html#pyspark.sql.SQLContext.tableNames
If you're using a scala spark app and SparkSQL you can do the following
if spark.catalog.tableExists("tablename") {do something}