I'd like to know how to find the mapping between Hive tables and the actual HDFS files (or rather, directories) that they represent. I need to access the table files directly.
Where does Hive store its files in HDFS?
In Hive, tables are actually stored in a few places. Specifically, if you use partitions (which you should, if your tables are very large or growing) then each partition can have its own storage.
To show the default location where table data or partitions will be created if you create them through default Hive commands (insert overwrite ... partition ... and such):
describe formatted <table_name>
To show the actual location of a particular partition within a Hive table, instead do this:
describe formatted <table_name> partition (<partition_column>=<value>)
If you look in your filesystem where a table "should" live, and you find no files there, it's very likely that the table is created (usually incrementally) by creating a new partition and pointing that partition at some other location. This is a great way of building tables from things like daily imports from third parties and such, which avoids having to copy the files around or storing them more than once in different places.
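If you need these locations in a script rather than by eye, Hive's `describe formatted` output (for a table or for a single partition) contains a `Location:` row that can be picked out. The following is only a minimal sketch: the function name `extract_location` and the sample output text are made up for illustration, and real output has many more rows.

```python
from typing import Optional


def extract_location(describe_output: str) -> Optional[str]:
    """Return the value of the 'Location:' row from `describe formatted` output.

    Returns None if no such row is present.
    """
    for raw_line in describe_output.splitlines():
        line = raw_line.strip()
        if line.startswith("Location:"):
            # Everything after the label is padding plus the path itself.
            return line[len("Location:"):].strip()
    return None


# Hypothetical, heavily shortened sample of `describe formatted` output.
SAMPLE_OUTPUT = (
    "# Detailed Table Information\n"
    "Database:           \tdefault\n"
    "Location:           \thdfs://namenode/user/hive/warehouse/my_table\t \n"
    "Table Type:         \tMANAGED_TABLE\n"
)

if __name__ == "__main__":
    print(extract_location(SAMPLE_OUTPUT))
```

The text to parse could be captured with something like `hive -e "describe formatted <table_name>"`; running it once per partition gives you the partition-to-directory mapping, including partitions that point outside the table's own directory.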
It's also very possible that typing
show create table <table_name>
in the Hive CLI will give you the exact location of your Hive table.
To summarize a few points posted earlier: in hive-site.xml, the property hive.metastore.warehouse.dir specifies where the files are located in HDFS.
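To check that property without opening the file by hand, something like the sketch below may help. It assumes the usual Hadoop-style layout of hive-site.xml (a `<configuration>` element holding `<property>` entries with `<name>` and `<value>` children); the helper name `read_property` and the sample XML are illustrative only, not taken from a real cluster.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def read_property(xml_text: str, name: str) -> Optional[str]:
    """Look up one property value in Hadoop-style configuration XML.

    Returns None if the property is not set in this file.
    """
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


# Hypothetical, minimal hive-site.xml content for illustration.
SAMPLE_HIVE_SITE = """<configuration>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    print(read_property(SAMPLE_HIVE_SITE, "hive.metastore.warehouse.dir"))
```

If the property is missing from the file (the function returns None), Hive falls back to its built-in default, which is /user/hive/warehouse. Where hive-site.xml itself lives depends on the installation.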
To view the files, use this command:
hadoop fs -ls <warehouse_dir>
or
hdfs dfs -ls <warehouse_dir>
Tested under hadoop-2.7.3, hive-2.1.1.
In the sandbox, look under /apps/hive/warehouse/; on a normal cluster, look under /user/hive/warehouse.
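Given one of those warehouse roots, the directory where a table would land by default can be guessed as sketched below. This assumes Hive's usual default layout for managed tables created without an explicit location: tables in the `default` database sit directly under the warehouse directory, tables in other databases sit under `<database>.db`, and each partition adds a `<column>=<value>` subdirectory. The function name `default_table_path` is hypothetical, and the sketch ignores the escaping Hive applies to special characters in partition values, so treat the result as a guess to confirm with `describe formatted` or `show create table`.

```python
import posixpath
from typing import Mapping, Optional


def default_table_path(
    warehouse_dir: str,
    table: str,
    database: str = "default",
    partition: Optional[Mapping[str, str]] = None,
) -> str:
    """Build the directory a managed table (or partition) gets by default."""
    parts = [warehouse_dir]
    if database != "default":
        # Non-default databases get their own "<name>.db" directory.
        parts.append(f"{database}.db")
    parts.append(table)
    if partition:
        # One "column=value" level per partition column, in the given order.
        parts.extend(f"{col}={val}" for col, val in partition.items())
    return posixpath.join(*parts)


if __name__ == "__main__":
    print(default_table_path("/user/hive/warehouse", "sales"))
    print(default_table_path("/apps/hive/warehouse/", "sales", database="mart"))
    print(default_table_path("/user/hive/warehouse", "sales",
                             partition={"dt": "2017-01-01"}))
```

External tables and partitions added with their own location will not follow this pattern.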