Hive files on HDFS not being deleted when a managed table is dropped

Published 2019-02-14 16:19

Question:

When I drop a managed table from the Hive interactive command line, the underlying files that were created on HDFS in /user/hive/warehouse/<databasename>.db still exist. This causes problems when I recreate the table with the same name and run

INSERT INTO TABLE 

because the recreated table still contains the data that I loaded into those partitions (dt and hr partitions in my case) the first time around. Only if I use

INSERT OVERWRITE TABLE

will it then finally load the data properly, but my ETL needs to use INSERT INTO TABLE.

Any ideas? I'm about ready to just create the same table under a different name, or to go in and delete the files on HDFS directly, but I'm worried that might break the metastore. Lastly, I'm positive it is a managed table and not an external one.

Answer 1:

Sometimes Hive deletes the table metadata but silently fails to move the files to the trash. Have you checked the permissions on /user/<user>/.Trash? Make sure the ETL user has the proper permissions on this directory.
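A rough way to check this from the command line is sketched below. It is only an illustration: the ETL_USER variable and its default value "etl" are hypothetical placeholders, the trash_dir_for helper is made up for this example, and the script assumes the hdfs CLI is on the PATH of the machine where you run it.

```shell
#!/usr/bin/env bash
# Sketch: inspect the HDFS trash directory that dropped-table files get moved into.
# ETL_USER is a hypothetical placeholder; set it to the user your ETL runs as.
ETL_USER="${ETL_USER:-etl}"

# Build the per-user trash path, following the /user/<user>/.Trash layout.
trash_dir_for() {
  printf '/user/%s/.Trash\n' "$1"
}

TRASH_DIR="$(trash_dir_for "$ETL_USER")"

if command -v hdfs >/dev/null 2>&1; then
  # -d lists the directory entry itself, so the owner, group, and mode are visible.
  hdfs dfs -ls -d "$TRASH_DIR" || echo "could not list $TRASH_DIR"
else
  # No Hadoop client here; just report what would have been checked.
  echo "hdfs CLI not found; would inspect $TRASH_DIR"
fi
```

If the listing shows that the directory is owned by a different user or is not writable by the ETL user, that would be consistent with the silent failure described above. Recent Hive versions also accept DROP TABLE ... PURGE, which deletes the data without going through the trash at all, though data dropped that way cannot be recovered from the trash.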



Tags: hadoop hive hdfs