About Hadoop HDFS filesystem rename

Published 2020-03-30 04:09

Question:

I am storing a lot of data in HDFS, and I need to move files from one folder to another.

In general, how expensive is the FileSystem rename method? Say I have to move terabytes of data.

Thank you very much.

Answer 1:

Moving files in HDFS, as in any properly implemented file system, changes only the namespace; the actual data is not moved. Going through the code, a rename only changes the namespace (in memory and in the edit log) on the NameNode.

From the comments in the NameNode.java class:

  The NameNode controls two critical tables:
  • 1) filename->blocksequence (namespace)
  • 2) block->machinelist ("inodes")

Only the first table needs to be modified; the block-to-machine list does not. I haven't tried it out, but I expect it to work fine.
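To make the two-table argument concrete, here is a minimal toy model in Java. It is not Hadoop code: the class `ToyNameNode`, its fields, and its methods are hypothetical names invented for illustration, and the real NameNode also writes an edit log and handles directories, permissions, and more. The sketch only shows that a rename can be served by re-keying the filename table while the block table stays untouched.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of the two NameNode tables (hypothetical; NOT the Hadoop implementation).
class ToyNameNode {
    // Table 1: filename -> block sequence (the namespace).
    final Map<String, List<Long>> namespace = new HashMap<>();
    // Table 2: block -> machine list (which DataNodes hold each block).
    final Map<Long, List<String>> blockLocations = new HashMap<>();

    // Registers a file with its block ids and the machines that store those blocks.
    void addFile(String path, List<Long> blocks, List<String> machines) {
        namespace.put(path, new ArrayList<>(blocks));
        for (long b : blocks) {
            blockLocations.put(b, new ArrayList<>(machines));
        }
    }

    // Rename re-keys one entry in table 1; table 2 is never read or written.
    boolean rename(String src, String dst) {
        if (!namespace.containsKey(src) || namespace.containsKey(dst)) {
            return false;
        }
        namespace.put(dst, namespace.remove(src));
        return true;
    }

    public static void main(String[] args) {
        ToyNameNode nn = new ToyNameNode();
        nn.addFile("/in/part-0", List.of(1L, 2L), List.of("dn1", "dn2", "dn3"));
        Map<Long, List<String>> before = new HashMap<>(nn.blockLocations);

        boolean ok = nn.rename("/in/part-0", "/out/part-0");
        System.out.println("renamed: " + ok);
        System.out.println("blocks under new name: " + nn.namespace.get("/out/part-0"));
        System.out.println("block table unchanged: " + before.equals(nn.blockLocations));
    }
}
```

In this model the cost of `rename` does not depend on how many bytes the blocks hold, which is the point the answer makes about moving terabytes.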



Answer 2:

Rename is a metadata-only operation in HDFS, so it is very cheap, just as it is in a normal POSIX filesystem. No data is moved, and the only server involved is the NameNode.

The source code for rename can be found here. It is pretty straightforward.
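For the comparison with a normal POSIX filesystem, the following is a small local-filesystem sketch using only the standard `java.nio.file` API. It does not talk to HDFS; the class name `LocalRenameSketch` and the helper `moveWithinSameFilesystem` are hypothetical. It assumes source and destination are on the same filesystem, where `ATOMIC_MOVE` requests a rename and fails rather than falling back to copy-and-delete.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Local-filesystem analogue only (hypothetical helper; this is not an HDFS client).
class LocalRenameSketch {
    // Moves a file into dstDir with an atomic rename, creating dstDir if needed.
    static Path moveWithinSameFilesystem(Path src, Path dstDir) throws IOException {
        Files.createDirectories(dstDir);
        Path dst = dstDir.resolve(src.getFileName());
        // ATOMIC_MOVE throws AtomicMoveNotSupportedException if a plain rename is not possible.
        return Files.move(src, dst, StandardCopyOption.ATOMIC_MOVE);
    }

    public static void main(String[] args) throws IOException {
        Path root = Files.createTempDirectory("rename-sketch");
        Path src = Files.createDirectories(root.resolve("in")).resolve("data.txt");
        Files.write(src, "payload".getBytes(StandardCharsets.UTF_8));

        Path dst = moveWithinSameFilesystem(src, root.resolve("out"));
        System.out.println("moved to: " + dst);
        System.out.println("source still exists: " + Files.exists(src));
    }
}
```

As with the HDFS case described above, the file contents are not rewritten here; only the directory entry changes.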