How does HDFS choose a datanode to store a file?

Posted 2019-01-23 07:44

As the title indicates, when a client requests to write a file to HDFS, how does HDFS or the NameNode choose which datanode stores the file? Does HDFS try to store all the blocks of the file on the same node, or, if the file is too big, on nodes in the same rack? Does HDFS provide any API that lets an application store a file on a specific datanode of its choosing?

Tags: hadoop hdfs
5 answers
家丑人穷心不美
#2 · 2019-01-23 08:31

The code for choosing datanodes is in the function ReplicationTargetChooser.chooseTarget().

Its comment says:

The replica placement strategy is that if the writer is on a datanode, the 1st replica is placed on the local machine, otherwise a random datanode. The 2nd replica is placed on a datanode that is on a different rack. The 3rd replica is placed on a datanode which is on the same rack as the first replica.
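
A minimal sketch of the strategy quoted above, in plain Java. It is not Hadoop code: the `Node` class and the `chooseTargets` and `pick` helpers are hypothetical names made up for illustration, and the sketch ignores everything the real chooser has to consider (free space, load, node health). It follows the comment as quoted, so other Hadoop versions may place the third replica differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

public class ReplicaPlacementSketch {

    /** Hypothetical stand-in for a datanode: a name plus the rack it sits on. */
    static final class Node {
        final String name;
        final String rack;

        Node(String name, String rack) {
            this.name = name;
            this.rack = rack;
        }

        @Override
        public String toString() {
            return name + "@" + rack;
        }
    }

    /**
     * Picks up to three targets following the quoted strategy.
     * writer may be null when the client is not running on a datanode.
     * Assumes the cluster list is not empty.
     */
    static List<Node> chooseTargets(Node writer, List<Node> cluster, Random rng) {
        List<Node> chosen = new ArrayList<>();

        // 1st replica: the writer's own node if it is a datanode, otherwise a random node.
        Node first = (writer != null && cluster.contains(writer))
                ? writer
                : cluster.get(rng.nextInt(cluster.size()));
        chosen.add(first);

        // 2nd replica: a random node on a different rack from the first.
        Node second = pick(cluster, chosen, rng, first.rack, false);
        if (second != null) {
            chosen.add(second);
        }

        // 3rd replica: a random unused node on the same rack as the first.
        Node third = pick(cluster, chosen, rng, first.rack, true);
        if (third != null) {
            chosen.add(third);
        }
        return chosen;
    }

    /** Returns a random unused node whose rack equals (sameRack) or differs from the given rack, or null. */
    static Node pick(List<Node> cluster, List<Node> used, Random rng, String rack, boolean sameRack) {
        List<Node> candidates = new ArrayList<>();
        for (Node n : cluster) {
            if (!used.contains(n) && n.rack.equals(rack) == sameRack) {
                candidates.add(n);
            }
        }
        return candidates.isEmpty() ? null : candidates.get(rng.nextInt(candidates.size()));
    }

    public static void main(String[] args) {
        List<Node> cluster = List.of(
                new Node("dn1", "/rack1"), new Node("dn2", "/rack1"),
                new Node("dn3", "/rack2"), new Node("dn4", "/rack2"));
        Node writer = cluster.get(0); // the client runs on dn1
        System.out.println(chooseTargets(writer, cluster, new Random()));
    }
}
```

With the writer on dn1, the first target is always dn1, the second is one of the /rack2 nodes, and the third is dn2, the only other node on /rack1.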

It doesn't provide any API for applications to store a file on the datanode they want.

你好瞎i
#3 · 2019-01-23 08:32

Now, with the HDFS-385 patch, we can choose the block placement policy, for example to place all blocks of a file on the same node (and similarly for the nodes holding the replicas). Read this blog on the topic, and look at its comments section.

我想做一个坏孩纸
#4 · 2019-01-23 08:35

If you prefer charts, here is a picture (source):
[image]

虎瘦雄心在
#5 · 2019-01-23 08:37

how does the HDFS or name node choose which datanode to store the file?

HDFS has a default policy, BlockPlacementPolicyDefault; check the API documentation for more details. It should be possible to extend BlockPlacementPolicy for custom behavior.

Does the hdfs provide any APIs for applications to store the file in a certain datanode as he likes?

The placement behavior should not be specific to a particular datanode. That's what makes HDFS resilient to failure and also scalable.

啃猪蹄的小仙女
#6 · 2019-01-23 08:43

This image shows how the replication process is done: [image]

You can see what happens when the namenode instructs the datanodes to store data: the first replica is stored on the local machine, and the other two replicas are placed on another rack.

If any replica fails, the data is restored from another replica. The chance of every replica failing at once is like the chance of the ceiling fan falling on your head while you sleep :p i.e. it is very small.
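
To put a rough number on that, here is a back-of-the-envelope estimate. It assumes each replica is lost independently with the same probability p, which is a simplification (nodes on one rack can fail together, which is why replicas are spread across racks), and p = 0.01 is a made-up value for illustration only.

```latex
P(\text{all } r \text{ replicas lost}) = p^{r},
\qquad
p = 0.01,\; r = 3 \;\Rightarrow\; P = 0.01^{3} = 10^{-6}
```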
