I have a silly doubt about Hadoop NameNode memory calculation. It is mentioned in the Hadoop book (The Definitive Guide) as:
"Since the namenode holds filesystem metadata in memory, the limit to the number of files in a filesystem is governed by the amount of memory on the namenode. As a rule of thumb, each file, directory, and block takes about 150 bytes. So, for example, if you had one million files, each taking one block, you would need at least 300 MB of memory. While storing millions of files is feasible, billions is beyond the capability of current hardware."
Since each file takes one block, shouldn't the NameNode's minimum memory be 150 MB rather than 300 MB? Please help me understand why it is 300 MB.
I guess you read the second edition of Tom White's book. I have the third edition, and it references a post, Scalability of the Hadoop Distributed File System. In that post, I read the following:
A file in the HDFS NameNode consists of a file inode plus a block reference, and each of these takes about 150 bytes. So 1,000,000 files = 1,000,000 inodes + 1,000,000 block references (in the example, each file occupies one block).
2,000,000 * 150 bytes ≈ 300 MB
I've included the link so you can verify whether I've made a mistake in my reasoning.
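If it helps, here is a minimal back-of-the-envelope sketch of that estimate in Python. The function name, parameters, and the 150-byte constant as written here are my own illustration of the rule of thumb, not anything from Hadoop's code:

```python
# Rough NameNode memory estimate from the "~150 bytes per object" rule of thumb.
BYTES_PER_METADATA_OBJECT = 150  # approximate cost of a file inode, directory, or block

def estimate_namenode_memory(num_files, avg_blocks_per_file=1, num_dirs=0):
    """Return an approximate NameNode metadata memory requirement in bytes."""
    objects = num_files + num_files * avg_blocks_per_file + num_dirs
    return objects * BYTES_PER_METADATA_OBJECT

# 1,000,000 files, each occupying a single block:
mem_bytes = estimate_namenode_memory(1_000_000)
print(mem_bytes / 10**6, "MB")  # ~300 MB (2,000,000 objects * 150 bytes)
```

The key point is the `num_files + num_files * avg_blocks_per_file` term: every file is counted once for its inode and again for each block it occupies, which is why one million single-block files come to two million objects, not one million.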