What should hadoop.tmp.dir be?

Posted 2019-01-31 13:01

Hadoop has a configuration parameter hadoop.tmp.dir which, per the documentation, is "A base for other temporary directories." I presume this path refers to the local file system.

I set this value to /mnt/hadoop-tmp/hadoop-${user.name}. After formatting the namenode and starting all services, I see exactly the same path created on HDFS.
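For reference, here is roughly how I set it, in core-site.xml (or hadoop-site.xml on older versions); only the relevant property is shown:

```xml
<!-- core-site.xml: base directory that many other temporary paths default to -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/mnt/hadoop-tmp/hadoop-${user.name}</value>
  <description>A base for other temporary directories.</description>
</property>
```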

Does this mean hadoop.tmp.dir refers to a temporary location on HDFS?

3 Answers
\"骚年 ilove
2楼-- · 2019-01-31 13:04

It's confusing, but hadoop.tmp.dir is used as the base for temporary directories locally, and also in HDFS. The documentation isn't great, but mapred.system.dir is set by default to "${hadoop.tmp.dir}/mapred/system", and this defines the path on HDFS where the Map/Reduce framework stores system files.

If you don't want these to be tied together, you can edit your mapred-site.xml so that the definition of mapred.system.dir is no longer based on ${hadoop.tmp.dir}.
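A minimal sketch of such an override in mapred-site.xml (the path /hadoop/mapred/system is just an illustrative choice, not a required value):

```xml
<!-- mapred-site.xml: pin the Map/Reduce system directory to an explicit HDFS path
     instead of letting it default to ${hadoop.tmp.dir}/mapred/system -->
<property>
  <name>mapred.system.dir</name>
  <value>/hadoop/mapred/system</value>
</property>
```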

乱世女痞
#3 · 2019-01-31 13:23

I had a look around for information on this one. The only thing I could come up with was this passage from the Amazon Elastic MapReduce Developer Guide:

In hadoop-site.xml, we set hadoop.tmp.dir to /mnt/var/lib/hadoop/tmp. /mnt is where we mount the “extra” EC2 volumes, which can contain a lot more data than the default volume. (The exact amount depends on instance type.) Hadoop's RunJar.java (the module that unpacks the input JARs) interprets hadoop.tmp.dir as a Hadoop file system path rather than a local path, so it writes to the path in HDFS instead of a local path. HDFS is mounted under /mnt (specifically /mnt/var/lib/hadoop/dfs/), so you can write lots of data to it.
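In other words, the EMR image ships with something along these lines in its Hadoop configuration (a sketch reconstructed from the quote above, not the actual EMR file):

```xml
<!-- hadoop-site.xml on EMR (sketch): point hadoop.tmp.dir at the large /mnt volume -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/mnt/var/lib/hadoop/tmp</value>
</property>
```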

Anthone
#4 · 2019-01-31 13:28

Let me add a bit more to kkrugler's answer:

There are three HDFS properties whose default values contain hadoop.tmp.dir:

  1. dfs.name.dir: the directory where the namenode stores its metadata, with default value ${hadoop.tmp.dir}/dfs/name.
  2. dfs.data.dir: the directory where HDFS data blocks are stored, with default value ${hadoop.tmp.dir}/dfs/data.
  3. fs.checkpoint.dir: the directory where the secondary namenode stores its checkpoints, with default value ${hadoop.tmp.dir}/dfs/namesecondary.

This is why you saw /mnt/hadoop-tmp/hadoop-${user.name} in your HDFS after formatting the namenode.
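If you'd rather not have these locations follow hadoop.tmp.dir, you can set them explicitly; a rough sketch for hdfs-site.xml is below (the /data/... paths are just placeholders for whatever local directories you pick, and depending on your Hadoop version fs.checkpoint.dir may conventionally live in core-site.xml instead):

```xml
<!-- hdfs-site.xml (sketch): decouple HDFS storage locations from hadoop.tmp.dir -->
<property>
  <name>dfs.name.dir</name>
  <value>/data/hadoop/dfs/name</value>          <!-- local dir for namenode metadata -->
</property>
<property>
  <name>dfs.data.dir</name>
  <value>/data/hadoop/dfs/data</value>          <!-- local dir(s) for HDFS block storage -->
</property>
<property>
  <name>fs.checkpoint.dir</name>
  <value>/data/hadoop/dfs/namesecondary</value> <!-- secondary namenode checkpoints -->
</property>
```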
