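Several places say the default number of reducers in a Hadoop job is 1. You can use the mapred.reduce.tasks property to set the number of reducers manually.
When I run a Hive job (on Amazon EMR, AMI 2.3.3), it uses some number of reducers greater than one. Looking at the job settings, something has set mapred.reduce.tasks, and I presume it is Hive. How does it choose that number?
Note: here are some messages printed while running a Hive job, which should be a clue: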
...
Number of reduce tasks not specified. Estimated from input data size: 500
In order to change the average load for a reducer (in bytes):
set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
set mapred.reduce.tasks=<number>
...
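Reading those messages, my guess is that Hive divides the total input size by hive.exec.reducers.bytes.per.reducer, rounds up, and caps the result at hive.exec.reducers.max, unless mapred.reduce.tasks is set explicitly. Here is a minimal sketch of that guess in Python. The function name and the numbers in the usage lines are hypothetical, chosen only for illustration; this is not Hive's actual code, and the real defaults depend on the Hive version.

```python
def estimate_reducers(input_bytes, bytes_per_reducer, max_reducers):
    """Guess at the reducer estimate hinted at by the Hive log messages.

    Assumption (not confirmed by the log): the count is the input size
    divided by the per-reducer load, rounded up, clamped to [1, max_reducers].
    """
    if bytes_per_reducer <= 0 or max_reducers <= 0:
        raise ValueError("bytes_per_reducer and max_reducers must be positive")
    # Integer ceiling division avoids floating-point rounding on large sizes.
    wanted = (input_bytes + bytes_per_reducer - 1) // bytes_per_reducer
    # Always at least one reducer, never more than the configured cap.
    return max(1, min(max_reducers, wanted))


# Hypothetical values: 500 GB of input, 1 GB per reducer, cap of 999.
print(estimate_reducers(500 * 10**9, 10**9, 999))
# Input far larger than the cap allows: the result is clamped to the cap.
print(estimate_reducers(5 * 10**12, 10**9, 999))
```

If this guess is right, lowering hive.exec.reducers.bytes.per.reducer would raise the estimate, and setting mapred.reduce.tasks to a fixed value would bypass the estimate entirely, as the last log message suggests.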