Is there a way to add nodes to a running Hadoop cluster?

I have been playing with Cloudera. I define the number of clusters before I start my job, then use Cloudera Manager to make sure everything is running.

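I'm working on a new project that, instead of using Hadoop, uses message queues to distribute the work, but the results of the work are stored in HBase. I might launch 10 servers to process the job and store to HBase. If I later decide to add a few more worker nodes, can I easily (read: programmatically) make them connect automatically to the running cluster, so that they can add locally to the cluster's HBase/HDFS?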

Is this possible, and what would I need to learn in order to do it?

Answer 1:

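Here is the documentation for adding a node to Hadoop, and for HBase. Looking at the documentation, there is no need to restart the cluster. A node can be added dynamically.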

Answer 2:

The following steps should help you bring a new node into the running cluster.

1> Add the new node's hostname to the slaves list in /etc/hadoop/conf/slaves.
2> If the file system isn't shared, sync the full configuration in /etc/hadoop/conf from the NameNode to the new DataNode.
3> Restart all the Hadoop services on the NameNode/TaskTracker and all the services on the new DataNode.
4> Verify the new DataNode in the browser at http://namenode:50070
5> Run the balancer script to redistribute the data between the nodes.
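Since the question asks for something programmable, here is a minimal Python sketch of how the steps above might be scripted. It is an illustration under several assumptions, not a tested deployment tool: the helper names (with_new_node, plan_commands, run_plan) are hypothetical, passwordless ssh/rsync from the NameNode to the new host is assumed, and it assumes hadoop-daemon.sh is on the new node's PATH (package-based installs such as CDH typically use service scripts instead). It only starts the DataNode on the new host rather than restarting services on the NameNode, and it prints the commands by default instead of running them.

```python
import subprocess


def with_new_node(slaves_text, hostname):
    """Return the slaves file content with hostname appended if it is missing."""
    hosts = [line.strip() for line in slaves_text.splitlines() if line.strip()]
    if hostname not in hosts:
        hosts.append(hostname)
    return "\n".join(hosts) + "\n"


def plan_commands(hostname, conf_dir="/etc/hadoop/conf"):
    """Build the commands to run from the NameNode, in order (nothing is executed)."""
    return [
        # Copy the configuration to the new node (only needed if it is not shared).
        ["rsync", "-a", f"{conf_dir}/", f"{hostname}:{conf_dir}/"],
        # Assumption: the daemon script is on the remote PATH.
        ["ssh", hostname, "hadoop-daemon.sh", "start", "datanode"],
        # Command-line alternative to checking http://namenode:50070 in a browser.
        ["hadoop", "dfsadmin", "-report"],
        # Redistribute existing blocks across the DataNodes.
        ["hadoop", "balancer"],
    ]


def run_plan(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
```

To use it, read /etc/hadoop/conf/slaves, write back the result of with_new_node(text, "worker11") (the hostname is a placeholder), then call run_plan(plan_commands("worker11")) to review the commands before running them with dry_run=False.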

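If you don't want to restart the services on the NameNode when you add a new node, I would suggest adding the hostnames to the slaves configuration file ahead of time. The nodes will then be reported as decommissioned/dead until they become available. After that, follow only the DataNode steps above. Again, this is not best practice.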