Configuring Hadoop Cluster (Fully Distributed Mode)
I am installing Hadoop on a multi-node cluster and I have a few questions:
- Is it good to have the NameNode and the ResourceManager on the same machine?
- What is the best role for the host machine: NameNode, ResourceManager, or DataNode/NodeManager?
- I have a master and 3 slaves. The slaves file on the master machine contains the following entries:

  master
  slave1
  slave2
  slave3

  Should I place this same slaves file on all slave machines, or should I delete the first line (master) before putting it on the slave machines?
Regards.
- Yes, at least in small clusters those two should run on the master node.
- See answer 1. The master node can also host, for example, the SecondaryNameNode and the JobHistoryServer.
- No, the slaves file is needed only on the master node. If the master node is listed in the slaves file, it means the master node also acts as a DataNode; especially in small clusters that is totally fine. The slaves file essentially specifies on which nodes the DataNode processes are started.
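For illustration, reusing the hostnames from the question: a slaves file (typically found at etc/hadoop/slaves in a Hadoop 2.x install) that also runs worker daemons on the master would simply list one hostname per line:

```
master
slave1
slave2
slave3
```

Dropping the first line would make the master run only the master daemons (NameNode, ResourceManager), with no DataNode/NodeManager on it.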
Slave nodes should run only the DataNode and NodeManager. All of this is handled by Hadoop if the configuration is correct: you can simply check which processes are running after starting the cluster from the master node. The master node basically takes care of everything, and you should "never" need to connect to a slave manually for configuration.
My answer applies to small clusters; in large "real" clusters the server responsibilities are split up even further.
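As a sketch of that check, assuming a typical Hadoop 2.x install with the start scripts on the PATH and passwordless SSH from master to slaves already set up (these commands need a configured cluster, so treat them as illustrative):

```shell
# On the master: start HDFS and YARN across the whole cluster
start-dfs.sh
start-yarn.sh

# List the Java daemons running locally on the master
# (jps ships with the JDK); expect NameNode and ResourceManager,
# plus SecondaryNameNode/JobHistoryServer if configured there
jps

# Check a slave without logging in interactively;
# expect DataNode and NodeManager
ssh slave1 jps
```

If a slave is missing its DataNode or NodeManager, the usual suspects are the slaves file and SSH/hostname configuration on the master.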
To fully understand the concept of a multi-node cluster, follow this link - http://bradhedlund.com/2011/09/10/understanding-hadoop-clusters-and-the-network/
and to set up a multi-node cluster step by step, follow this link - http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/
Hope these links help you.