Hadoop: Datanodes available: 0 (total 0, 0 dead)

Every time I run:

hadoop dfsadmin -report

      

I am getting the following output:

Configured Capacity: 0 (0 KB)
Present Capacity: 0 (0 KB)
DFS Remaining: 0 (0 KB)
DFS Used: 0 (0 KB)
DFS Used%: �%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 0 (0 total, 0 dead)

      

  • There is no data directory inside my dfs/ folder.
  • A lock file, in_use.lock, exists in that folder.
  • The name node, job tracker, and data nodes are running fine.
+3




9 replies


I had exactly the same problem, and when I checked the datanode logs there were a lot of errors like could not connect to master:9000.

When I checked the listening ports on the master with netstat -ntlp, I got this in the output:

tcp 0 0 127.0.1.1:9000 ...

I realized that I had to either change the hostname or change master in all of the configuration files. The first option seemed much easier, so I edited /etc/hosts and changed

127.0.1.1 master

to

127.0.1.1 master-machine

and added an entry at the end of the file like this:

192.168.1.1 master

Then I changed master to master-machine in /etc/hostname and restarted the machine. The problem disappeared.
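
For reference, a minimal /etc/hosts on the master after these changes might look like the following; 192.168.1.1 and master-machine are just the example values from above, so substitute your real LAN address and hostname:

127.0.0.1    localhost
127.0.1.1    master-machine
192.168.1.1  master

After restarting, netstat -ntlp should show port 9000 bound to the LAN address instead of 127.0.1.1.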

+1




Um ...

Have you checked your firewall?

When I use Hadoop, I disable the firewall (iptables -F, on all nodes) and then try again.
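
A quick sketch of that check on a node that uses plain iptables; on distributions running firewalld or ufw, systemctl stop firewalld or ufw disable would be the rough equivalent:

# list the current rules and packet counters
sudo iptables -L -n -v

# temporarily flush all rules on this node
sudo iptables -F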

0




Please check the datanode logs. The datanode logs errors when it is unable to report to the namenode. If you post those errors, people will be able to help.
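
The datanode writes its log under $HADOOP_HOME/logs by default; the exact file name includes the user and hostname, so the glob below is only an illustration:

tail -n 100 $HADOOP_HOME/logs/hadoop-*-datanode-*.log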

0




This happened to us when we restarted the cluster, but after a while the datanodes were detected automatically. It is possibly due to the block report interval property.
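
If you want to inspect that setting, the relevant property is dfs.blockreport.intervalMsec in hdfs-site.xml; the value below is the stock default of six hours, shown only as an illustration:

<property>
  <name>dfs.blockreport.intervalMsec</name>
  <value>21600000</value>  <!-- default: full block reports every 6 hours -->
</property>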

0




Usually the datanode gets namespace ID mismatch errors. So remove the name directory on the master and the data directory on the datanodes, then format the namenode and run start-dfs again (see the sketch below). It usually takes some time for the report to reflect all the data. I was also getting 0 datanodes, but after a while the master discovered the slaves.
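
A minimal sketch of those steps; the /path/to/... directories are placeholders for whatever dfs.namenode.name.dir and dfs.datanode.data.dir point at in your hdfs-site.xml, and note that all data in HDFS is lost:

stop-dfs.sh
# on the master: clear the namenode metadata directory
rm -rf /path/to/dfs/name/*
# on every datanode: clear the data directory
rm -rf /path/to/dfs/data/*
# recreate the namespace and start HDFS again
hdfs namenode -format
start-dfs.sh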

0




I had the same problem and just solved it.

The /etc/hosts of all nodes should look like this:

127.0.0.1 localhost
xxx.xxx.xxx.xxx master
xxx.xxx.xxx.xxx slave-1
xxx.xxx.xxx.xxx slave-2

0




Just solved the problem by following these steps:

  1. Make sure the IP addresses of the master and slave nodes are correct in /etc/hosts.

  2. If you really don't need the data: run stop-dfs.sh, delete all data directories on the master/slave nodes, then run hdfs namenode -format and start-dfs.sh. This should recreate HDFS and fix the problem; a quick check is sketched below.
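
A quick way to verify the result after start-dfs.sh, using only standard commands (jps on each node, the report on the master):

jps                      # should list NameNode on the master and DataNode on each slave
hdfs dfsadmin -report    # the datanode count should no longer be 0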
0




Just formatting the namenode didn't work for me, so I checked the logs in $HADOOP_HOME/logs. In the secondarynamenode log I encountered this error:

ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Exception in doCheckpoint
java.io.IOException: Inconsistent checkpoint fields.
LV = -64 namespaceID = 2095041698 cTime = 1552034190786 ; clusterId = CID-db399b3f-0a68-47bf-b798-74ed4f5be097 ; blockpoolId = BP-31586866-127.0.1.1-1552034190786.
Expecting respectively: -64; 711453560; 1550608888831; CID-db399b3f-0a68-47bf-b798-74ed4f5be097; BP-2041548842-127.0.1.1-1550608888831.
    at org.apache.hadoop.hdfs.server.namenode.CheckpointSignature.validateStorageInfo(CheckpointSignature.java:143)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:550)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:360)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$1.run(SecondaryNameNode.java:325)
    at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:482)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:321)
    at java.lang.Thread.run(Thread.java:748)

      

So I stopped Hadoop and then formatted the namenode, explicitly specifying the cluster id from the log:

hdfs namenode -format -clusterId CID-db399b3f-0a68-47bf-b798-74ed4f5be097

This fixed the problem.
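
If the expected cluster id is not obvious from the log, it can also be read from the VERSION file inside the namenode metadata directory; the path below is a placeholder for whatever dfs.namenode.name.dir points at:

grep clusterID /path/to/dfs/name/current/VERSION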

0




There is another, less obvious reason why this can happen: your datanode did not start as expected, but everything else worked.

In my case, when looking at the log, I found that the expected port 50010 was already in use by SideSync (on macOS). I found this via sudo lsof -iTCP -n -P | grep 50010, but you can use similar methods to determine what has already taken your well-known datanode port.

Disconnecting and restarting fixed the issue.
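
If the conflicting application cannot be stopped, the datanode can also be moved to a free port via dfs.datanode.address in hdfs-site.xml; the port below is only an example:

<property>
  <name>dfs.datanode.address</name>
  <value>0.0.0.0:50011</value>  <!-- example: move off the default 50010 -->
</property>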

Also, if you installed Hadoop/YARN as root but keep the data directories under a separate user's home directory and then try to run it as that separate user, you will need to make the datanode directories accessible to that user.
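
A minimal sketch of that permission fix; the directory, user, and group below are placeholders for your own setup:

# give the user that actually runs the datanode ownership of its data directory
sudo chown -R hduser:hadoop /path/to/dfs/data
# HDFS expects the data directories themselves to be mode 700 by default (dfs.datanode.data.dir.perm)
sudo chmod 700 /path/to/dfs/data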

0








