This is a fairly well-documented error and the fix is easy, but does anyone know why Hadoop datanode NamespaceIDs can get screwed up so easily or how Hadoop assigns the NamespaceIDs when it starts up the datanodes?
Here's the error:
2010-08-06 12:12:06,900 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Incompatible namespaceIDs in /Users/jchen/Data/Hadoop/dfs/data: namenode namespaceID = 773619367; datanode namespaceID = 2049079249
at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:233)
at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:148)
at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:298)
at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:216)
at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1283)
at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1238)
at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1246)
at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1368)
This seems to happen even for single-node instances.
I was getting this too, and then I tried putting my configuration in hdfs-site.xml instead of core-site.xml. It seems to stop and start without that error now.
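For what it's worth, here is roughly what that move looks like. This is only a sketch: I don't know exactly which properties were moved, so I'm assuming the usual 0.20-era HDFS-side settings dfs.name.dir and dfs.data.dir, with paths borrowed from the error message above.

    # Write the HDFS storage settings into conf/hdfs-site.xml rather than core-site.xml
    cat > conf/hdfs-site.xml <<'EOF'
    <?xml version="1.0"?>
    <configuration>
      <property>
        <name>dfs.name.dir</name>  <!-- namenode metadata, including its VERSION file -->
        <value>/Users/jchen/Data/Hadoop/dfs/name</value>
      </property>
      <property>
        <name>dfs.data.dir</name>  <!-- datanode blocks, including its VERSION file -->
        <value>/Users/jchen/Data/Hadoop/dfs/data</value>
      </property>
    </configuration>
    EOF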
[EDIT, 2010-08-13]
Actually this is still happening, and it is caused by formatting the namenode.
If you watch the VERSION files when you do a format, you'll see (at least I do) that the namenode gets assigned a new namespaceID, but the datanode does not.
A quick solution is to delete the datanode's VERSION file before formatting.
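For concreteness, here is what that looks like on disk. The data directory below is the one from the error message, and current/VERSION is where a datanode keeps its namespaceID; the sed line is only needed if you would rather patch the ID than delete the file (BSD/macOS sed shown, since the path suggests a Mac; on Linux drop the empty quotes after -i).

    # Inspect the datanode's stored identity; namespaceID must match the namenode's
    cat /Users/jchen/Data/Hadoop/dfs/data/current/VERSION
    #   namespaceID=2049079249
    #   storageID=DS-...
    #   cTime=0
    #   layoutVersion=...

    # Option A (the quick fix above): remove the file before formatting,
    # so the datanode adopts the namenode's new namespaceID on next start
    rm /Users/jchen/Data/Hadoop/dfs/data/current/VERSION

    # Option B: patch the ID in place to the value the namenode reports
    sed -i '' 's/^namespaceID=.*/namespaceID=773619367/' \
        /Users/jchen/Data/Hadoop/dfs/data/current/VERSION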
[EDIT, 2010-08-13]
When I formatted my HDFS I also encountered this error. Apart from the datanode not starting, the jobtracker would not start either. For the datanode I manually changed the namespaceID; for the jobtracker, one has to create the /mapred/system directory (as the hdfs user) and change its owner to mapred. The jobtracker should then start running after the format.
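A sketch of that jobtracker fix, assuming the usual split of service accounts (HDFS running as the hdfs user, MapReduce as mapred) and that mapred.system.dir points at /mapred/system:

    # Recreate the jobtracker's system directory in the freshly formatted HDFS
    sudo -u hdfs hadoop fs -mkdir /mapred/system
    # Hand it over to the user the jobtracker runs as
    sudo -u hdfs hadoop fs -chown mapred /mapred/system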
I got the following error: "Incompatible namespaceIDs in /home/hadoop/data/dn".
I have four datanodes in the cluster, and after running start-dfs.sh only one datanode would come up. The solution was to stop the services on the namenode and jobtracker, remove the datanode configuration from hdfs-site.xml on all datanodes, delete the datanode directory (/home/hadoop/data/dn), and format the namenode. Then add the datanode properties back into hdfs-site.xml on all datanodes and format the namenode once again. Try starting the services now; all the datanodes should come up.
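The core of that recovery, sketched with the classic control scripts; the path is the one from the error above, and note that hadoop namenode -format destroys all HDFS metadata, so this only makes sense on a cluster whose data is expendable:

    # On the master: stop MapReduce and HDFS
    stop-mapred.sh
    stop-dfs.sh

    # On every datanode: wipe the stale storage directory (and its old namespaceID)
    rm -rf /home/hadoop/data/dn

    # On the master: reformat, which mints a fresh namespaceID
    hadoop namenode -format

    # Bring everything back; each datanode adopts the new namespaceID on first start
    start-dfs.sh
    start-mapred.sh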
The namenode generates a new namespaceID every time you format HDFS. I think this is possibly to differentiate the current version from previous versions. You can always roll back to a previous version if something goes wrong, which might not be possible if the namespaceID were not unique for every formatted instance.
The namespaceID also connects the namenode and datanodes: datanodes bind themselves to the namenode through the namespaceID.
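So a quick sanity check after any format is to compare the two VERSION files. The name-directory path here is a guess modelled on the data directory from the error above; adjust both to your dfs.name.dir and dfs.data.dir settings.

    # The namespaceID lines must agree, or the datanode refuses to start
    grep namespaceID /Users/jchen/Data/Hadoop/dfs/name/current/VERSION \
                     /Users/jchen/Data/Hadoop/dfs/data/current/VERSION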
This problem, and a fix for it, is well explained in the following fine guide