标签:style blog http java color strong
hadoop2分布式安装后总是报这个bug
2014-07-06 08:22:40,506 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool <registering> (Datanode Uuid unassigned) service to hadoop3/192.168.56.203:9000 java.io.IOException: Cluster IDs not matched: dn cid=CID-dfdd94b3-ce12-4465-ab52-6a02b7e7dba1 but ns cid=CID-48bee819-05bb-4ad4-9f25-1aa903fbfcb7; bpid=BP-500128720-192.168.56.202-1404563273292 at org.apache.hadoop.hdfs.server.datanode.DataNode.setClusterId(DataNode.java:291) at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:896) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:274) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:220) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:815) at java.lang.Thread.run(Thread.java:662) 2014-07-06 08:22:40,507 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool <registering> (Datanode Uuid unassigned) service to hadoop2/192.168.56.202:9000 java.io.IOException: Cluster IDs not matched: dn cid=CID-dfdd94b3-ce12-4465-ab52-6a02b7e7dba1 but ns cid=CID-48bee819-05bb-4ad4-9f25-1aa903fbfcb7; bpid=BP-500128720-192.168.56.202-1404563273292 at org.apache.hadoop.hdfs.server.datanode.DataNode.setClusterId(DataNode.java:291) at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:896) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:274) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:220) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:815) at java.lang.Thread.run(Thread.java:662) 2014-07-06 08:22:40,517 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool <registering> (Datanode Uuid unassigned) service to hadoop2/192.168.56.202:9000 2014-07-06 08:22:40,518 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool <registering> (Datanode Uuid unassigned) service to hadoop3/192.168.56.203:9000 2014-07-06 08:22:40,518 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool ID needed, but service not yet registered with NN java.lang.Exception: trace at org.apache.hadoop.hdfs.server.datanode.BPOfferService.getBlockPoolId(BPOfferService.java:143) at org.apache.hadoop.hdfs.server.datanode.BlockPoolManager.remove(BlockPoolManager.java:91) at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdownBlockPool(DataNode.java:859) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.shutdownActor(BPOfferService.java:350) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.cleanUp(BPServiceActor.java:619) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:837) at java.lang.Thread.run(Thread.java:662) 2014-07-06 08:22:40,518 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool <registering> (Datanode Uuid unassigned) 2014-07-06 08:22:40,518 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool ID needed, but service not yet registered with NN java.lang.Exception: trace at org.apache.hadoop.hdfs.server.datanode.BPOfferService.getBlockPoolId(BPOfferService.java:143) at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdownBlockPool(DataNode.java:861) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.shutdownActor(BPOfferService.java:350) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.cleanUp(BPServiceActor.java:619) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:837) at java.lang.Thread.run(Thread.java:662) 2014-07-06 08:22:40,529 INFO org.apache.hadoop.hdfs.server.common.Storage: Data-node version: -55 and name-node layout version: -56 2014-07-06 08:22:40,632 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /usr/local/hadoop/tmp/dfs/data/in_use.lock acquired by nodename 2624@hadoop2 2014-07-06 08:22:40,680 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool <registering> (Datanode Uuid unassigned) service to hadoop0/192.168.56.200:9000 java.io.IOException: Incompatible clusterIDs in /usr/local/hadoop/tmp/dfs/data: namenode clusterID = CID-dfdd94b3-ce12-4465-ab52-6a02b7e7dba1; datanode clusterID = CID-48bee819-05bb-4ad4-9f25-1aa903fbfcb7 at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:472) at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:225) at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:249) at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:929) at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:900) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:274) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:220) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:815) at java.lang.Thread.run(Thread.java:662) 2014-07-06 08:22:40,681 INFO org.apache.hadoop.hdfs.server.common.Storage: Data-node version: -55 and name-node layout version: -56 2014-07-06 08:22:40,689 ERROR org.apache.hadoop.hdfs.server.common.Storage: It appears that another namenode 2624@hadoop2 has already locked the storage directory 2014-07-06 08:22:40,691 INFO org.apache.hadoop.hdfs.server.common.Storage: Cannot lock storage /usr/local/hadoop/tmp/dfs/data. The directory is already locked 2014-07-06 08:22:40,691 WARN org.apache.hadoop.hdfs.server.common.Storage: Ignoring storage directory /usr/local/hadoop/tmp/dfs/data due to an exception java.io.IOException: Cannot lock storage /usr/local/hadoop/tmp/dfs/data. The directory is already locked at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:674) at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:493) at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:186) at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:249) at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:929) at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:900) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:274) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:220) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:815) at java.lang.Thread.run(Thread.java:662) 2014-07-06 08:22:40,691 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool <registering> (Datanode Uuid unassigned) service to hadoop0/192.168.56.200:9000 2014-07-06 08:22:40,691 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool <registering> (Datanode Uuid unassigned) service to hadoop1/192.168.56.201:9000 java.io.IOException: All specified directories are not accessible or do not exist. at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:217) at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:249) at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:929) at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:900) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:274) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:220) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:815) at java.lang.Thread.run(Thread.java:662) 2014-07-06 08:22:40,692 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool <registering> (Datanode Uuid unassigned) service to hadoop1/192.168.56.201:9000 2014-07-06 08:22:40,818 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool ID needed, but service not yet registered with NN java.lang.Exception: trace at org.apache.hadoop.hdfs.server.datanode.BPOfferService.getBlockPoolId(BPOfferService.java:143) at org.apache.hadoop.hdfs.server.datanode.BlockPoolManager.remove(BlockPoolManager.java:91) at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdownBlockPool(DataNode.java:859) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.shutdownActor(BPOfferService.java:350) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.cleanUp(BPServiceActor.java:619) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:837) at java.lang.Thread.run(Thread.java:662) 2014-07-06 08:22:40,818 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool <registering> (Datanode Uuid unassigned) 2014-07-06 08:22:40,818 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool ID needed, but service not yet registered with NN java.lang.Exception: trace at org.apache.hadoop.hdfs.server.datanode.BPOfferService.getBlockPoolId(BPOfferService.java:143) at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdownBlockPool(DataNode.java:861) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.shutdownActor(BPOfferService.java:350) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.cleanUp(BPServiceActor.java:619) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:837) at java.lang.Thread.run(Thread.java:662) 2014-07-06 08:22:42,820 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode 2014-07-06 08:22:42,830 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 0 2014-07-06 08:22:42,843 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG:
解决方法:
dataNode 无法启动是配置过程中最常见的问题,主要原因是多次format namenode 造成namenode 和datanode的clusterID不一致。建议查看datanode上面的log信息。解决办法:修改每一个datanode上面的CID(位于dfs/data/current/VERSION文件夹中)使两者一致。
另:
1. clusterID不一致,namenode的cid和datanode的cid不一致,导致的原因是对namenode进行format的之后,datanode不会进行format,所以datanode里面的cid还是和format之前namenode的cid一样,解决办法是删除datanode里面的dfs.datanode.data.dir目录和tmp目录,然后再启动start-dfs.sh
2.即使删除iptables之后,仍然报Datanode denied communication with namenode: DatanodeRegistration错误,参考文章http://stackoverflow.com/questions/17082789/cdh4-3exception-from-the-logs-after-start-dfs-sh-datanode-and-namenode-star,可以知道需要把集群里面每个houst对应的ip写入/etc/hosts文件就能解决问题。
20140709 datanode bug,布布扣,bubuko.com
标签:style blog http java color strong
原文地址:http://www.cnblogs.com/muzhongjiang/p/3833477.html