Hadoop datanode fails to start throwing org.apache.hadoop.hdfs.server.common.Storage: Cannot lock storage

I am having trouble starting a datanode in Hadoop. From the log I can see that the datanode is started twice (partial log below):

2012-05-22 16:25:00,369 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = master/192.168.0.1
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 1.0.1
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1243785; compiled by 'hortonfo' on Tue Feb 14 08:15:38 UTC 2012
************************************************************/
2012-05-22 16:25:00,375 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting DataNode
STARTUP_MSG:   host = master/192.168.0.1
STARTUP_MSG:   args = []
STARTUP_MSG:   version = 1.0.1
STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1243785; compiled by 'hortonfo' on Tue Feb 14 08:15:38 UTC 2012
************************************************************/
2012-05-22 16:25:00,490 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2012-05-22 16:25:00,500 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2012-05-22 16:25:00,500 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2012-05-22 16:25:00,500 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2012-05-22 16:25:00,512 INFO org.apache.hadoop.metrics2.impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2012-05-22 16:25:00,523 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source MetricsSystem,sub=Stats registered.
2012-05-22 16:25:00,523 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Scheduled snapshot period at 10 second(s).
2012-05-22 16:25:00,524 INFO org.apache.hadoop.metrics2.impl.MetricsSystemImpl: DataNode metrics system started
2012-05-22 16:25:00,722 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2012-05-22 16:25:00,724 INFO org.apache.hadoop.metrics2.impl.MetricsSourceAdapter: MBean for source ugi registered.
2012-05-22 16:25:00,727 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2012-05-22 16:25:00,729 WARN org.apache.hadoop.metrics2.impl.MetricsSystemImpl: Source name ugi already exists!
2012-05-22 16:20:15,894 INFO org.apache.hadoop.hdfs.server.common.Storage: Cannot lock storage /app/hadoop/tmp/dfs/data. The directory is already locked.
2012-05-22 16:20:16,008 ERROR org.apache.hadoop.hdfs.server.datanode.DataNode: java.io.IOException: Cannot lock storage /app/hadoop/tmp/dfs/data. The directory is already locked.
        at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:602)
        at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:455)
        at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:111)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.startDataNode(DataNode.java:385)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.<init>(DataNode.java:299)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.makeInstance(DataNode.java:1582)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.instantiateDataNode(DataNode.java:1521)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.createDataNode(DataNode.java:1539)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.secureMain(DataNode.java:1665)
        at org.apache.hadoop.hdfs.server.datanode.DataNode.main(DataNode.java:1682)

I searched online and found this question, but I am not overriding anything in conf/hdfs-site.xml; as shown below, Hadoop should therefore be using the default values (as described here), which should not cause any lock failure. This is my conf/hdfs-site.xml:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
    <description>Default block replication.
    The actual number of replications can be specified when the file is created.
    The default is used if replication is not specified in create time.
    </description>
  </property>
</configuration>

And this is my conf/core-site.xml:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/app/hadoop/tmp</value>
    <description>A base for other temporary directories.</description>
  </property>

  <property>
    <name>fs.default.name</name>
    <value>hdfs://master:54310</value>
    <description>The name of the default file system.  A URI whose
    scheme and authority determine the FileSystem implementation.  The
    uri's scheme determines the config property (fs.SCHEME.impl) naming
    the FileSystem implementation class.  The uri's authority is used to
    determine the host, port, etc. for a filesystem.</description>
  </property>
</configuration>

This is the content of hadoop/conf/slaves:

master
slave
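
Note that master appears in conf/slaves, so start-dfs.sh also launches a datanode on the master node; if a datanode process is already running there (which would match the double STARTUP_MSG in the log above), the second instance finds the storage directory locked. A quick way to check for an already-running datanode before restarting:

# jps lists running Java processes; a live DataNode will show up here.
jps | grep DataNode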

Answer 1:

  • Stop the datanode
  • Delete the in_use.lock file from the dfs data directory
  • Start the datanode again

It should then work just fine.
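
For Hadoop 1.x, a minimal sketch of those steps; the data directory path here follows the hadoop.tmp.dir value from the core-site.xml above:

# Stop the datanode, remove the stale lock file, then start it again.
bin/hadoop-daemon.sh stop datanode
rm /app/hadoop/tmp/dfs/data/in_use.lock
bin/hadoop-daemon.sh start datanode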



Answer 2:

Also, add the following two properties to your hdfs-site.xml file:

<property>
  <name>dfs.name.dir</name>
  <value>/some_path</value>
</property>

<property>
  <name>dfs.data.dir</name>
  <value>/some_path</value>
</property>

Their default location is under /tmp, which is why you lose the data on every restart.
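
If you move these directories off /tmp, they must exist and be writable by the user that runs Hadoop before you restart the daemons. A sketch, assuming hypothetical paths under /data/hdfs and a Hadoop user named hduser in group hadoop:

# Create persistent storage directories and hand them to the Hadoop user.
sudo mkdir -p /data/hdfs/name /data/hdfs/data
sudo chown -R hduser:hadoop /data/hdfs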



Answer 3:

I faced a similar problem, but after reading up on it I realized that dfs.name.dir and dfs.data.dir should be different from each other. I had them both set to the same path, and changing them to distinct values fixed my problem.
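
A sketch of how the two properties might look with distinct paths (the values below are illustrative, not from the question):

<!-- namenode metadata and datanode blocks in separate directories -->
<property>
  <name>dfs.name.dir</name>
  <value>/app/hadoop/dfs/name</value>
</property>

<property>
  <name>dfs.data.dir</name>
  <value>/app/hadoop/dfs/data</value>
</property>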



Answer 4:

I had the same problem and fixed it by setting umask to 0022 in the shell where I was running the tests.
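
A sketch of that fix; the umask line could equally go in ~/.bashrc or conf/hadoop-env.sh so it applies every time the daemons start:

# With umask 0022 new directories are created 755 (rwxr-xr-x),
# which matches the permissions the datanode expects on its storage dirs.
umask 0022
bin/start-dfs.sh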



Answer 5:

sudo rm -r /tmp/hadoop-[YourNameHere]/dfs/data/

Then restart DFS.
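
Note that /tmp/hadoop-[YourNameHere] is the default location when hadoop.tmp.dir is not set; with the core-site.xml from the question, the equivalent directory would be /app/hadoop/tmp/dfs/data. To restart DFS with the Hadoop 1.x scripts:

# Stop and start all HDFS daemons (namenode, datanodes, secondary namenode).
bin/stop-dfs.sh
bin/start-dfs.sh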



Source: Hadoop datanode fails to start throwing org.apache.hadoop.hdfs.server.common.Storage: Cannot lock storage