Installing and Using Hadoop

Date: 2014-11-09 18:02:11

1. Installation

1.1. Download hadoop-2.5.1.tar.gz

1.2. Extract it into the installation directory

tar -zxvf hadoop-2.5.1.tar.gz -C ../soft/
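
As a rough sketch of this step, the helper below wraps the same `tar` call and fails early when the archive is missing. The function name `extract_into` is hypothetical and not part of Hadoop.

```shell
# extract_into ARCHIVE TARGET_DIR  (hypothetical helper, not part of Hadoop)
# Unpacks a .tar.gz archive into TARGET_DIR, creating the directory if needed.
extract_into() {
    archive=$1
    target=$2
    if [ ! -f "$archive" ]; then
        echo "archive not found: $archive" >&2
        return 1
    fi
    mkdir -p "$target" || return 1
    tar -zxf "$archive" -C "$target"
}

# Usage (paths taken from the step above):
#   extract_into hadoop-2.5.1.tar.gz ../soft/
```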

1.3. Configure the environment and the Hadoop configuration files

vim .bashrc

## Add the Java settings
export JAVA_HOME=/usr/xuelu/java
export PATH=$PATH:$JAVA_HOME/bin

vim .bash_profile

# .bash_profile

# Get the aliases and functions
if [ -f ~/.bashrc ]; then
        . ~/.bashrc
fi

# User specific environment and startup programs

PATH=$PATH:$HOME/bin

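# Set the Hadoop environment variable
# (assumes the extracted hadoop-2.5.1 directory was renamed to hadoop251)
export HADOOP_HOME=/home/xuelu/soft/hadoop251
# Set the Maven and ZooKeeper environment variables
export MAVEN_HOME=/usr/xuelu/maven
export ZOOKEEPER_HOME=/home/xuelu/soft/zoo346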
PATH=$PATH:$HADOOP_HOME/bin:$MAVEN_HOME/bin:$ZOOKEEPER_HOME/bin
export PATH

Run source .bash_profile so that the changes above take effect.
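
Before moving on, it can help to confirm that each variable is set and points at a real directory. The following is a minimal sketch; `check_home` is a hypothetical helper name, and the variable names are the ones exported above.

```shell
# check_home NAME  (hypothetical helper)
# Succeeds when the variable called NAME is set and points at an existing
# directory; prints NAME=value on success and a message on stderr otherwise.
check_home() {
    name=$1
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
        echo "$name is not set" >&2
        return 1
    fi
    if [ ! -d "$value" ]; then
        echo "$name points to a missing directory: $value" >&2
        return 1
    fi
    echo "$name=$value"
}

# Usage:
#   for v in JAVA_HOME HADOOP_HOME MAVEN_HOME ZOOKEEPER_HOME; do
#       check_home "$v"
#   done
```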

Edit the configuration files that ship with Hadoop:

etc/hadoop/core-site.xml:

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>

etc/hadoop/hdfs-site.xml:

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>
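
If you prefer to script these two edits rather than make them by hand, a sketch along the following lines may work. `write_site_file` is a hypothetical helper; it overwrites the target file, so it only suits a fresh installation where the site files hold no other properties.

```shell
# write_site_file FILE NAME VALUE  (hypothetical helper)
# Writes a Hadoop site file containing a single property.
# Caution: FILE is overwritten, and NAME/VALUE are not XML-escaped.
write_site_file() {
    cat > "$1" <<EOF
<?xml version="1.0"?>
<configuration>
    <property>
        <name>$2</name>
        <value>$3</value>
    </property>
</configuration>
EOF
}

# Usage, run from the Hadoop installation directory:
#   write_site_file etc/hadoop/core-site.xml fs.defaultFS hdfs://localhost:9000
#   write_site_file etc/hadoop/hdfs-site.xml dfs.replication 1
```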

Set up passphraseless ssh

Now check that you can ssh to the localhost without a passphrase:

  $ ssh localhost

If you cannot ssh to localhost without a passphrase, execute the following commands:

  $ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
  $ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
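
If ssh localhost still prompts for a password after these commands, overly permissive file modes are one common cause: with its default StrictModes setting, sshd ignores key files that other users could modify. The sketch below tightens the modes; `fix_ssh_perms` and its optional directory argument are hypothetical.

```shell
# fix_ssh_perms [DIR]  (hypothetical helper; DIR defaults to ~/.ssh)
# Restricts the ssh directory to its owner and, if present, does the same
# for the authorized_keys file inside it.
fix_ssh_perms() {
    dir=${1:-$HOME/.ssh}
    mkdir -p "$dir" || return 1
    chmod 700 "$dir" || return 1
    if [ -f "$dir/authorized_keys" ]; then
        chmod 600 "$dir/authorized_keys" || return 1
    fi
}

# Usage:
#   fix_ssh_perms
#   ssh localhost
```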

The commands for running Hadoop are as follows:

# Format the filesystem:

   $ bin/hdfs namenode -format

# Start the NameNode daemon and DataNode daemon:

      $ sbin/start-dfs.sh
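
One way to confirm that the daemons came up is to look at the output of `jps`, the JVM process lister that ships with the JDK. The sketch below reads that output and names any expected daemon that is absent; `missing_daemons` is a hypothetical helper, and the list of daemons to expect is an assumption for a single-node setup.

```shell
# missing_daemons NAME...  (hypothetical helper)
# Reads jps-style output ("<pid> <name>" per line) on stdin and prints each
# expected daemon name that does not appear. Returns non-zero if any is missing.
missing_daemons() {
    listing=$(cat)
    missing=0
    for daemon in "$@"; do
        # Compare the whole second field so NameNode does not match
        # SecondaryNameNode.
        if ! printf '%s\n' "$listing" |
            awk -v d="$daemon" '$2 == d { ok = 1 } END { exit !ok }'; then
            echo "$daemon"
            missing=1
        fi
    done
    return $missing
}

# Usage after sbin/start-dfs.sh:
#   jps | missing_daemons NameNode DataNode SecondaryNameNode
# The same check can be reused for YARN:
#   jps | missing_daemons ResourceManager NodeManager
```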

# The Hadoop daemon log output is written to the $HADOOP_LOG_DIR directory (defaults to $HADOOP_HOME/logs).
    Browse the web interface for the NameNode; by default it is available at:
        NameNode - http://localhost:50070/
    Make the HDFS directories required to execute MapReduce jobs:

      $ bin/hdfs dfs -mkdir /user
      $ bin/hdfs dfs -mkdir /user/<username>

    Copy the input files into the distributed filesystem:

      $ bin/hdfs dfs -put etc/hadoop input

    Run some of the examples provided:

      $ bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.1.jar grep input output 'dfs[a-z.]+'

    Examine the output files:

    Copy the output files from the distributed filesystem to the local filesystem and examine them:

      $ bin/hdfs dfs -get output output
      $ cat output/*

    Or view the output files on the distributed filesystem:

      $ bin/hdfs dfs -cat output/*
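
For intuition about what the grep example computes: it counts how often each string matching the regular expression occurs in the input. A rough local equivalent with standard tools, run against local files rather than HDFS, might look like the sketch below. `count_matches` is a hypothetical helper, and the ordering among equal counts may differ from the job's output.

```shell
# count_matches REGEX FILE...  (hypothetical helper)
# Prints "<count><TAB><match>" for every distinct string matching REGEX,
# most frequent first.
count_matches() {
    regex=$1
    shift
    grep -ohE -e "$regex" "$@" |
        sort | uniq -c | sort -rn |
        awk '{ print $1 "\t" $2 }'
}

# Usage, run from the Hadoop installation directory:
#   count_matches 'dfs[a-z.]+' etc/hadoop/*
```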

    When you're done, stop the daemons with:

      $ sbin/stop-dfs.sh

YARN on Single Node

You can run a MapReduce job on YARN in a pseudo-distributed mode by setting a few parameters and running ResourceManager daemon and NodeManager daemon in addition.

The following instructions assume that the steps above, from formatting the filesystem through creating the HDFS directories, have already been executed.

Configure parameters as follows:

etc/hadoop/mapred-site.xml:

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

etc/hadoop/yarn-site.xml:

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
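
To double-check that an edit landed in the right file, you can read a property back out. The sketch below does this with awk and assumes the simple layout shown above, with each `<name>` appearing before its `<value>`; it is not a general XML parser, and `get_property` is a hypothetical helper.

```shell
# get_property FILE NAME  (hypothetical helper)
# Prints the value of property NAME from a Hadoop site file.
# Returns non-zero when the property is not found.
get_property() {
    awk -v want="$2" '
        /<name>/  { n = $0; sub(/.*<name>/, "", n);  sub(/<\/name>.*/, "", n) }
        /<value>/ { v = $0; sub(/.*<value>/, "", v); sub(/<\/value>.*/, "", v)
                    if (n == want) { print v; found = 1; exit } }
        END { exit found ? 0 : 1 }
    ' "$1"
}

# Usage, run from the Hadoop installation directory:
#   get_property etc/hadoop/mapred-site.xml mapreduce.framework.name
#   get_property etc/hadoop/yarn-site.xml yarn.nodemanager.aux-services
```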

Start ResourceManager daemon and NodeManager daemon:
```
  $ sbin/start-yarn.sh
```
Browse the web interface for the ResourceManager; by default it is available at:
- ResourceManager - http://localhost:8088/
Run a MapReduce job.
When you're done, stop the daemons with:
```
  $ sbin/stop-yarn.sh
```

Original article: http://www.cnblogs.com/xuelu/p/4085573.html
