
"OD Big Data in Practice": Setting Up a Hadoop Pseudo-Distributed Environment

Date: 2016-08-01 17:23:08


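I. Install and configure Linux

8. As root, create the working folders, give every folder and file under /opt/ 775 permissions, and change the owner and group to the regular user (beifeng)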

 

mkdir -p /opt/modules
mkdir -p /opt/software
mkdir -p /opt/datas
mkdir -p /opt/tools
chmod 775 /opt/*
chown beifeng:beifeng /opt/*

The end result looks like this:

[beifeng@beifeng-hadoop-02 opt]$ pwd
/opt
[beifeng@beifeng-hadoop-02 opt]$ ll
total 20
drwxrwxr-x.  5 beifeng beifeng 4096 Jul 30 00:13 clusterapps
drwxr-xr-x. 11 beifeng beifeng 4096 Jul 21 23:30 datas
drwxr-xr-x.  6 beifeng beifeng 4096 Jul 31 22:03 modules
drwxr-xr-x.  2 beifeng beifeng 4096 Jul 30 18:17 software
drwxr-xr-x.  2 beifeng beifeng 4096 Jul 10 20:26 tools
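
The directory setup above can be wrapped in a small script that also prints the resulting modes. This is only a minimal sketch, not the author's script: `OPT_ROOT` is a hypothetical variable that defaults to a scratch directory so the snippet can be tried without root, and the `chown` step is skipped unless the script runs as root and the beifeng account exists.

```shell
#!/usr/bin/env bash
# OPT_ROOT is a hypothetical knob: it defaults to a scratch directory;
# set OPT_ROOT=/opt for the real setup.
OPT_ROOT="${OPT_ROOT:-$(mktemp -d)}"
OWNER="beifeng"

for d in modules software datas tools; do
    mkdir -p "$OPT_ROOT/$d"
    chmod 775 "$OPT_ROOT/$d"
done

# Changing ownership needs root and an existing account, so guard it.
if [ "$(id -u)" -eq 0 ] && id "$OWNER" >/dev/null 2>&1; then
    chown "$OWNER:$OWNER" "$OPT_ROOT"/*
fi

# Show the modes so they can be compared with the listing above.
ls -ld "$OPT_ROOT"/*
```

Run it as `OPT_ROOT=/opt bash script.sh` from a root shell to act on the real directories.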

 

 

II. Install and configure the JDK

1. Installation file

jdk-7u67-linux-x64.tar.gz

2. Extract

tar -zxvf jdk-7u67-linux-x64.tar.gz -C /opt/modules

3. Configure the JDK

1) Edit /etc/profile with sudo and append the following to the end of the file

#JAVA_HOME
export JAVA_HOME=/opt/modules/jdk1.7.0_67
export PATH=$PATH:$JAVA_HOME/bin

2) After saving, switch to the root user with su - root and run source so the configuration takes effect in the current shell.

source /etc/profile

3) Verify that the JDK was installed successfully

[root@beifeng-hadoop-02 ~]# java -version
java version "1.7.0_67"
Java(TM) SE Runtime Environment (build 1.7.0_67-b01)
Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)
[root@beifeng-hadoop-02 ~]# javac -version
javac 1.7.0_67
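
The version check above can also be scripted. The sketch below pulls the version string out of `java -version` style output; `java_version_of` is a hypothetical helper name, and the pattern only matches the `java version "..."` line format shown in the transcript, so other JDK builds may print a first line it does not recognize.

```shell
#!/usr/bin/env bash
# java_version_of TEXT: print the quoted version from a `java version "..."` line.
java_version_of() {
    printf '%s\n' "$1" | sed -n 's/^java version "\(.*\)"$/\1/p'
}

# Sample output copied from the transcript above.
sample='java version "1.7.0_67"
Java(TM) SE Runtime Environment (build 1.7.0_67-b01)'

echo "sample: $(java_version_of "$sample")"

# On a real machine: `java -version` writes to stderr, hence 2>&1.
if command -v java >/dev/null 2>&1; then
    echo "installed: $(java_version_of "$(java -version 2>&1)")"
fi
```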

III. Install and configure Hadoop

1. Installation file

Download location: http://archive.cloudera.com/cdh5/cdh/5/

Download: hadoop-2.5.0-cdh5.3.6.tar.gz

2. Extract

# tar -C requires the target directory to exist
mkdir -p /opt/modules/cdh
tar -zxvf hadoop-2.5.0-cdh5.3.6.tar.gz -C /opt/modules/cdh/

3. Configure the pseudo-distributed environment

Reference: http://hadoop.apache.org/docs/r2.5.2/hadoop-project-dist/hadoop-common/ClusterSetup.html

cd /opt/modules/cdh/hadoop-2.5.0-cdh5.3.6/etc/hadoop

Edit /etc/profile and append the following to the end of the file:

#HADOOP_HOME
export HADOOP_HOME=/opt/modules/cdh/hadoop-2.5.0-cdh5.3.6
export PATH=$PATH:$HADOOP_HOME/bin
export PATH=$PATH:$HADOOP_HOME/sbin
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
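
Appending to /etc/profile by hand makes it easy to add the same block twice. One possible way to guard against that is sketched below; `append_once` and `PROFILE` are hypothetical names, and `PROFILE` defaults to a scratch file here so nothing on the system is modified. Only a shortened version of the block above is appended, to keep the example small.

```shell
#!/usr/bin/env bash
# append_once FILE MARKER: append MARKER plus stdin to FILE,
# unless MARKER already appears in FILE as a whole line.
append_once() {
    local file=$1 marker=$2
    if grep -qxF -- "$marker" "$file" 2>/dev/null; then
        cat >/dev/null    # block already present: discard stdin
        return 0
    fi
    { printf '\n%s\n' "$marker"; cat; } >> "$file"
}

# PROFILE is hypothetical: a scratch file here, /etc/profile (as root) for real use.
PROFILE="${PROFILE:-$(mktemp)}"

# Run twice on purpose to show that the second call is a no-op.
for run in 1 2; do
    append_once "$PROFILE" '#HADOOP_HOME' <<'EOF'
export HADOOP_HOME=/opt/modules/cdh/hadoop-2.5.0-cdh5.3.6
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
EOF
done

# Counts the matching lines; stays at 1 however often the loop runs.
grep -c 'export HADOOP_HOME=' "$PROFILE"
```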

A remote SFTP-capable editor is recommended for the files below: Notepad++ on Windows, or skEdit on macOS.

1) Edit hadoop-env.sh

export JAVA_HOME=/opt/modules/jdk1.7.0_67

2) Edit yarn-env.sh

export JAVA_HOME=/opt/modules/jdk1.7.0_67

3) Edit mapred-env.sh

export JAVA_HOME=/opt/modules/jdk1.7.0_67
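
The same JAVA_HOME line goes into all three env scripts, so the edit can be scripted. The sketch below is one way to do it under stated assumptions: `set_java_home` is a hypothetical helper, and the demo runs against stand-in files in a scratch directory rather than the real ones. On a real install the function would be called on the files under $HADOOP_HOME/etc/hadoop.

```shell
#!/usr/bin/env bash
# set_java_home FILE VALUE: rewrite an active `export JAVA_HOME=` line,
# or append one if the file has none (commented-out lines are left alone).
set_java_home() {
    local file=$1 value=$2 tmp
    if grep -q '^[[:space:]]*export JAVA_HOME=' "$file"; then
        tmp=$(mktemp)
        sed "s|^[[:space:]]*export JAVA_HOME=.*|export JAVA_HOME=$value|" "$file" > "$tmp"
        cat "$tmp" > "$file"    # overwrite in place so the file keeps its permissions
        rm -f "$tmp"
    else
        printf 'export JAVA_HOME=%s\n' "$value" >> "$file"
    fi
}

# Scratch directory with stand-in files covering three starting states.
CONF_DIR=$(mktemp -d)
printf 'export JAVA_HOME=${JAVA_HOME}\n' > "$CONF_DIR/hadoop-env.sh"   # active line
printf '# export JAVA_HOME=/some/jdk\n'  > "$CONF_DIR/yarn-env.sh"     # commented line
: > "$CONF_DIR/mapred-env.sh"                                          # empty file

for f in hadoop-env.sh yarn-env.sh mapred-env.sh; do
    set_java_home "$CONF_DIR/$f" /opt/modules/jdk1.7.0_67
done

grep -H '^export JAVA_HOME=' "$CONF_DIR"/*.sh
```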

4) Edit core-site.xml

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://beifeng-hadoop-02:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/modules/cdh/hadoop-2.5.0-cdh5.3.6/data/tmp</value>
    </property>
    <property>
        <name>hadoop.http.staticuser.user</name>
        <value>beifeng</value>
    </property>
</configuration>
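
After editing a site file it helps to read a property back and confirm the value landed where intended. The sketch below does that with awk; `get_prop` is a hypothetical helper that assumes each `<name>` and `<value>` sits on its own line, as in the file above, and it is not a general XML parser. On a working install, `hdfs getconf -confKey fs.defaultFS` reports the value Hadoop itself resolves.

```shell
#!/usr/bin/env bash
# get_prop FILE NAME: print the <value> that follows the matching <name>.
# Assumes one tag pair per line; this is not a general XML parser.
get_prop() {
    awk -v want="$2" '
        /<name>/  { n = $0; sub(/.*<name>/, "", n);  sub(/<\/name>.*/, "", n) }
        /<value>/ { v = $0; sub(/.*<value>/, "", v); sub(/<\/value>.*/, "", v)
                    if (n == want) { print v; exit } }
    ' "$1"
}

# Stand-in copy of part of the core-site.xml shown above.
SITE=$(mktemp)
cat > "$SITE" <<'EOF'
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://beifeng-hadoop-02:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/modules/cdh/hadoop-2.5.0-cdh5.3.6/data/tmp</value>
    </property>
</configuration>
EOF

get_prop "$SITE" fs.defaultFS
get_prop "$SITE" hadoop.tmp.dir
```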

5) Edit hdfs-site.xml

<configuration>

        <!-- Number of block replicas. It should not exceed the number of DataNodes; with the single DataNode used here it is 1 -->
        <property>
                <name>dfs.replication</name>
                <value>1</value>
        </property>

        <property>
                <name>dfs.namenode.secondary.http-address</name>
                <value>beifeng-hadoop-02:50090</value>
        </property>

        <property>
                <name>dfs.permissions.enabled</name>
                <value>false</value>
        </property>
        
</configuration>

6) Edit slaves

beifeng-hadoop-02
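
The slaves file and the site files all refer to the machine by hostname, so that name has to resolve. The sketch below checks a hosts file for it; `host_listed` is a hypothetical helper, and the demo uses a stand-in file with the documentation address 192.0.2.10 rather than the author's real IP. On the real machine the second argument would be /etc/hosts, or `getent hosts beifeng-hadoop-02` can be used instead.

```shell
#!/usr/bin/env bash
# host_listed NAME FILE: succeed if NAME appears as a hostname or alias
# on a non-comment line of a hosts-format FILE.
host_listed() {
    awk -v name="$1" '
        /^[[:space:]]*#/ { next }
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break          # rest of the line is a comment
                if ($i == name) found = 1
            }
        }
        END { exit (found ? 0 : 1) }
    ' "$2"
}

# Stand-in hosts file; the address is a placeholder, not the real one.
HOSTS=$(mktemp)
printf '127.0.0.1 localhost\n192.0.2.10 beifeng-hadoop-02\n' > "$HOSTS"

if host_listed beifeng-hadoop-02 "$HOSTS"; then
    echo "beifeng-hadoop-02 is mapped"
else
    echo "beifeng-hadoop-02 is missing" >&2
fi
```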

7) Edit yarn-site.xml

<configuration>

<!-- Site specific YARN configuration properties -->
        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
        </property>

        <property>
                <name>yarn.resourcemanager.hostname</name>
                <value>beifeng-hadoop-02</value>
        </property>

        <!-- Whether to enable log aggregation -->
        <property>
                <name>yarn.log-aggregation-enable</name>
                <value>true</value>
        </property>

        <!-- How long aggregated logs are retained (in seconds) -->
        <property>
                <name>yarn.log-aggregation.retain-seconds</name>
                <value>106800</value>
        </property>
</configuration>
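
Retention values given in raw seconds are hard to read at a glance. The small helper below (hypothetical name `humanize_seconds`) breaks a value into days, hours, minutes and seconds, which makes it easy to see what the configured 106800 amounts to.

```shell
#!/usr/bin/env bash
# humanize_seconds N: print N seconds as days, hours, minutes and seconds.
humanize_seconds() {
    local s=$1
    printf '%dd %dh %dm %ds\n' \
        $((s / 86400)) $((s % 86400 / 3600)) $((s % 3600 / 60)) $((s % 60))
}

humanize_seconds 106800   # the value configured above: 1d 5h 40m 0s
humanize_seconds 604800   # one week, for comparison: 7d 0h 0m 0s
```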

8) Edit mapred-site.xml

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
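
Hadoop 2.x tarballs typically ship only mapred-site.xml.template, so mapred-site.xml may need to be created before it can be edited. The sketch below, with the hypothetical helper `ensure_mapred_site`, copies the template only when the real file is missing. The demo runs against a scratch directory with a stand-in template; on a real install the argument would be the etc/hadoop directory.

```shell
#!/usr/bin/env bash
# ensure_mapred_site DIR: create DIR/mapred-site.xml from the template if it is missing.
ensure_mapred_site() {
    local dir=$1
    if [ -f "$dir/mapred-site.xml" ]; then
        echo "mapred-site.xml already present"
    elif [ -f "$dir/mapred-site.xml.template" ]; then
        cp "$dir/mapred-site.xml.template" "$dir/mapred-site.xml"
        echo "created mapred-site.xml from template"
    else
        echo "no mapred-site.xml or template in $dir" >&2
        return 1
    fi
}

# Scratch directory with a stand-in template.
DEMO_DIR=$(mktemp -d)
printf '<configuration>\n</configuration>\n' > "$DEMO_DIR/mapred-site.xml.template"

ensure_mapred_site "$DEMO_DIR"   # first call copies the template
ensure_mapred_site "$DEMO_DIR"   # second call leaves the file untouched
```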

9) Start the services

(1) Format HDFS

bin/hdfs namenode -format

(2) Start the NameNode and DataNode

sbin/hadoop-daemon.sh start namenode
sbin/hadoop-daemon.sh start datanode

Use the jps command or the web UI to check that the NameNode and DataNode have started successfully.

[beifeng@beifeng-hadoop-02 hadoop-2.5.0-cdh5.3.6]$ jps
82334 DataNode
82383 Jps
82248 NameNode

HDFS web UI: http://beifeng-hadoop-02:50070/dfshealth.html#tab-overview
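
Reading jps output by eye works, but the check can also be scripted. The sketch below uses a hypothetical helper, `missing_daemons`, that compares the second column of jps output against a list of expected names. It matches names exactly, because a plain substring search for NameNode would also be satisfied by a SecondaryNameNode line.

```shell
#!/usr/bin/env bash
# missing_daemons JPS_OUTPUT NAME...: print each NAME that has no exact
# match in the second column of JPS_OUTPUT.
missing_daemons() {
    local out=$1 name
    shift
    for name in "$@"; do
        printf '%s\n' "$out" \
            | awk -v n="$name" '$2 == n { ok = 1 } END { exit (ok ? 0 : 1) }' \
            || echo "$name"
    done
}

# Sample jps output copied from the transcript above.
sample='82334 DataNode
82383 Jps
82248 NameNode'

# Prints only the daemon that is not running yet: SecondaryNameNode
missing_daemons "$sample" NameNode DataNode SecondaryNameNode
```

On a live machine the first argument would be `"$(jps)"`.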

(3) Start the ResourceManager and NodeManager

sbin/yarn-daemon.sh start resourcemanager
sbin/yarn-daemon.sh start nodemanager

Use the jps command or the web UI to check that the ResourceManager and NodeManager have started successfully.

[beifeng@beifeng-hadoop-02 hadoop-2.5.0-cdh5.3.6]$ jps
82334 DataNode
82757 NodeManager
82874 Jps
82248 NameNode
82507 ResourceManager

YARN web UI: http://beifeng-hadoop-02:8088/cluster

(4) Start the job history server

sbin/mr-jobhistory-daemon.sh start historyserver

Check that it has started successfully:

Job history server web UI: http://beifeng-hadoop-02:19888/

(5) Start the SecondaryNameNode

sbin/hadoop-daemon.sh start secondarynamenode

Check that it has started successfully:

SecondaryNameNode web UI: http://beifeng-hadoop-02:50090/status.html

(6) Commands to stop all related services

sbin/hadoop-daemon.sh stop namenode
sbin/hadoop-daemon.sh stop datanode
sbin/yarn-daemon.sh stop resourcemanager
sbin/yarn-daemon.sh stop nodemanager
sbin/mr-jobhistory-daemon.sh stop historyserver
sbin/hadoop-daemon.sh stop secondarynamenode
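
Since the start and stop commands differ only in the action word, they can be driven from one list. The sketch below is one way to do that, not part of the original post: `hadoop_services` and `DRY_RUN` are hypothetical names, and `DRY_RUN` defaults to 1 so the function only prints the commands unless told otherwise.

```shell
#!/usr/bin/env bash
# hadoop_services ACTION: run each per-daemon script with ACTION (start or stop).
# With DRY_RUN=1 (the default here) the commands are printed instead of executed.
# Run from the Hadoop home directory, since the sbin/ paths are relative.
hadoop_services() {
    local action=$1 entry script daemon
    for entry in \
        "hadoop-daemon.sh namenode" \
        "hadoop-daemon.sh datanode" \
        "yarn-daemon.sh resourcemanager" \
        "yarn-daemon.sh nodemanager" \
        "mr-jobhistory-daemon.sh historyserver" \
        "hadoop-daemon.sh secondarynamenode"
    do
        script=${entry% *}     # text before the space: the script name
        daemon=${entry#* }     # text after the space: the daemon name
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "sbin/$script $action $daemon"
        else
            "sbin/$script" "$action" "$daemon" || return 1
        fi
    done
}

# Prints the six stop commands listed above, in the same order.
hadoop_services stop
```

Setting `DRY_RUN=0` before calling `hadoop_services start` would execute the scripts for real.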

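10) Run a wordcount job to verify the setup

File system shell reference: http://archive.cloudera.com/cdh5/cdh/5/hadoop-2.5.0-cdh5.3.6/hadoop-project-dist/hadoop-common/FileSystemShell.html

hdfs dfs -mkdir -p /user/beifeng/input

# The job needs at least one text file as input; any local text file works, core-site.xml is used here only as an example
hdfs dfs -put etc/hadoop/core-site.xml /user/beifeng/input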

hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0-cdh5.3.6.jar wordcount /user/beifeng/input /user/beifeng/output 

hdfs dfs -cat /user/beifeng/output/part-r-00000
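
To sanity-check the job's output, the same counts can be computed locally on a small file and compared with part-r-00000. The sketch below assumes the example job splits words on whitespace and writes one `word<TAB>count` line per word, sorted by key; `local_wordcount` is a hypothetical helper, and the sample text is made up for the demo.

```shell
#!/usr/bin/env bash
# local_wordcount FILE: whitespace-separated word counts, one "word<TAB>count"
# line per word, sorted in byte order.
local_wordcount() {
    tr -s '[:space:]' '\n' < "$1" | grep -v '^$' | LC_ALL=C sort | uniq -c |
        awk '{ print $2 "\t" $1 }'
}

# Made-up sample input for the demo.
SAMPLE=$(mktemp)
printf 'hello hadoop\nhello yarn\n' > "$SAMPLE"

local_wordcount "$SAMPLE"
```

For this sample the function prints hadoop 1, hello 2 and yarn 1, each pair separated by a tab.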

 


Original article: http://www.cnblogs.com/yeahwell/p/5726351.html
