
Installing a hadoop-2.6.4 Cluster on Ubuntu Server 14.04

Date: 2016-07-13 17:36:50


Required environment:
Four machines running Ubuntu Server 14.04
A Windows 7 host (with as much memory as possible)
Xshell, installed on the Windows 7 host
Hadoop-2.6.4 (already compiled)
References:
Hadoop cluster installation:
(1) http://www.linuxidc.com/Linux/2015-01/111463.htm
(2) http://blog.csdn.net/stark_summer/article/details/42424279
(3) http://www.w2bc.com/Article/19645
(4) http://www.aboutyun.com/forum.php?mod=viewthread&tid=6961
(5) http://www.aboutyun.com/thread-10572-1-1.html
Ubuntu installation:
http://blog.csdn.net/u014716068/article/details/51829084
Compiling Hadoop 2.6.4:
http://blog.csdn.net/u014716068/article/details/51842105
Passwordless SSH login:
http://blog.csdn.net/ldl22847/article/details/8281639
http://jingyan.baidu.com/article/60ccbceb02bd4264cab197b9.html
The difference between hostname and hosts:
http://www.cnblogs.com/kerrycode/p/3595724.html

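Pre-installation preparation

Installing the JDK: I originally planned to just link to an existing guide, but I ran into so many errors during installation that I feel better writing the steps out. Run these commands on each of the four machines.
First, install JDK 7. In a terminal, search for the available packages: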

root@slaver01:/etc/apt# apt-cache search openjdk*
default-jdk - Standard Java or Java compatible Development Kit
default-jdk-doc - Standard Java or Java compatible Development Kit (documentation)
default-jre - Standard Java or Java compatible Runtime
default-jre-headless - Standard Java or Java compatible Runtime (headless)
icedtea-7-jre-jamvm - Alternative JVM for OpenJDK, using JamVM
icedtea-7-plugin - web browser plugin based on OpenJDK and IcedTea to execute Java applets
openjdk-7-dbg - Java runtime based on OpenJDK (debugging symbols)
openjdk-7-demo - Java runtime based on OpenJDK (demos and examples)
openjdk-7-doc - OpenJDK Development Kit (JDK) documentation
openjdk-7-jdk - OpenJDK Development Kit (JDK)
openjdk-7-source - OpenJDK Development Kit (JDK) source files
java-package - Utility for creating Java Debian packages
freemind - Java Program for creating and viewing Mindmaps
icedtea-6-jre-cacao - Alternative JVM for OpenJDK, using Cacao
icedtea-6-jre-jamvm - Alternative JVM for OpenJDK, using JamVM
icedtea-6-plugin - web browser plugin based on OpenJDK and IcedTea to execute Java applets
jtreg - Regression Test Harness for the OpenJDK platform
jvm-7-avian-jre - lightweight virtual machine using the OpenJDK class library
libreoffice - office productivity suite (metapackage)
openjdk-6-dbg - Java runtime based on OpenJDK (debugging symbols)
openjdk-6-demo - Java runtime based on OpenJDK (demos and examples)
openjdk-6-doc - OpenJDK Development Kit (JDK) documentation
openjdk-6-jdk - OpenJDK Development Kit (JDK)
openjdk-6-jre-lib - OpenJDK Java runtime (architecture independent libraries)
openjdk-6-jre-zero - Alternative JVM for OpenJDK, using Zero/Shark
openjdk-6-source - OpenJDK Development Kit (JDK) source files
openjdk-7-jre-lib - OpenJDK Java runtime (architecture independent libraries)
openjdk-7-jre-zero - Alternative JVM for OpenJDK, using Zero/Shark
uwsgi-app-integration-plugins - plugins for integration of uWSGI and application
uwsgi-plugin-jvm-openjdk-6 - Java plugin for uWSGI (OpenJDK 6)
uwsgi-plugin-jvm-openjdk-7 - Java plugin for uWSGI (OpenJDK 7)
uwsgi-plugin-jwsgi-openjdk-6 - JWSGI plugin for uWSGI (OpenJDK 6)
uwsgi-plugin-jwsgi-openjdk-7 - JWSGI plugin for uWSGI (OpenJDK 7)
openjdk-7-jre - OpenJDK Java runtime, using Hotspot JIT
openjdk-7-jre-headless - OpenJDK Java runtime, using Hotspot JIT (headless)
openjdk-6-jre - OpenJDK Java runtime, using Hotspot JIT
openjdk-6-jre-headless - OpenJDK Java runtime, using Hotspot JIT (headless)

There are many packages to choose from. We want the tenth one in the list, openjdk-7-jdk:

apt-get install openjdk-7-jdk -y

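If your package mirror is reliable, this command runs through to completion. We can then check whether the JDK was installed successfully:

(earlier output omitted...)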
done.
Setting up libatk-wrapper-java-jni:amd64 (0.30.4-4) ...
Processing triggers for libc-bin (2.19-0ubuntu6) ...
Processing triggers for ca-certificates (20130906ubuntu2) ...
Updating certificates in /etc/ssl/certs... 0 added, 0 removed; done.
Running hooks in /etc/ca-certificates/update.d....
done.
done.
root@slaver02:/etc/apt# java -version
java version "1.7.0_101"
OpenJDK Runtime Environment (IcedTea 2.6.6) (7u101-2.6.6-0ubuntu0.14.04.1)
OpenJDK 64-Bit Server VM (build 24.95-b01, mixed mode)

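If the Java version information appears, the installation succeeded. Reality can be cruel, though, and sometimes it just will not work. My problem was the Ubuntu package mirror, which kept reporting errors I could not track down. Switching to my school's mirror finally solved it.
Once Java is installed, locate its installation path, since we need it to configure the Java environment variables:

root@slaver01:~# find / -name 'jre*'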
/usr/lib/jvm/java-7-openjdk-amd64/jre
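Alternatively:
root@slaver01:~# which java 
/usr/lib/jvm/java-7-openjdk-amd64/bin/java
This also shows where the java command lives.
root@slaver01:~# vim /etc/profile

Set JAVA_HOME to the path you found, then copy the following block into /etc/profile. It can go at the very top or the very bottom of the file: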

export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH
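The JAVA_HOME value above was found by hand. As a rough sketch, the same path can be derived from the resolved location of the java binary. `java_home_from_bin` is a hypothetical helper name, and the OpenJDK layout (`.../jre/bin/java` or `.../bin/java`) is an assumption based on the paths shown above.

```shell
# Hypothetical helper: derive a JAVA_HOME candidate from the full path of a java binary.
java_home_from_bin() {
    local home="${1%/bin/java}"   # drop the trailing /bin/java
    home="${home%/jre}"           # drop a trailing /jre, if the binary lives under the JRE
    printf '%s\n' "$home"
}

# Example with the OpenJDK 7 path from this guide:
java_home_from_bin /usr/lib/jvm/java-7-openjdk-amd64/jre/bin/java

# On a real machine (assumes java is on PATH and GNU readlink is available):
#   export JAVA_HOME="$(java_home_from_bin "$(readlink -f "$(command -v java)")")"
```

Deriving the value this way avoids typing the path by hand on each of the four machines, but it is worth printing the result once before adding it to /etc/profile.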

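Changing the hostname:
The references at the start include a post about changing the hostname. Skim it, but do not get lost in the details. For now we only need to know how to set it:

root@slaver01:~# vim /etc/hostname
On the first machine, this file contains:
master
On the second machine, this file contains:
slaver01
On the third machine, this file contains:
slaver02
On the fourth machine, this file contains:
slaver03
(For now, think of this setting as the name that identifies each machine.)

Save the file, then reboot the machine for the change to take effect.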

Editing /etc/hosts:
Add the following entries to the hosts file on each machine:

root@slaver01:~# vim /etc/hosts
Copy in all of the following:
192.168.13.138   master
192.168.13.139   slaver01
192.168.13.140   slaver02
192.168.13.141   slaver03
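Since the same four entries go onto every machine, a small script can add them without creating duplicates on a second run. The sketch below is only an illustration: `add_host_entry` is a hypothetical helper, and it writes to a scratch file rather than /etc/hosts so that it is safe to try.

```shell
# Hypothetical helper: append "IP NAME" to FILE unless NAME already appears in it.
# Usage: add_host_entry FILE IP NAME
add_host_entry() {
    local file="$1" ip="$2" name="$3"
    # -q: quiet, -w: match NAME as a whole word
    if ! grep -qw -- "$name" "$file" 2>/dev/null; then
        printf '%s   %s\n' "$ip" "$name" >> "$file"
    fi
}

hosts_file="$(mktemp)"   # stand-in for /etc/hosts
add_host_entry "$hosts_file" 192.168.13.138 master
add_host_entry "$hosts_file" 192.168.13.139 slaver01
add_host_entry "$hosts_file" 192.168.13.140 slaver02
add_host_entry "$hosts_file" 192.168.13.141 slaver03
add_host_entry "$hosts_file" 192.168.13.138 master   # second call is a no-op
cat "$hosts_file"
```

To apply it for real, pass /etc/hosts as the first argument while running as root.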

Passwordless SSH login:
First check whether SSH is installed:

root@slaver01:~# ssh localhost 
The authenticity of host 'localhost (::1)' can't be established.
ECDSA key fingerprint is bd:74:0f:7a:84:53:ac:29:e5:a9:7e:85:07:bc:ef:34.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
root@localhost's password: 
Welcome to Ubuntu 14.04 LTS (GNU/Linux 3.13.0-24-generic x86_64)

 * Documentation:  https://help.ubuntu.com/

  System information as of Fri Jul  8 18:56:20 EDT 2016

  System load:  0.08              Processes:           360
  Usage of /:   9.7% of 18.32GB   Users logged in:     1
  Memory usage: 17%               IP address for eth0: 192.168.13.139
  Swap usage:   0%

  Graph this data and manage this system at:
    https://landscape.canonical.com/

*** System restart required ***
Last login: Fri Jul  8 18:56:24 2016 from 192.168.13.1
root@slaver01:~# exit
logout
Connection to localhost closed.

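The output above shows that the SSH service is installed. I selected the SSH server option when installing the system, which is why it works now. If it were not installed, the ssh connection would be refused.

To install the SSH server:
root@master:~# apt-get install openssh-server
To check the installed version:
root@slaver01:~# ssh -V
OpenSSH_6.6p1 Ubuntu-2ubuntu1, OpenSSL 1.0.1f 6 Jan 2014
To start the service:
root@master:~# /etc/init.d/ssh start
To check that the service is running: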
root@slaver01:~# ps -e | grep ssh
  1175 ?        00:00:00 sshd
  1425 ?        00:00:04 sshd
 16325 ?        00:00:00 sshd

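These checks apply to all four machines.
On each of the four machines, do the following.

Before a key is generated, root's .ssh directory contains only one file: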
root@slaver01:~# ls -a
.  ..  .bash_history  .bashrc  .cache  .profile  .ssh  .viminfo
root@slaver01:~# cd .ssh/
root@slaver01:~/.ssh# ls
known_hosts
root@slaver01:~/.ssh# pwd
/root/.ssh
root@slaver01:~/.ssh# 
Generate the key pair for passwordless login (press Enter at each prompt until it finishes):
root@slaver01:~/.ssh# ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa): 
Enter passphrase (empty for no passphrase): 
Enter same passphrase again: 
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
07:95:99:79:95:0b:42:62:8f:6c:15:5e:2e:13:bd:1d root@slaver01
The key's randomart image is:
+--[ RSA 2048]----+
|        o.*B.... |
|       o B*++ E  |
|        = =o.+ o |
|       . . o. o  |
|        S .      |
|         .       |
|                 |
|                 |
|                 |
+-----------------+
root@slaver01:~/.ssh#
After generation there are two new files. We will need the public key, id_rsa.pub, shortly:
root@slaver01:~/.ssh# ls
id_rsa  id_rsa.pub  known_hosts
root@slaver01:~/.ssh#

After generating a key pair on every machine this way, make a copy of each machine's id_rsa.pub, named after its host:

root@master:~/.ssh# cd /root/.ssh
root@master:~/.ssh# pwd
/root/.ssh
root@master:~/.ssh# ls
id_rsa  id_rsa.pub  known_hosts
root@master:~/.ssh#
On master:
root@master:~/.ssh# cat id_rsa.pub >> master
On slaver01:
root@slaver01:~/.ssh# cat id_rsa.pub >> slaver01
root@slaver01:~/.ssh# scp slaver01 root@192.168.13.138:/root/.ssh/
The authenticity of host '192.168.13.138 (192.168.13.138)' can't be established.
ECDSA key fingerprint is bd:74:0f:7a:84:53:ac:29:e5:a9:7e:85:07:bc:ef:34.
Are you sure you want to continue connecting (yes/no)? yes   // type yes here
Warning: Permanently added '192.168.13.138' (ECDSA) to the list of known hosts.
root@192.168.13.138's password:   // type the password (it is not echoed), then press Enter
slaver01                                                                 100%  395     0.4KB/s   00:00

On slaver02:
root@slaver02:~/.ssh# cat id_rsa.pub >> slaver02
root@slaver02:~/.ssh# scp slaver02 root@192.168.13.138:/root/.ssh/
The authenticity of host '192.168.13.138 (192.168.13.138)' can't be established.
ECDSA key fingerprint is bd:74:0f:7a:84:53:ac:29:e5:a9:7e:85:07:bc:ef:34.
Are you sure you want to continue connecting (yes/no)? yes   // type yes here
Warning: Permanently added '192.168.13.138' (ECDSA) to the list of known hosts.
root@192.168.13.138's password:   // type the password (it is not echoed), then press Enter
slaver02                                                                 100%  395     0.4KB/s   00:00    
root@slaver02:~/.ssh#
On slaver03:
root@slaver03:~/.ssh# cat id_rsa.pub >> slaver03
root@slaver03:~/.ssh# scp slaver03 root@192.168.13.138:/root/.ssh/
The authenticity of host '192.168.13.138 (192.168.13.138)' can't be established.
ECDSA key fingerprint is bd:74:0f:7a:84:53:ac:29:e5:a9:7e:85:07:bc:ef:34.
Are you sure you want to continue connecting (yes/no)? yes   // type yes here
Warning: Permanently added '192.168.13.138' (ECDSA) to the list of known hosts.
root@192.168.13.138's password:   // type the password (it is not echoed), then press Enter
slaver03                                                                 100%  395     0.4KB/s   00:00    
root@slaver03:~/.ssh#

On master, merge the collected public keys:

root@master:~/.ssh# pwd
/root/.ssh
root@master:~/.ssh# ls 
id_rsa  id_rsa.pub  known_hosts  master  slaver01  slaver02  slaver03
root@master:~/.ssh# cat master slaver01 slaver02 slaver03 >> authorized_keys
root@master:~/.ssh#
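Appending with `>>` adds the same keys again if the step is repeated. The sketch below shows one way to merge the key files so that re-running is harmless and the permissions end up restrictive. `merge_pubkeys` is a hypothetical helper, and the keys in the demonstration are placeholders, not real keys.

```shell
# Hypothetical helper: merge public-key files into an authorized_keys file.
# Usage: merge_pubkeys OUT KEYFILE...
merge_pubkeys() {
    local out="$1"
    shift
    touch "$out"
    # sort -u removes duplicate key lines, so the function is safe to re-run
    cat "$out" "$@" | sort -u > "$out.tmp"
    mv "$out.tmp" "$out"
    chmod 600 "$out"   # keep the file readable and writable by its owner only
}

# Demonstration in a scratch directory with placeholder keys:
demo="$(mktemp -d)"
echo "ssh-rsa AAAA-placeholder-1 root@master"   > "$demo/master"
echo "ssh-rsa AAAA-placeholder-2 root@slaver01" > "$demo/slaver01"
merge_pubkeys "$demo/authorized_keys" "$demo/master" "$demo/slaver01"
merge_pubkeys "$demo/authorized_keys" "$demo/master" "$demo/slaver01"   # adds nothing new
wc -l < "$demo/authorized_keys"
```

On the real master this would be called from /root/.ssh with the four key files. The `ssh-copy-id` tool that ships with the OpenSSH client is another common way to install a public key on a remote host.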

Copy the authorized_keys file into /root/.ssh/ on each of the other machines:
root@master:~/.ssh# scp authorized_keys root@192.168.13.139:/root/.ssh/
The authenticity of host '192.168.13.139 (192.168.13.139)' can't be established.
ECDSA key fingerprint is bd:74:0f:7a:84:53:ac:29:e5:a9:7e:85:07:bc:ef:34.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.13.139' (ECDSA) to the list of known hosts.
root@192.168.13.139's password:   // type the password
authorized_keys                                                          100% 1578     1.5KB/s   00:00    
root@master:~/.ssh# 

root@master:~/.ssh# scp authorized_keys root@192.168.13.140:/root/.ssh/
The authenticity of host '192.168.13.140 (192.168.13.140)' can't be established.
ECDSA key fingerprint is bd:74:0f:7a:84:53:ac:29:e5:a9:7e:85:07:bc:ef:34.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.13.140' (ECDSA) to the list of known hosts.
root@192.168.13.140's password:   // type the password
authorized_keys                                                          100% 1578     1.5KB/s   00:00    
root@master:~/.ssh#

root@master:~/.ssh# scp authorized_keys root@192.168.13.141:/root/.ssh/
The authenticity of host '192.168.13.141 (192.168.13.141)' can't be established.
ECDSA key fingerprint is bd:74:0f:7a:84:53:ac:29:e5:a9:7e:85:07:bc:ef:34.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added '192.168.13.141' (ECDSA) to the list of known hosts.
root@192.168.13.141's password:   // type the password
authorized_keys                                                          100% 1578     1.5KB/s   00:00    
root@master:~/.ssh#

Passwordless SSH login is now set up. Let's test it:

ssh to slaver01:
root@master:~/.ssh# ssh slaver01
The authenticity of host 'slaver01 (192.168.13.139)' can't be established.
ECDSA key fingerprint is bd:74:0f:7a:84:53:ac:29:e5:a9:7e:85:07:bc:ef:34.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'slaver01' (ECDSA) to the list of known hosts.
Welcome to Ubuntu 14.04 LTS (GNU/Linux 3.13.0-24-generic x86_64)

 * Documentation:  https://help.ubuntu.com/

  System information as of Fri Jul  8 20:55:09 EDT 2016

  System load:  0.0               Processes:           361
  Usage of /:   9.7% of 18.32GB   Users logged in:     1
  Memory usage: 20%               IP address for eth0: 192.168.13.139
  Swap usage:   0%

  Graph this data and manage this system at:
    https://landscape.canonical.com/

*** System restart required ***
Last login: Fri Jul  8 19:17:50 2016 from localhost
root@slaver01:~# exit
logout
Connection to slaver01 closed.
root@master:~/.ssh# 
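(Note: this is the first time this machine logs in to the others, so the extra host-key prompts can be ignored. They no longer appear from the second login on.)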

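Create the data storage directories. Because this is a test setup on virtual machines, and to make cleanup easy, the Hadoop data directories live under the Hadoop installation directory for now. These commands assume Hadoop has already been extracted to /usr/hadoop-2.6.4, as described in the next section: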

root@master:/usr/hadoop-2.6.4# pwd
/usr/hadoop-2.6.4
root@master:/usr/hadoop-2.6.4# ls
bin  etc  include  lib  libexec  LICENSE.txt  NOTICE.txt  README.txt  sbin  share
root@master:/usr/hadoop-2.6.4# mkdir dfs
root@master:/usr/hadoop-2.6.4# mkdir dfs/name
root@master:/usr/hadoop-2.6.4# mkdir dfs/data
root@master:/usr/hadoop-2.6.4# mkdir tmp
root@master:/usr/hadoop-2.6.4# ls
bin  dfs  etc  include  lib  libexec  LICENSE.txt  NOTICE.txt  README.txt  sbin  share  tmp
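The four mkdir calls above can be collapsed into one. The following is a small sketch rather than a required step; `make_hadoop_dirs` is a hypothetical helper, and the demonstration uses a scratch directory in place of /usr/hadoop-2.6.4.

```shell
# Hypothetical helper: create the name, data, and tmp directories under BASE.
make_hadoop_dirs() {
    local base="$1"
    # -p creates missing parents (dfs) and does not fail if a directory already exists
    mkdir -p "$base/dfs/name" "$base/dfs/data" "$base/tmp"
}

demo_base="$(mktemp -d)"   # stand-in for /usr/hadoop-2.6.4
make_hadoop_dirs "$demo_base"
make_hadoop_dirs "$demo_base"   # running it again is harmless
ls "$demo_base"
```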

Configuring the Hadoop files

Everything below is configured on master.

Go to /usr and extract the compiled hadoop-2.6.4.tar.gz into the current directory:
root@master:/usr# pwd 
/usr
root@master:/usr# ls 
bin  games  hadoop-2.6.4.tar.gz  include  lib  local  sbin  share  src
root@master:/usr# tar -zxvf hadoop-2.6.4.tar.gz
root@master:/usr# cd hadoop-2.6.4/
root@master:/usr/hadoop-2.6.4# ls
bin  etc  include  lib  libexec  LICENSE.txt  NOTICE.txt  README.txt  sbin  share
root@master:/usr/hadoop-2.6.4/etc/hadoop# pwd
/usr/hadoop-2.6.4/etc/hadoop
root@master:/usr/hadoop-2.6.4/etc/hadoop#

This directory contains many files, but these are all the ones we need to configure:

/usr/hadoop-2.6.4/etc/hadoop/core-site.xml
/usr/hadoop-2.6.4/etc/hadoop/hdfs-site.xml
/usr/hadoop-2.6.4/etc/hadoop/yarn-site.xml
/usr/hadoop-2.6.4/etc/hadoop/mapred-site.xml
/usr/hadoop-2.6.4/etc/hadoop/slaves
/usr/hadoop-2.6.4/etc/hadoop/hadoop-env.sh
/usr/hadoop-2.6.4/etc/hadoop/yarn-env.sh

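Most of these configuration files are XML: one outer <configuration> tag containing a number of <property> tags. Each <property> holds a <name>, a <value>, and optionally a <description>: name is the parameter name, value is the value to set, and description documents the parameter and how to configure it.
Configure core-site.xml, which holds the core settings, such as I/O settings shared by HDFS and MapReduce:

root@master:/usr/hadoop-2.6.4/etc/hadoop# vim core-site.xml
Fill in the following (the file already contains the <configuration></configuration> tags; add the properties between them):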
<configuration>

    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:8020</value>
    </property>

    <property>
         <name>io.file.buffer.size</name>
         <value>131072</value>
    </property>

    <property>
         <name>hadoop.tmp.dir</name>
         <value>file:/usr/hadoop-2.6.4/tmp</value>
    </property>

</configuration>

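Notes: The first parameter is the NameNode URI, here host master and port 8020. An IP address can be used instead of master. It sets Hadoop's default file system URI. 8020 is the default NameNode RPC port; some guides online use 9000, depending on the version. The second parameter is the read/write buffer size used for sequence files. 131072 is 32 × 4096, that is, 32 times the 4 KB hardware page size on x86. The third parameter is the base directory that many other Hadoop paths derive from. Its default is /tmp/hadoop-${user.name}, and the contents of /tmp can be lost after a power failure or reboot, so we point it at the tmp directory created under the Hadoop installation during preparation. (Warning: do not change the port arbitrarily. I tried 50070, which is the default port of the NameNode web UI, and the NameNode would not start, so stay with 8020.)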
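After editing, it can help to read a value back out of the file to confirm what was actually written. The sketch below does that with awk. `get_prop` is a hypothetical helper, not a Hadoop tool, and it assumes `<name>` and `<value>` sit on separate lines as in this guide; it is not a general XML parser.

```shell
# Hypothetical helper: print the <value> that follows <name>NAME</name> in FILE.
# Usage: get_prop FILE NAME
get_prop() {
    awk -v key="$2" '
        index($0, "<name>" key "</name>") { found = 1; next }
        found && /<value>/ {
            sub(/^.*<value>/, "")
            sub(/<\/value>.*$/, "")
            print
            exit
        }
    ' "$1"
}

# Scratch copy of two of the core-site.xml properties shown above:
conf="$(mktemp)"
cat > "$conf" <<'EOF'
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:8020</value>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
    </property>
</configuration>
EOF

get_prop "$conf" fs.defaultFS
# Buffer size expressed in 4096-byte pages:
echo $(( $(get_prop "$conf" io.file.buffer.size) / 4096 ))
```

On a machine where Hadoop is already on the PATH, `hdfs getconf -confKey fs.defaultFS` reports the value Hadoop itself resolves, which is the more authoritative check.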

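Configure hdfs-site.xml, which holds the settings for the HDFS daemons: the NameNode, the secondary NameNode, and the DataNodes:

root@master:/usr/hadoop-2.6.4/etc/hadoop# vim hdfs-site.xml
Fill in the following (the file already contains the <configuration></configuration> tags; add the properties between them):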
<configuration>

    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>master:9001</value>
    </property>

    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/hadoop-2.6.4/dfs/name</value>
    </property>

    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/hadoop-2.6.4/dfs/data</value>
    </property>

    <property>
        <name>dfs.replication</name>
        <value>3</value>
    </property>

    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>
    <property>
        <name>dfs.permissions</name>        
        <value>false</value>        
    </property>
</configuration>

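Notes: The first parameter is the secondary NameNode's HTTP server address and port (some guides online use 9000). The second is the directory on the local file system where the NameNode stores the name table. The third is the directory on the local file system where a DataNode stores its blocks. The fourth is the default block replication factor. The actual number of replicas can be specified when a file is created; if it is not, this default is used, here 3. The fifth enables WebHDFS (the REST API) on the NameNode and DataNodes. The sixth controls permission checking in HDFS: true enables it and false disables it. It is disabled here in advance because accessing Hadoop from Eclipse on Windows later requires it.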

Configure yarn-site.xml:

root@master:/usr/hadoop-2.6.4/etc/hadoop# vim yarn-site.xml
Fill in the following (the file already contains the <configuration></configuration> tags; add the properties between them):
<configuration>

<!-- Site specific YARN configuration properties -->
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>master:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master:8031</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>master:8033</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>master:8088</value>
    </property>

</configuration>

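Notes: For the first parameter, a valid service name may contain only a-zA-Z0-9_ and must not start with a digit. The second names the class that implements that service. The third is the address of the applications manager interface in the RM. The fourth is the address of the scheduler interface. The fifth is the resource-tracker address. The sixth is the address of the RM admin interface. The seventh is the HTTP address of the RM web application.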
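The five ResourceManager addresses differ only in property name and port, so they can be generated from a single hostname. The sketch below prints the same elements as the file above; `emit_property` and `emit_rm_addresses` are hypothetical helper names, and the ports are the ones used in this guide.

```shell
# Hypothetical helper: print one <property> element for a Hadoop XML file.
# Usage: emit_property NAME VALUE
emit_property() {
    printf '    <property>\n        <name>%s</name>\n        <value>%s</value>\n    </property>\n' "$1" "$2"
}

# Hypothetical helper: print the five ResourceManager address properties for HOST.
emit_rm_addresses() {
    local host="$1"
    emit_property yarn.resourcemanager.address                  "$host:8032"
    emit_property yarn.resourcemanager.scheduler.address        "$host:8030"
    emit_property yarn.resourcemanager.resource-tracker.address "$host:8031"
    emit_property yarn.resourcemanager.admin.address            "$host:8033"
    emit_property yarn.resourcemanager.webapp.address           "$host:8088"
}

emit_rm_addresses master
```

The output is meant to be pasted between the <configuration> tags; changing the argument is enough if the ResourceManager moves to another host.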
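Configure mapred-site.xml, which holds the MapReduce settings (in Hadoop 1 these covered the jobtracker and tasktracker daemons; with YARN they select the execution framework and the JobHistory server addresses):

root@master:/usr/hadoop-2.6.4/etc/hadoop# vim mapred-site.xml
Fill in the following (the file already contains the <configuration></configuration> tags; add the properties between them):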
<configuration>

    <property>                                                                  
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>

    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>master:10020</value>
    </property>

     <property>
         <name>mapreduce.jobhistory.webapp.address</name>
         <value>master:19888</value>
     </property>

</configuration>

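The first parameter is the runtime framework for executing MapReduce jobs; it can be one of local, classic, or yarn. The second is the host and port of the MapReduce JobHistory server IPC. The third is the host and port of the JobHistory server web UI.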
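One practical detail before editing: Hadoop 2.6.x distributions typically ship this file only as mapred-site.xml.template, so mapred-site.xml may need to be created from the template first. Whether that applies to the build used here is an assumption. A minimal sketch, with `ensure_mapred_site` as a hypothetical helper and a scratch directory standing in for /usr/hadoop-2.6.4/etc/hadoop:

```shell
# Hypothetical helper: create mapred-site.xml from its template if it does not exist yet.
# Usage: ensure_mapred_site CONF_DIR
ensure_mapred_site() {
    local conf_dir="$1"
    if [ ! -f "$conf_dir/mapred-site.xml" ] && [ -f "$conf_dir/mapred-site.xml.template" ]; then
        cp "$conf_dir/mapred-site.xml.template" "$conf_dir/mapred-site.xml"
    fi
}

conf_dir="$(mktemp -d)"   # stand-in for /usr/hadoop-2.6.4/etc/hadoop
echo '<configuration></configuration>' > "$conf_dir/mapred-site.xml.template"
ensure_mapred_site "$conf_dir"
ls "$conf_dir"
```

An existing mapred-site.xml is never overwritten, so the helper can be run again after the file has been edited.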
Configure the slaves file, which lists the machines that run a DataNode:

root@master:/usr/hadoop-2.6.4/etc/hadoop# vim slaves
master
slaver01
slaver02
slaver03

Configure hadoop-env.sh, which sets the environment variables the scripts need in order to run Hadoop:

...
## Set the Java path here
# The java implementation to use.
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
...

Configure yarn-env.sh, which sets the environment variables the scripts need in order to run YARN:

...
## Set the Java path here
# some Java parameters
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64
...

Starting the cluster

Copy the configured Hadoop directory to the other hosts:

Copy to slaver01:
root@master:/usr# scp -r hadoop-2.6.4 root@slaver01:/usr/
Copy to slaver02:
root@master:/usr# scp -r hadoop-2.6.4 root@slaver02:/usr/
Copy to slaver03:
root@master:/usr# scp -r hadoop-2.6.4 root@slaver03:/usr/
Check the result:
root@slaver01:~/.ssh# cd /usr/
root@slaver01:/usr# ls
bin  games  hadoop-2.6.4  include  lib  local  sbin  share  src
root@slaver01:/usr# cd hadoop-2.6.4/
root@slaver01:/usr/hadoop-2.6.4# ls
bin  dfs  etc  include  lib  libexec  LICENSE.txt  NOTICE.txt  README.txt  sbin  share  tmp
root@slaver01:/usr/hadoop-2.6.4#
Check the other machines in the same way.
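The three scp commands differ only in the hostname, so a loop can produce them. The sketch below prints the commands instead of running them, which makes it safe to try anywhere; `copy_hadoop_cmds` is a hypothetical helper name.

```shell
# Hypothetical helper: print (not run) one scp command per target host.
# Usage: copy_hadoop_cmds SRC_DIR HOST...
copy_hadoop_cmds() {
    local src="$1" host
    shift
    for host in "$@"; do
        echo "scp -r $src root@$host:/usr/"
    done
}

copy_hadoop_cmds /usr/hadoop-2.6.4 slaver01 slaver02 slaver03

# To actually copy, pipe the printed commands to a shell:
#   copy_hadoop_cmds /usr/hadoop-2.6.4 slaver01 slaver02 slaver03 | sh
```

With the passwordless login set up earlier, the piped form runs without password prompts.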

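Add Hadoop's command directories to PATH so the commands are easier to type when starting the cluster (on all four machines):

root@master:/usr# vim /etc/profile
Extend the PATH line with the Hadoop bin and sbin directories:
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:/usr/hadoop-2.6.4/bin:/usr/hadoop-2.6.4/sbin:$PATH
root@master:~/.ssh# source /etc/profile
Do the same on the other machines.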
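Sourcing /etc/profile more than once prepends the same directories to PATH again each time. The sketch below shows one way to avoid that; `path_prepend` and the `hadoop_dir` variable are hypothetical names introduced only for this illustration.

```shell
# Hypothetical helper: prepend DIR to PATH only if it is not already there.
path_prepend() {
    case ":$PATH:" in
        *":$1:"*) ;;              # already on PATH: leave it alone
        *) PATH="$1:$PATH" ;;
    esac
}

hadoop_dir=/usr/hadoop-2.6.4      # installation directory used in this guide
path_prepend "$hadoop_dir/sbin"
path_prepend "$hadoop_dir/bin"
path_prepend "$hadoop_dir/bin"    # second call changes nothing
export PATH
```

Wrapping each directory in colons before matching prevents a partial match, for example /usr/hadoop-2.6.4/bin inside a longer path.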

Format HDFS:

root@master:/usr# hadoop namenode -format
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

16/07/09 00:23:24 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = master/192.168.13.138
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 2.6.4
STARTUP_MSG:   classpath = (very long, omitted here)
...
16/07/09 00:23:32 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
16/07/09 00:23:32 INFO util.ExitUtil: Exiting with status 0
16/07/09 00:23:32 INFO namenode.NameNode: SHUTDOWN_MSG: 
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at master/192.168.13.138
************************************************************/
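As the deprecation notice in the output says, the same step can be run as `hdfs namenode -format`. Formatting is normally done once: formatting again assigns a new cluster ID, which DataNode directories left over from the earlier format will no longer match. The sketch below guards against an accidental second format by looking for the current/VERSION file that a formatted NameNode directory contains. `namenode_is_formatted` is a hypothetical helper, and the demonstration uses a scratch directory.

```shell
# Hypothetical helper: succeed if DIR already looks like a formatted NameNode directory.
namenode_is_formatted() {
    [ -f "$1/current/VERSION" ]
}

# Scratch directory; the real one in this guide is /usr/hadoop-2.6.4/dfs/name.
name_dir="$(mktemp -d)"
if namenode_is_formatted "$name_dir"; then
    echo "already formatted, skipping"
else
    echo "not formatted yet; would run: hdfs namenode -format"
fi
```

On the real master the guarded call would be `namenode_is_formatted /usr/hadoop-2.6.4/dfs/name || hdfs namenode -format`.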

Start the Hadoop cluster:

Start HDFS (the NameNode, the DataNodes, and the secondary NameNode):
root@master:/usr/hadoop-2.6.4# start-dfs.sh 
Starting namenodes on [master]
master: starting namenode, logging to /usr/hadoop-2.6.4/logs/hadoop-root-namenode-master.out
slaver03: starting datanode, logging to /usr/hadoop-2.6.4/logs/hadoop-root-datanode-slaver03.out
slaver02: starting datanode, logging to /usr/hadoop-2.6.4/logs/hadoop-root-datanode-slaver02.out
master: starting datanode, logging to /usr/hadoop-2.6.4/logs/hadoop-root-datanode-master.out
slaver01: starting datanode, logging to /usr/hadoop-2.6.4/logs/hadoop-root-datanode-slaver01.out
Starting secondary namenodes [master]
master: starting secondarynamenode, logging to /usr/hadoop-2.6.4/logs/hadoop-root-secondarynamenode-master.out

Check the result:
root@master:/usr/hadoop-2.6.4# jps
4400 SecondaryNameNode
4512 Jps
1315 Bootstrap
4077 NameNode
4217 DataNode

Start YARN (the ResourceManager on master and a NodeManager on every node):
root@master:/usr/hadoop-2.6.4# start-yarn.sh 
starting yarn daemons
starting resourcemanager, logging to /usr/hadoop-2.6.4/logs/yarn-root-resourcemanager-master.out
slaver01: starting nodemanager, logging to /usr/hadoop-2.6.4/logs/yarn-root-nodemanager-slaver01.out
slaver03: starting nodemanager, logging to /usr/hadoop-2.6.4/logs/yarn-root-nodemanager-slaver03.out
master: starting nodemanager, logging to /usr/hadoop-2.6.4/logs/yarn-root-nodemanager-master.out
slaver02: starting nodemanager, logging to /usr/hadoop-2.6.4/logs/yarn-root-nodemanager-slaver02.out

Check the result:
root@master:/usr/hadoop-2.6.4# jps
4400 SecondaryNameNode
4677 NodeManager
4554 ResourceManager
4710 Jps
1315 Bootstrap
4077 NameNode
4217 DataNode
root@master:/usr/hadoop-2.6.4#

Checking the Hadoop services:

On master:
root@master:/usr/hadoop-2.6.4# jps
4400 SecondaryNameNode
4677 NodeManager
4554 ResourceManager
4710 Jps
1315 Bootstrap
4077 NameNode
4217 DataNode
root@master:/usr/hadoop-2.6.4# 

On the slaver* nodes:
root@slaver01:/usr/hadoop-2.6.4/etc/hadoop# jps
18081 Jps
17863 DataNode
17989 NodeManager
root@slaver01:/usr/hadoop-2.6.4/etc/hadoop#

root@slaver02:~/.ssh# jps
16945 DataNode
17169 Jps
17072 NodeManager
root@slaver02:~/.ssh#
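Reading jps output by eye on four machines is easy to get wrong. The sketch below reports which expected daemons are missing from a given jps listing; `missing_daemons` is a hypothetical helper, and the sample text is the slaver01 output shown above.

```shell
# Hypothetical helper: print each expected daemon name that is absent from the jps output.
# Usage: missing_daemons "JPS_OUTPUT" NAME...
missing_daemons() {
    local jps_out="$1" d
    shift
    for d in "$@"; do
        # jps prints "<pid> <name>"; compare the second field exactly,
        # so NameNode does not match SecondaryNameNode
        printf '%s\n' "$jps_out" \
            | awk -v n="$d" '$2 == n { ok = 1 } END { exit !ok }' \
            || echo "$d"
    done
}

sample="$(printf '18081 Jps\n17863 DataNode\n17989 NodeManager')"
missing_daemons "$sample" DataNode NodeManager            # prints nothing
missing_daemons "$sample" DataNode NodeManager NameNode   # reports NameNode
```

On a live node the first argument would be "$(jps)", and on master the expected list would also include NameNode, SecondaryNameNode, and ResourceManager.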


Working with Hadoop locally

First create a test directory in HDFS, then upload a local file into it, and finally check that the stored file size matches the local copy:

root@master:/usr/hadoop-2.6.4/etc# cd /root/
root@master:~# ls
lantern-installer-beta-64-bit.deb  xyj
root@master:~# hadoop fs -mkdir  /xyjdir
root@master:~# hadoop fs -ls /
Found 1 items
drwxr-xr-x   - root supergroup          0 2016-07-10 21:27 /xyjdir
root@master:~# hadoop fs -put lantern-installer-beta-64-bit.deb /xyjdir/
root@master:~# hadoop fs -ls /
Found 1 items
drwxr-xr-x   - root supergroup          0 2016-07-10 21:27 /xyjdir
root@master:~# hadoop fs -ls /xyjdir/
Found 1 items
-rw-r--r--   3 root supergroup    4351032 2016-07-10 21:27 /xyjdir/lantern-installer-beta-64-bit.deb
root@master:~# ll lantern-installer-beta-64-bit.deb 
-rw-r--r-- 1 root root 4351032 Apr 24 06:19 lantern-installer-beta-64-bit.deb
root@master:~#
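The size comparison above was done by eye. As a sketch, the size column can be pulled out of the `hadoop fs -ls` listing and compared with the local file size in a script. `hdfs_ls_size` is a hypothetical helper that assumes the eight-column layout shown above and a path without spaces.

```shell
# Hypothetical helper: read `hadoop fs -ls` output on stdin and print the size of PATH.
# Usage: hadoop fs -ls DIR | hdfs_ls_size PATH
hdfs_ls_size() {
    awk -v p="$1" '$NF == p { print $5 }'
}

# Sample line copied from the listing above:
line='-rw-r--r--   3 root supergroup    4351032 2016-07-10 21:27 /xyjdir/lantern-installer-beta-64-bit.deb'
printf '%s\n' "$line" | hdfs_ls_size /xyjdir/lantern-installer-beta-64-bit.deb

# On the cluster, the comparison could look like this:
#   remote=$(hadoop fs -ls /xyjdir/ | hdfs_ls_size /xyjdir/lantern-installer-beta-64-bit.deb)
#   local_size=$(wc -c < lantern-installer-beta-64-bit.deb)
#   [ "$remote" -eq "$local_size" ] && echo "sizes match"
```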

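Viewing Hadoop remotely

Hadoop can also be browsed from a web browser on Windows. Enter this address in the browser:

http://192.168.13.138:8088               
// The port matches the setting in yarn-site.xml shown here; the IP address is that of your master: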
<property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>master:8088</value>
</property>

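Summary

That completes a working Hadoop 2.6.4 cluster. I ran into quite a few problems and solved quite a few, and the whole thing went much faster than when I first started with Hadoop. Back then I mostly copied configuration options from other people's posts without really understanding them. Now I read other people's posts while thinking about how I would configure things myself, and gradually work out what each setting really does and how the settings relate to one another, so that I understand every piece. That is the way to understand a Hadoop cluster properly.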

Original article: http://blog.csdn.net/u014716068/article/details/51882101
