
Deploying a Hadoop Cluster


1. The three run modes of Hadoop

    Standalone mode: simple to install and requires almost no configuration, but is limited to debugging use.

    Pseudo-distributed mode: starts all five daemons (namenode, datanode, jobtracker, tasktracker, and secondary namenode) on a single node, simulating the individual nodes of a distributed deployment.

    Fully distributed mode: a proper Hadoop cluster, made up of multiple nodes each performing its own role.


2. Installing and configuring pseudo-distributed mode


(1) Download and extract the Hadoop package; we use version 0.20.2.

It can be downloaded from the official site or from a mirror in China, such as http://mirror.bit.edu.cn/apache/hadoop/common; the download comes as a .tar.gz archive.
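As a sketch, the download step might look like this (the exact file path under the mirror is an assumption and may no longer be hosted for 0.20.2):

[liuqingjie@localhost ~]$ wget http://mirror.bit.edu.cn/apache/hadoop/common/hadoop-0.20.2/hadoop-0.20.2.tar.gz

Then extract the archive: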

[liuqingjie@localhost ~]$ cd /home/liuqingjie/

[liuqingjie@localhost ~]$ tar -zxvf hadoop-0.20.2.tar.gz 

(2) Enter the extracted Hadoop directory and edit the conf/hadoop-env.sh file (note that the location of the configuration files changed after version 0.23).

[liuqingjie@localhost ~]$ cd /home/liuqingjie/

[liuqingjie@localhost ~]$ cd hadoop-0.20.2

[liuqingjie@localhost hadoop-0.20.2]$ cd conf/

[liuqingjie@localhost conf]$ vi hadoop-env.sh 

In the opened file, find the following lines:

# The java implementation to use.  Required.

# export JAVA_HOME=/usr/lib/j2sdk1.5-sun

Then set JAVA_HOME to your own Java installation directory and remove the # in front of export:

# The java implementation to use.  Required.

 export JAVA_HOME=/usr/java/jdk1.6.0_45    (change to your own JDK installation path)

(3) Edit the three core configuration files in the conf directory: core-site.xml, hdfs-site.xml, and mapred-site.xml.

Edit the core-site.xml file:

[liuqingjie@localhost conf]$ vi core-site.xml 

<configuration>

<property>

<name>fs.default.name</name>

<value>hdfs://localhost:9000</value>

</property>

</configuration>

Note: fs.default.name is the NameNode's IP address and port. For a fully distributed setup, change localhost to the real IP address or the server's network hostname.

Edit the hdfs-site.xml file:

<configuration>

<property>

<name>dfs.data.dir</name>

<value>/home/liuqingjie/hadoop-0.20.2/data</value>

</property>

<property>

<name>dfs.replication</name>

<value>1</value>

</property>

</configuration>

Note: the directory /home/liuqingjie/hadoop-0.20.2/data must be created by hand.
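Creating it is a one-liner (using the path configured above):

[liuqingjie@localhost hadoop-0.20.2]$ mkdir -p /home/liuqingjie/hadoop-0.20.2/data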

Edit mapred-site.xml:

<configuration>

<property>

<name>mapred.job.tracker</name>

<value>localhost:9001</value>

</property>

</configuration>

(4) Configure SSH: generate a key pair so that ssh can connect to localhost without a password.

[liuqingjie@localhost ~]$ su root

Password: 

[root@localhost liuqingjie]# cd

[root@localhost ~]# ssh-keygen -t rsa

Generating public/private rsa key pair.

Enter file in which to save the key (/root/.ssh/id_rsa): (press Enter)

Created directory '/root/.ssh'.

Enter passphrase (empty for no passphrase): (press Enter)

Enter same passphrase again: (press Enter)

Your identification has been saved in /root/.ssh/id_rsa.

Your public key has been saved in /root/.ssh/id_rsa.pub.

The key fingerprint is:

e0:42:02:fa:a8:9e:2d:bb:10:6e:56:39:4f:5d:d5:bb root@localhost.localdomain

The key's randomart image is:

+--[ RSA 2048]----+

|.          ..    |

|..        .  .   |

|. . . .  .    .  |

| o o.....    .   |

|o .+....S     .  |

|o.. +.       E   |

|o+   .           |

|=.o              |

| =+.             |

+-----------------+

[root@localhost ~]# cd .ssh/

[root@localhost .ssh]# ls

id_rsa  id_rsa.pub

[root@localhost .ssh]# cp id_rsa.pub  authorized_keys
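With the public key in authorized_keys, an ssh connection to localhost should no longer ask for a password. A quick check (the chmod is an extra precaution not in the original steps; sshd often rejects keys when authorized_keys is group- or world-readable):

[root@localhost .ssh]# chmod 600 authorized_keys

[root@localhost .ssh]# ssh localhost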

    (5) Format HDFS
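The command is the same namenode format invocation used in the fully distributed setup below:

[liuqingjie@localhost hadoop-0.20.2]$ bin/hadoop namenode -format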

    (6) Start Hadoop with bin/start-all.sh

    (7) Stop Hadoop with bin/stop-all.sh
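After bin/start-all.sh, a quick check with jps should list all five daemons on the single node (NameNode, DataNode, SecondaryNameNode, JobTracker, TaskTracker) plus Jps itself; sample output appears in the fully distributed section below.

[liuqingjie@localhost hadoop-0.20.2]$ jps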


3. Installing and configuring fully distributed mode

There are three Linux virtual machines with IP addresses 192.168.132.130 through 192.168.132.132, on which we will set up fully distributed mode.

(1) Configure the hosts file

Edit /etc/hosts on every node so that each node can resolve the others' hostnames to IP addresses, and change HOSTNAME in the /etc/sysconfig/network file to the node's hostname (a sketch of that file follows below).

[root@localhost ~]# cat /etc/hosts

127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4

::1         localhost localhost.localdomain localhost6 localhost6.localdomain6

192.168.132.130 master

192.168.132.131 slave1

192.168.132.132 slave2

Once configured, shut down each virtual machine in turn and start them up again in turn for the configuration to take effect.
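For the /etc/sysconfig/network part, the master's file would look roughly like this (a sketch assuming a RHEL/CentOS-style system, which the later service iptables commands suggest; each slave uses its own hostname):

NETWORKING=yes

HOSTNAME=master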

  (2) Configure passwordless SSH between the nodes

[liuqingjie@master ~]$ ssh-keygen -t rsa

Generating public/private rsa key pair.

Enter file in which to save the key (/home/liuqingjie/.ssh/id_rsa): (press Enter)

Enter passphrase (empty for no passphrase): (press Enter)

Enter same passphrase again: (press Enter)

Your identification has been saved in /home/liuqingjie/.ssh/id_rsa.

Your public key has been saved in /home/liuqingjie/.ssh/id_rsa.pub.

The key fingerprint is:

31:0e:d5:bf:0b:13:b0:07:59:20:1e:bf:5d:1c:11:dc liuqingjie@master

The key's randomart image is:

+--[ RSA 2048]----+

|      o o=..++   |

|     . =+ ....E  |

|      o ++ .o    |

|       o.=o..    |

|        S... .   |

|          o .    |

|           o .   |

|            .    |

|                 |

+-----------------+

[liuqingjie@master ~]$ cd .ssh/

[liuqingjie@master .ssh]$ ll

total 8

-rw-------. 1 liuqingjie liuqingjie 1675 May  9 00:11 id_rsa

-rw-r--r--. 1 liuqingjie liuqingjie  399 May  9 00:11 id_rsa.pub

[liuqingjie@master .ssh]$ cp id_rsa.pub authorized_keys

Perform this operation on every node, then copy the contents of each node's authorized_keys into the files on the other nodes; after that the nodes can ssh into each other without a password. The merged file looks like this:

ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAr/xJO01MMjhnc4C1Jr+SKhhFbaFJQ+DS25dxZjeNdn93liQVCI7NZZJYhKnJK6Lb3egZdAUCXciTJRU8SPOhL/7+vFJehaZYcxFIUMB3grdd52QDduVkIv5gyvzLPGhzqeu7wZLXcobE9p6WZmgc/OQAuMyZCo/mCWpflNc/zg2f1UZJ8tVRGkX8aPFQzdtxKxmL+t5MV+yGb/yABGlxHxs2I3aI+eaXzvdDZhjiAC6odTmVlK9hSoF0oOyNYRDc2U7lk6WwBkdGbEMqRwQKjxwrqvP7eTUb5yMbZnnEaZVlU5s/M6AzZxPobLU1bg90gg/AfLpZTqAGhXDC05TIyQ== liuqingjie@master

ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAswMa0wgYssomzZku71WcdlP8arb+OQKA7AgF13MpA1+U0HFGbmho9akY0ArzQW7YvnobJXAGqmP/ylr+DnBoD4dCimecmc1kR1X8QU4oKcOXmBLKG3qUNJO1x5HR2nzXCVcgj38DiZBOf1HBCxbm2hkNizNXWrqUO3DuPLgKL6+2cmUMN2tuTzNW2FdIZYyNFZNtN8bAKtKI48wyfBA3h24jq7J46E2gHfoUjDC1MGXMTVKR6nZEwD7WoQtcw/fTtBzSD5zotMd170kMNjjuokHiW9Btrd+Y2iZrvs4CPS0121pfBsrk6kh6K0gZowj6AQqn+tN5yT/O/CSEie7wFw== liuqingjie@slave1

ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAp9A6hCdJZMC12Mvf81vVILwiWCm/CET75KZ+apGt0mhC4cUTRH+Yo0FW5bb3c7+bSLjbeT+x4U8Sqt48T6M2Kf5Qs6xXT9cAZ+fvgpQNhMxUbRKt2WWkW27OgyYGK3zkE5+iOh4hOEmbS5++G+T4VI0srxTvj8YaR0BWwBLQQ6Wli27t6H8tzEVvsNhujA3by+5xUg4M7CbnMJVf0G3MGl/Aey1D8RZXxCTvAfL8roVwl18SyGfgAQQDhObTvidHawZ/dN/w0XrcVPnJ69zl1MYfs4mSUf4Qe8GHvhq2XBiDyPgb97CEVV0QtyFp1BeT0Kc4ZBjpvIn9SdilCUZfeQ== liuqingjie@slave2
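One way to do the merging from the master (a sketch beyond what the original shows; the first connections will still prompt for passwords):

[liuqingjie@master .ssh]$ ssh slave1 cat .ssh/id_rsa.pub >> authorized_keys

[liuqingjie@master .ssh]$ ssh slave2 cat .ssh/id_rsa.pub >> authorized_keys

[liuqingjie@master .ssh]$ scp authorized_keys slave1:~/.ssh/

[liuqingjie@master .ssh]$ scp authorized_keys slave2:~/.ssh/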

(3) Download and extract the Hadoop package

It only needs to be installed on the master first; it is copied to the slaves in step (5).

(4) Configure the namenode (master)

Edit hadoop-env.sh (same as in pseudo-distributed mode).

Edit the core-site.xml file:

[liuqingjie@master conf]$ vi core-site.xml 

<configuration>

<property>

<name>fs.default.name</name>

<value>hdfs://master:9000</value>

</property>

</configuration>

Note: fs.default.name is the NameNode's address and port; master can also be written as an IP address.

Edit the hdfs-site.xml file:

<configuration>

<property>

<name>dfs.data.dir</name>

<value>/home/liuqingjie/hadoop-0.20.2/data</value>

</property>

<property>

<name>dfs.replication</name>

<value>2</value>

</property>

</configuration>

Note: the directory /home/liuqingjie/hadoop-0.20.2/data must be created by hand. dfs.replication is the number of block replicas; it is set to 2 here to match the two datanodes.

Edit mapred-site.xml:

<configuration>

<property>

<name>mapred.job.tracker</name>

<value>master:9001</value>

</property>

</configuration>

Edit the masters and slaves files:

[liuqingjie@master conf]$ vi masters 

master

[liuqingjie@master conf]$ vi slaves 

slave1

slave2

(5) Copy Hadoop to each node

[liuqingjie@master ~]$ scp -r ./hadoop-0.20.2 slave1:/home/liuqingjie/

[liuqingjie@master ~]$ scp -r ./hadoop-0.20.2 slave2:/home/liuqingjie/
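If the dfs.data.dir directory was not created before copying, it also needs to exist on each slave (a sketch):

[liuqingjie@master ~]$ ssh slave1 mkdir -p /home/liuqingjie/hadoop-0.20.2/data

[liuqingjie@master ~]$ ssh slave2 mkdir -p /home/liuqingjie/hadoop-0.20.2/data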

(6) Format the namenode

[liuqingjie@master hadoop-0.20.2]$ bin/hadoop namenode -format

15/05/09 01:18:25 INFO namenode.NameNode: STARTUP_MSG: 

/************************************************************

STARTUP_MSG: Starting NameNode

STARTUP_MSG:   host = master/192.168.132.130

STARTUP_MSG:   args = [-format]

STARTUP_MSG:   version = 0.20.2

STARTUP_MSG:   build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-0.20 -r 911707; compiled by 'chrisdo' on Fri Feb 19 08:07:34 UTC 2010

************************************************************/

15/05/09 01:18:25 INFO namenode.FSNamesystem: fsOwner=liuqingjie,liuqingjie

15/05/09 01:18:25 INFO namenode.FSNamesystem: supergroup=supergroup

15/05/09 01:18:25 INFO namenode.FSNamesystem: isPermissionEnabled=true

15/05/09 01:18:26 INFO common.Storage: Image file of size 100 saved in 0 seconds.

15/05/09 01:18:26 INFO common.Storage: Storage directory /tmp/hadoop-liuqingjie/dfs/name has been successfully formatted.

15/05/09 01:18:26 INFO namenode.NameNode: SHUTDOWN_MSG: 

/************************************************************

SHUTDOWN_MSG: Shutting down NameNode at master/192.168.132.130

************************************************************/

(7) Start Hadoop

Before starting, turn off the firewall on every machine; otherwise errors will be reported.

[root@master hadoop-0.20.2]# service iptables stop

iptables: Setting chains to policy ACCEPT: filter          [  OK  ]

iptables: Flushing firewall rules:                         [  OK  ]

iptables: Unloading modules:                               [  OK  ]
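Note that service iptables stop only lasts until the next reboot; on a RHEL/CentOS-style system the firewall can also be disabled permanently (an extra step beyond the original walkthrough):

[root@master hadoop-0.20.2]# chkconfig iptables off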

[liuqingjie@master hadoop-0.20.2]$ bin/start-all.sh 

starting namenode, logging to /home/liuqingjie/hadoop-0.20.2/bin/../logs/hadoop-liuqingjie-namenode-master.out

slave1: starting datanode, logging to /home/liuqingjie/hadoop-0.20.2/bin/../logs/hadoop-liuqingjie-datanode-slave1.out

slave2: starting datanode, logging to /home/liuqingjie/hadoop-0.20.2/bin/../logs/hadoop-liuqingjie-datanode-slave2.out

master: starting secondarynamenode, logging to /home/liuqingjie/hadoop-0.20.2/bin/../logs/hadoop-liuqingjie-secondarynamenode-master.out

starting jobtracker, logging to /home/liuqingjie/hadoop-0.20.2/bin/../logs/hadoop-liuqingjie-jobtracker-master.out

slave2: starting tasktracker, logging to /home/liuqingjie/hadoop-0.20.2/bin/../logs/hadoop-liuqingjie-tasktracker-slave2.out

slave1: starting tasktracker, logging to /home/liuqingjie/hadoop-0.20.2/bin/../logs/hadoop-liuqingjie-tasktracker-slave1.out

Use jps to verify that the daemons started successfully:

[liuqingjie@master hadoop-0.20.2]$ jps

3336 NameNode

3622 Jps

3549 JobTracker

3481 SecondaryNameNode

[liuqingjie@slave1 ~]$ jps

3069 Jps

2996 TaskTracker

2905 DataNode

[root@slave2 hadoop-0.20.2]# jps

2896 DataNode

2982 TaskTracker

3046 Jps
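As a further check beyond jps (a sketch of an optional step), the dfsadmin report should list both datanodes as live:

[liuqingjie@master hadoop-0.20.2]$ bin/hadoop dfsadmin -report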


Original source: http://4649608.blog.51cto.com/4639608/1649842