Original article: http://blog.csdn.net/wanmeilingdu/article/details/51447290
1. Basic Environment Preparation
a. Configure the IP address and other basics on each machine. I'm using three virtual machines, listed below:
master   192.168.149.131   CentOS 6.5 x64
slave1   192.168.149.132   CentOS 6.5 x64
slave2   192.168.149.133   CentOS 6.5 x64
b. Map the hostnames in /etc/hosts on every machine:
192.168.149.131 master
192.168.149.132 slave1
192.168.149.133 slave2
c. Turn off the Linux firewall on each machine.
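On CentOS 6.5 this can be done, for example, like this (my own sketch, not shown in the original post; run on each of the three machines):
service iptables stop    # stop the firewall now
chkconfig iptables off   # and keep it from coming back after a reboot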
d. Install the JDK and set JAVA_HOME; after editing, ~/.bash_profile looks like this:
JAVA_HOME=/usr/java/jdk1.7.0_67/
PATH=$PATH:$HOME/bin:$JAVA_HOME/bin
export PATH
source ~/.bash_profile
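To confirm that the JDK is picked up from the new PATH (a quick check, not part of the original steps):
java -version    # should report the 1.7.0_67 JDK configured above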
e. Set up passwordless SSH from master to the slaves:
ssh-keygen -t rsa      # generate the key pair; press Enter four times to accept the defaults
ssh-copy-id -i slave1  # copy the local public key into slave1's authorized_keys; type yes and enter slave1's login password
ssh-copy-id -i slave2  # copy the local public key into slave2's authorized_keys; type yes and enter slave2's login password
Verify that the setup works:
[root@master ~]# ssh slave1
Last login: Thu May 19 04:36:01 2016 from 192.168.149.1
[root@slave1 ~]# exit
logout
Connection to slave1 closed.
2. Installing ZooKeeper
a. Extract the tarball:
tar -zxvf zookeeper-3.4.6.tar.gz
b. Rename the sample configuration file (in the conf directory):
mv zoo_sample.cfg zoo.cfg
c. Configure zoo.cfg:
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/opt/soft/zookeeper-3.4.6/data
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
#autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
#autopurge.purgeInterval=1
server.1=192.168.149.131:2888:3888
server.2=192.168.149.132:2888:3888
server.3=192.168.149.133:2888:3888
d. Copy ZooKeeper to the other two machines:
scp -r zookeeper-3.4.6/ root@slave1:/opt/soft/
scp -r zookeeper-3.4.6/ root@slave2:/opt/soft/
Note: on slave1 and slave2, change the content of the myid file in the data directory to 2 and 3 respectively, matching the server.N entries in zoo.cfg.
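The note above assumes a data directory with a myid file already exists (the path must match dataDir in zoo.cfg). A minimal sketch of creating it on all three machines from master, assuming the paths used above:
mkdir -p /opt/soft/zookeeper-3.4.6/data && echo 1 > /opt/soft/zookeeper-3.4.6/data/myid
ssh slave1 'mkdir -p /opt/soft/zookeeper-3.4.6/data && echo 2 > /opt/soft/zookeeper-3.4.6/data/myid'
ssh slave2 'mkdir -p /opt/soft/zookeeper-3.4.6/data && echo 3 > /opt/soft/zookeeper-3.4.6/data/myid'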
e. Start ZooKeeper on all three machines and check its status:
bin/zkServer.sh start    # start the server
bin/zkServer.sh status   # check its status
[root@master zookeeper-3.4.6]# bin/zkServer.sh status
JMX enabled by default
Using config: /opt/soft/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: follower
[root@master zookeeper-3.4.6]#

[root@slave1 zookeeper-3.4.6]# bin/zkServer.sh status
JMX enabled by default
Using config: /opt/soft/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: leader
[root@slave1 zookeeper-3.4.6]#

[root@slave2 zookeeper-3.4.6]# bin/zkServer.sh status
JMX enabled by default
Using config: /opt/soft/zookeeper-3.4.6/bin/../conf/zoo.cfg
Mode: follower
[root@slave2 zookeeper-3.4.6]#
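As an extra check (not part of the original steps), you can connect to the ensemble with the bundled client and list the root znode:
bin/zkCli.sh -server master:2181
# inside the shell, `ls /` should show at least the /zookeeper node; type `quit` to leave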
3. Installing and Configuring Hadoop
a. Extract hadoop-2.7.2.tar.gz. I'm using a package I compiled myself on a 64-bit machine (see "Compiling the Hadoop 2.7.2 source on 64-bit CentOS Linux" for how); the compiled tar package can be downloaded here, or you can use the one from the official Hadoop site. The official package works fine apart from a warning (warn) printed at startup.
tar -zxvf hadoop-2.7.2.tar.gz
rm -rf hadoop-2.7.2/share/doc/   # remove the documentation
b. Set JAVA_HOME in hadoop-env.sh:
vi etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/java/jdk1.7.0_67/
c. Configure core-site.xml
vi etc/hadoop/core-site.xml
Add the following inside the <configuration> tag, then save and exit:
<!-- HDFS cluster access address -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://mycluster</value>
</property>
<!-- ZooKeeper ensemble addresses -->
<property>
  <name>ha.zookeeper.quorum</name>
  <value>master:2181,slave1:2181,slave2:2181</value>
</property>
<!-- Hadoop data directory; created automatically when HDFS starts -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/data</value>
</property>
d. Configure hdfs-site.xml:
vi etc/hadoop/hdfs-site.xml
Add the following inside the <configuration> tag, then save and exit:
<!-- HDFS cluster (nameservice) name; must match core-site.xml -->
<property>
  <name>dfs.nameservices</name>
  <value>mycluster</value>
</property>
<!-- the two NameNodes of the cluster -->
<property>
  <name>dfs.ha.namenodes.mycluster</name>
  <value>nn1,nn2</value>
</property>
<!-- RPC address and port of NameNode nn1 -->
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn1</name>
  <value>master:8020</value>
</property>
<!-- RPC address and port of NameNode nn2 -->
<property>
  <name>dfs.namenode.rpc-address.mycluster.nn2</name>
  <value>slave1:8020</value>
</property>
<!-- HTTP address and port of NameNode nn1 -->
<property>
  <name>dfs.namenode.http-address.mycluster.nn1</name>
  <value>master:50070</value>
</property>
<!-- HTTP address and port of NameNode nn2 -->
<property>
  <name>dfs.namenode.http-address.mycluster.nn2</name>
  <value>slave1:50070</value>
</property>
<!-- JournalNode addresses -->
<property>
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://master:8485;slave1:8485;slave2:8485/mycluster</value>
</property>
<!-- client-side failover proxy provider for the HA NameNodes -->
<property>
  <name>dfs.client.failover.proxy.provider.mycluster</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<!-- fencing method used during failover -->
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence</value>
</property>
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/root/.ssh/id_rsa</value>
</property>
<!-- fencing SSH connect timeout -->
<property>
  <name>dfs.ha.fencing.ssh.connect-timeout</name>
  <value>30000</value>
</property>
<!-- directory where the JournalNodes store their edits -->
<property>
  <name>dfs.journalnode.edits.dir</name>
  <value>/opt/hadoop/data</value>
</property>
<!-- enable automatic failover -->
<property>
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>
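Note that sshfence works by having the failover controller ssh into the other NameNode host with the key given in dfs.ha.fencing.ssh.private-key-files, so slave1 (nn2) also needs passwordless SSH back to master. Only keys from master to the slaves were distributed earlier; a sketch of the missing direction, run on slave1 (my addition, not shown in the original post):
ssh-keygen -t rsa        # press Enter through the prompts
ssh-copy-id -i master    # copy slave1's public key into master's authorized_keys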
e. Configure yarn-site.xml (the mv also prepares mapred-site.xml for the next step):
mv etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml
vi etc/hadoop/yarn-site.xml
Add the following inside the <configuration> tag, then save and exit:
<!-- enable ResourceManager HA -->
<property>
  <name>yarn.resourcemanager.ha.enabled</name>
  <value>true</value>
</property>
<!-- YARN cluster id; any name will do -->
<property>
  <name>yarn.resourcemanager.cluster-id</name>
  <value>cluster1</value>
</property>
<!-- the two ResourceManager ids -->
<property>
  <name>yarn.resourcemanager.ha.rm-ids</name>
  <value>rm1,rm2</value>
</property>
<!-- host of ResourceManager rm1 -->
<property>
  <name>yarn.resourcemanager.hostname.rm1</name>
  <value>master</value>
</property>
<!-- host of ResourceManager rm2 -->
<property>
  <name>yarn.resourcemanager.hostname.rm2</name>
  <value>slave1</value>
</property>
<!-- web UI address of rm1 -->
<property>
  <name>yarn.resourcemanager.webapp.address.rm1</name>
  <value>master:8088</value>
</property>
<!-- web UI address of rm2 -->
<property>
  <name>yarn.resourcemanager.webapp.address.rm2</name>
  <value>slave1:8088</value>
</property>
<!-- ZooKeeper ensemble that YARN depends on -->
<property>
  <name>yarn.resourcemanager.zk-address</name>
  <value>master:2181,slave1:2181,slave2:2181</value>
</property>
f. Configure mapred-site.xml:
vi etc/hadoop/mapred-site.xml
Add the following inside the <configuration> tag, then save and exit:
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
g. Copy Hadoop to the other two machines:
scp -r hadoop-2.7.2/ root@slave1:/opt/soft/
scp -r hadoop-2.7.2/ root@slave2:/opt/soft/
h. Add HADOOP_HOME to the environment variables:
vi ~/.bash_profile
After editing, the file looks like this:
JAVA_HOME=/usr/java/jdk1.7.0_67/
PATH=$PATH:$HOME/bin:$JAVA_HOME/bin
HADOOP_HOME=/opt/soft/hadoop-2.7.2/
PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export PATH
source ~/.bash_profile

4. Starting the Cluster
a. Prerequisite: make sure ZooKeeper is already running…
b. Start the JournalNode on all three machines:
hadoop-daemon.sh start journalnode
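Since a non-interactive ssh session does not source ~/.bash_profile, using the absolute path is the simplest way to start it on all three machines from master (a convenience loop of my own, assuming the install path above):
for h in master slave1 slave2; do
  ssh $h '/opt/soft/hadoop-2.7.2/sbin/hadoop-daemon.sh start journalnode'
done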
c. Format the NameNode (on master):
hdfs namenode -format
d. Copy the freshly formatted metadata to the other NameNode:
scp -r /data/dfs/ root@slave1:/data/   # /data is the hadoop.tmp.dir configured in core-site.xml
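As an alternative to copying the directory by hand, the standby NameNode can fetch the metadata itself with hdfs namenode -bootstrapStandby; that command must be run on slave1 while the freshly formatted NameNode is running on master, so it replaces rather than complements the scp above (an optional alternative, not the original author's method):
# run on slave1, with the NameNode on master already started:
hdfs namenode -bootstrapStandby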
e. Format the ZKFC state in ZooKeeper:
hdfs zkfc -formatZK
f. Stop the daemons started above:
stop-all.sh
g. Start the whole cluster:
start-all.sh
h. One of the YARN ResourceManagers has to be started manually (on slave1 in this setup):
yarn-daemon.sh start resourcemanager
i. Check the running Hadoop processes with the jps command.
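A convenient way to check all three machines at once (a helper loop of my own, using the JDK path from earlier since jps may not be on the PATH of a non-login shell); you should see the HDFS, YARN and ZooKeeper daemons configured above (NameNode, DataNode, JournalNode, DFSZKFailoverController, ResourceManager, NodeManager, QuorumPeerMain) spread across the nodes:
for h in master slave1 slave2; do echo "== $h =="; ssh $h /usr/java/jdk1.7.0_67/bin/jps; done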
Original article: http://blog.csdn.net/liyongke89/article/details/51513989