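Preparation before configuration
Environment: CentOS 7.2+
Get a rough picture of how Hadoop and the services needed later relate to one another, and of what each one needs in order to connect to and support the others.
1. Account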
useradd -m hadoop -s /bin/bash
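# -m creates a full home directory (a user folder under /home, populated with the default files)
# -s sets the login shell
# Decide on the account name up front. It is best to use the same account name on every node in the cluster, which saves trouble later
2. Adding sudo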
[root@master local]# visudo
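This is safer and more convenient than editing /etc/sudoers directly, because visudo checks the syntax before saving.
## Allow root to run any commands anywhere
root ALL=(ALL) ALL
hadoop ALL=(ALL) ALL
The account I added is hadoop.
The main commands for handling ownership and permissions are chown and chmod.
e.g.: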
chown -R hadoop:hadoop /usr/local/hadoop
chmod 755 /usr/local/hadoop
3. IP/hostname mapping
Add the entries to /etc/hosts
master is the primary node, the namenode
slave1 and slave2 are datanodes
[root@master local]# cat /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.10.120 master
192.168.10.114 slave1
192.168.10.109 slave2
# IP address on the left, hostname on the right (I am not sure this is the most precise way to put it)
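Since every later step relies on these names resolving the same way on all nodes, a quick check of the file can help. The following is only a sketch: lookup_ip and check_hosts are hypothetical helper names, not part of any tool, and the check reads the hosts file directly rather than asking the system resolver.

```shell
# Print the IP mapped to a hostname in a hosts-format file (first match wins).
# Usage: lookup_ip <hosts-file> <hostname>
lookup_ip() {
    awk -v host="$2" '
        /^[[:space:]]*#/ { next }           # skip comment lines
        {
            for (i = 2; i <= NF; i++)       # fields 2..NF are hostnames/aliases
                if ($i == host) { print $1; found = 1; exit }
        }
        END { exit !found }                 # non-zero status if nothing matched
    ' "$1"
}

# Report the mapping for each hostname; returns non-zero if any is missing.
# The body runs in a subshell so its variables stay local.
# Usage: check_hosts <hosts-file> <hostname>...
check_hosts() (
    file=$1; shift
    status=0
    for h in "$@"; do
        if ip=$(lookup_ip "$file" "$h"); then
            echo "$h -> $ip"
        else
            echo "$h -> MISSING"
            status=1
        fi
    done
    exit "$status"
)

# Example (run on each node):
#   check_hosts /etc/hosts master slave1 slave2
```

Running the same check on every node should print the same three lines; a MISSING entry points at the node whose /etc/hosts still needs editing.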
4. Changing the hostname
(1) hostnamectl set-hostname XXXX
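# After the change, close the current terminal and open a new one; the new hostname will then show up
(2) Edit /etc/hostname directly
5. To build a Hadoop cluster, the virtual machines need to use bridged networking; dual NICs also work.
6. ssh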
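On every node:
yum -y install openssh
ssh-keygen -t rsa    # then just press Enter at every prompt
cd ~/.ssh
# on every node except master, one command per node (e.g. slave1 sends id_rsa01.pub, slave2 sends id_rsa02.pub)
scp id_rsa.pub hadoop@master:/home/hadoop/id_rsa01.pub
scp id_rsa.pub hadoop@master:/home/hadoop/id_rsa02.pub
# on master:
cd ~/.ssh
cat id_rsa.pub >> authorized_keys
cat /home/hadoop/id_rsa01.pub >> /home/hadoop/.ssh/authorized_keys
cat /home/hadoop/id_rsa02.pub >> /home/hadoop/.ssh/authorized_keys
# then send the merged file from master to the other nodes
# (if the file ends up owned by root on the slave, chown it back to hadoop)
cd ~/.ssh
scp authorized_keys root@slave1:/home/hadoop/.ssh
scp authorized_keys root@slave2:/home/hadoop/.ssh
On the first connection you may still be prompted. This is mainly because the target machine is not yet listed in ~/.ssh/known_hosts, so its host key has to be confirmed once.
7. Fixing permissions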
[hadoop@master local]$ chmod 600 ~/.ssh/authorized_keys
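The key-merging and permission steps above can be folded into one small helper. This is a minimal sketch, not the author's procedure: merge_keys is a hypothetical name, and the 700/600 modes are the commonly recommended ones for ~/.ssh and authorized_keys.

```shell
# Append each public key file to <ssh-dir>/authorized_keys, skipping keys that
# are already present, then tighten the permissions.
# The body runs in a subshell so its variables stay local.
# Usage: merge_keys <ssh-dir> <pubkey-file>...
merge_keys() (
    ssh_dir=$1; shift
    auth=$ssh_dir/authorized_keys
    mkdir -p "$ssh_dir" || exit 1
    touch "$auth" || exit 1
    for pub in "$@"; do
        # "|| [ -n "$key" ]" also handles a last line without a trailing newline
        while IFS= read -r key || [ -n "$key" ]; do
            [ -n "$key" ] || continue                      # ignore blank lines
            grep -qxF -- "$key" "$auth" || printf '%s\n' "$key" >> "$auth"
        done < "$pub"
    done
    chmod 700 "$ssh_dir" && chmod 600 "$auth"
)

# Example (on master, as the hadoop user):
#   merge_keys ~/.ssh ~/.ssh/id_rsa.pub ~/id_rsa01.pub ~/id_rsa02.pub
```

Because duplicates are skipped, the helper can be rerun after adding another node without bloating authorized_keys.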
8. Extracting
tar -zxvf xxxxx.tar.gz
unzip xxxxx.zip
# At this point I renamed the extracted directories to drop the version numbers
# e.g. mv hadoop-2.7.3 hadoop
mv jdk1.8.0 java
tar -zxvf jdk-8u111-linux-x64.tar.gz -C /usr/local
tar -zxvf hadoop.tar.gz -C /usr/local
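# extract into /usr/local
9. Environment variables
# /etc/profile: global variables, usable by ordinary users and root alike
# /home/hadoop/.bashrc: per-user variables, usable only by the user whose home directory holds the file. Not even root picks them up, which means that under sudo the environment variables will not be found. In that case you can only change the permissions of the files involved, or add the variables to /etc/profile
export JAVA_HOME=/usr/local/java
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin:$JAVA_HOME/bin
# After editing, reload with: source /etc/profile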
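After reloading the profile it is worth confirming that both variables are visible in the current shell and point at real directories. The helper below is a sketch with a hypothetical name, check_hadoop_env; it only inspects the two variables exported above.

```shell
# Verify that JAVA_HOME and HADOOP_HOME are set and point at directories.
# Returns non-zero if either check fails.
# The body runs in a subshell so its variables stay local.
check_hadoop_env() (
    status=0
    for name in JAVA_HOME HADOOP_HOME; do
        eval "value=\${$name:-}"            # look up the variable named in $name
        if [ -z "$value" ]; then
            echo "$name is not set"
            status=1
        elif [ ! -d "$value" ]; then
            echo "$name=$value is not a directory"
            status=1
        else
            echo "$name=$value OK"
        fi
    done
    exit "$status"
)

# Example:
#   source /etc/profile && check_hadoop_env
```

Running it once as hadoop and once under sudo shows directly whether the variables survive in both contexts.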
hadoop2.7.3: https://pan.baidu.com/s/1kXaLrbp password: q0ok
jdk1.8.0: https://pan.baidu.com/s/1htxBMJy password: lg9m
10. Configuration
cd hadoop/etc/hadoop/
hadoop-env.sh
# change the line export JAVA_HOME=${JAVA_HOME} to:
export JAVA_HOME=/usr/local/java
slaves
slave1
slave2
# datanode hostnames; IP addresses can be used directly as well
core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://master:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>file:/usr/local/hadoop/tmp</value>
</property>
</configuration>
hdfs-site.xml
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>master:50090</value>
</property>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop/tmp/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop/tmp/dfs/data</value>
</property>
mapred-site.xml
mv mapred-site.xml.template mapred-site.xml
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
# The following is optional; leaving it out does not affect startup
#<property>
#<name>mapreduce.jobhistory.address</name>
#<value>master:10020</value>
#</property>
#<property>
#<name>mapreduce.jobhistory.webapp.address</name>
#<value>master:19888</value>
#</property>
yarn-site.xml
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
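With four XML files edited by hand, a quick way to read a value back can catch typos in names or values before formatting the namenode. The helper below is a rough sketch under stated assumptions: get_prop is a hypothetical name, it does plain text matching rather than real XML parsing, and it would also match a property inside an XML comment.

```shell
# Print the <value> that follows <name>NAME</name> in a Hadoop *-site.xml file.
# Returns non-zero if the property is not found.
# Usage: get_prop <xml-file> <property-name>
get_prop() {
    # Join the file into one line so one-line and multi-line layouts both work.
    tr -d '\n' < "$1" | awk -v name="$2" '
    {
        key = "<name>" name "</name>"
        i = index($0, key)                  # literal match, dots are not wildcards
        if (i == 0) exit
        rest = substr($0, i + length(key))
        s = index(rest, "<value>")
        e = index(rest, "</value>")
        if (s == 0 || e == 0 || e < s) exit
        print substr(rest, s + 7, e - s - 7)   # 7 = length of "<value>"
        found = 1
    }
    END { exit !found }'
}

# Example (paths as used in this guide):
#   get_prop /usr/local/hadoop/etc/hadoop/core-site.xml fs.defaultFS
```

Once Hadoop is on the PATH, `hdfs getconf -confKey fs.defaultFS` reports the value Hadoop itself resolves, which is the more authoritative check.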
11. hdfs namenode -format
# Run the command above on master. It formats the NameNode, and it is also where mistakes in the XML tags show up
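12. Firewall
Before starting the cluster, turn off the CentOS firewall:
service iptables stop
chkconfig iptables off
# The two commands above manage the iptables service. On CentOS 7 the default firewall is firewalld, for which the equivalents are:
systemctl stop firewalld
systemctl disable firewalld
Turn off the Ubuntu firewall: ufw disable
Check the firewall status: ufw status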
scp -r /usr/local/hadoop hadoop@slave1:/usr/local/
scp -r /usr/local/hadoop hadoop@slave2:/usr/local/
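This direct copy is bound to fail (presumably because the hadoop user cannot write to /usr/local on the slaves). Copy to /home/hadoop/Desktop first, then move it into /usr/local.
Also remember to change the owner afterwards with chown -R.
13. Checking the cluster status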
hdfs dfsadmin -report
[hadoop@master Desktop]$ hdfs dfsadmin -report
Safe mode is ON
Configured Capacity: 38002491392 (35.39 GB)
Present Capacity: 25061294080 (23.34 GB)
DFS Remaining: 25061277696 (23.34 GB)
DFS Used: 16384 (16 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 11
Missing blocks (with replication factor 1): 6
-------------------------------------------------
Live datanodes (2):
Name: 192.168.10.114:50010 (slave1)
Hostname: slave1
Decommission Status : Normal
Configured Capacity: 19001245696 (17.70 GB)
DFS Used: 8192 (8 KB)
Non DFS Used: 6133903360 (5.71 GB)
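The report is long, and the lines that matter most for a first check are the safe mode flag, the number of live datanodes, and the missing block count. The filter below is a sketch that assumes the report layout shown above; report_summary is a hypothetical name, and other Hadoop versions may word these lines differently.

```shell
# Condense `hdfs dfsadmin -report` output (read from stdin) into one line.
report_summary() {
    awk '
        /^Safe mode is ON/    { safe = "ON" }
        /^Live datanodes \(/  {
            n = $0
            sub(/^[^(]*\(/, "", n)          # drop everything up to "("
            sub(/\).*/, "", n)              # drop ")" and what follows
            live = n
        }
        /^Missing blocks:/    { missing = $3 }
        END {
            printf "safe_mode=%s live_datanodes=%s missing_blocks=%s\n",
                   (safe == "" ? "OFF" : safe),
                   (live == "" ? 0 : live),
                   (missing == "" ? 0 : missing)
        }'
}

# Example:
#   hdfs dfsadmin -report | report_summary
```

For the report above this gives safe_mode=ON live_datanodes=2 missing_blocks=11, which makes it easy to see that both datanodes registered but the cluster has not yet left safe mode.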
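14. wordcount
hdfs dfs -mkdir /input
# create the test files in /usr/local so that the -put path below matches
echo "hello,world.hello,hadoop" >> test1.txt
echo "hello,world.hello,hadoop" >> test2.txt
hdfs dfs -put /usr/local/test*.txt /input
# the jar file name carries the Hadoop version, so adjust it to the release you installed (2.7.3 in this guide)
hadoop jar /usr/local/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.3.jar wordcount /input /output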
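To know what to expect in /output before running the job, the count can be previewed locally. This is a sketch, not the MapReduce job itself: local_wordcount is a hypothetical helper, and it assumes the bundled wordcount example splits on whitespace only, so punctuation such as commas and dots stays inside a token.

```shell
# Count whitespace-separated tokens across the given files and print
# "token<TAB>count" lines sorted by token.
# Usage: local_wordcount <file>...
local_wordcount() {
    cat "$@" |
        tr -s ' \t\r\f' '\n' |              # one token per line
        grep -v '^$' |                      # drop empty lines
        sort | uniq -c |
        awk '{ print $2 "\t" $1 }'          # reorder to token, then count
}

# Example:
#   local_wordcount /usr/local/test1.txt /usr/local/test2.txt
```

Under that assumption, each test file contributes a single token, so the whole string hello,world.hello,hadoop appears once with a count of 2 rather than as separate words. Putting spaces between the words in the test files would give a more interesting result.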
Official Hadoop documentation:
http://hadoop.apache.org/docs/r1.0.4/cn/
If you spot any problems with the above, suggestions for corrections are welcome.