Installing Hadoop
1. Change the hostname and reboot
a) Run vim /etc/sysconfig/network
On each machine, change the hostname to linux01, linux02, or linux03 respectively, then save and exit.
b) Run vim /etc/ssh/sshd_config
Uncomment the following lines (remove the leading #) so that public-key authentication is enabled:
RSAAuthentication yes
PubkeyAuthentication yes
AuthorizedKeysFile .ssh/authorized_keys
c) After making the changes, restart the SSH service:
service sshd restart
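On CentOS 6 style systems the hostname is stored in the HOSTNAME= line of /etc/sysconfig/network. As a rough sketch, the edit from step a) could be scripted as follows. The function name set_hostname_entry is made up for illustration, and GNU sed is assumed.

```shell
# Hypothetical helper: rewrite (or add) the HOSTNAME= line in a network config file.
# Assumes GNU sed (for -i) and the CentOS 6 style /etc/sysconfig/network format.
set_hostname_entry() {
  file="$1"
  name="$2"
  if grep -q '^HOSTNAME=' "$file"; then
    # Replace the existing entry in place
    sed -i "s/^HOSTNAME=.*/HOSTNAME=$name/" "$file"
  else
    # No entry yet, so append one
    echo "HOSTNAME=$name" >> "$file"
  fi
}
```

On the first machine this would be called as root with set_hostname_entry /etc/sysconfig/network linux01, followed by a reboot.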
2. Edit /etc/hosts
Add one line per node in the following format:
<IP of linux01> linux01
<IP of linux02> linux02
<IP of linux03> linux03
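For illustration, the entries could be prepared in a scratch file and reviewed before they are appended to /etc/hosts. The 192.168.1.x addresses below are placeholders, not values from this setup.

```shell
# Hypothetical addresses (192.168.1.101-103); replace them with the real IPs of your nodes.
# The entries go to a scratch file first so they can be reviewed before being
# appended to /etc/hosts.
hosts_fragment="$(mktemp)"
cat > "$hosts_fragment" <<'EOF'
192.168.1.101 linux01
192.168.1.102 linux02
192.168.1.103 linux03
EOF

# Report how many entries each node name has in the fragment (one is expected).
for host in linux01 linux02 linux03; do
  count="$(grep -c -w "$host" "$hosts_fragment" || true)"
  echo "$host: $count"
done
```

Once the fragment looks right, cat "$hosts_fragment" >> /etc/hosts (as root, on every machine) adds the entries.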
3. Set up passwordless SSH login
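These notes do not list the commands for this step. One common approach, given here as an assumption rather than as the procedure actually used, is to generate a key pair with ssh-keygen and push the public key to every node with ssh-copy-id. The helper names run and setup_passwordless_ssh and the DRY_RUN switch are made up for this sketch; the root user matches the scp command used later in these notes.

```shell
# Sketch: run on linux01 as the user that will start Hadoop.
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

setup_passwordless_ssh() {
  # Generate an RSA key pair with an empty passphrase if none exists yet
  [ -f "$HOME/.ssh/id_rsa" ] || run ssh-keygen -t rsa -N "" -f "$HOME/.ssh/id_rsa"
  # Install the public key on every node, including linux01 itself
  for host in linux01 linux02 linux03; do
    run ssh-copy-id "root@$host"
  done
}
```

Calling setup_passwordless_ssh prints the commands for review. With DRY_RUN=0 they are executed, and ssh-copy-id asks for each node's password once.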
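4. Install the JDK, configure the environment, and verify the installation
Create a java directory under /usr/local, copy the downloaded JDK 1.8 archive into it, and extract it:
tar -zxvf <JDK archive name>
Configure the Java environment variables.
Open the file with vim ~/.bashrc and add the following: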
export JAVA_HOME=/usr/local/java/jdk1.8.0_101
export JRE_HOME=${JAVA_HOME}/jre
export HADOOP_HOME=/usr/local/hadoop/hadoop-2.6.4
export ZOOKEEPER_HOME=/usr/local/zookeeper/zookeeper-3.4.8
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
export SCALA_HOME=/usr/local/scala/scala-2.10.4/
export SPARK_HOME=/usr/local/spark/spark-1.6.0-bin-hadoop2.6
#export FLINK_HOME=/usr/local/flink/flink-0.9.0
export HIVE_HOME=/usr/local/hive/apache-hive-1.2.1-bin
export HIVE_CONF_DIR=${HIVE_HOME}/conf
export HBASE_HOME=/usr/local/hbase/hbase-1.2.5
export MAVEN_HOME=/usr/local/spark/apache-maven-3.3.9
export ANT_HOME=/usr/local/spark/apache-ant-1.9.6
export FLUME_HOME=/usr/local/flume/apache-flume-1.7.0-bin
export FLUME_CONF_DIR=${FLUME_HOME}/conf
export KAFKA_HOME=/usr/local/kafka/kafka_2.10-0.9.0.1
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib:${HIVE_HOME}/lib
export PATH=/usr/local/eclipse/eclipse:/usr/local/git/libexec/git-core:/usr/local/ssl/bin:/usr/local/curl/bin:/usr/local/git/bin:/usr/local/spark/git-1.8.2.3/spark/sbt:/usr/local/idea-IC-162.1121.32/bin:${ANT_HOME}/bin:${MAVEN_HOME}/bin:${JAVA_HOME}/bin:${SPARK_HOME}/bin:${SPARK_HOME}/sbin:${SCALA_HOME}/bin:${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:${HIVE_HOME}/bin:${HBASE_HOME}/bin:${FLINK_HOME}/bin:${ZOOKEEPER_HOME}/bin:${FLUME_HOME}/bin:${KAFKA_HOME}/bin:$PATH
Then run source ~/.bashrc to apply the configuration.
Run java -version to check the JDK version.
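If java -version fails, a quick way to narrow down the cause is to check that JAVA_HOME points at a directory that really contains the java binary. The function below is a small sketch; its name check_java_home is not part of any tool.

```shell
# Hypothetical helper: verify that $JAVA_HOME/bin/java exists and is executable.
check_java_home() {
  if [ -x "${JAVA_HOME:-}/bin/java" ]; then
    echo "ok: $JAVA_HOME/bin/java"
  else
    echo "missing: ${JAVA_HOME:-<unset>}/bin/java" >&2
    return 1
  fi
}
```

After source ~/.bashrc, running check_java_home followed by java -version shows whether the path in .bashrc matches the directory that tar actually created.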
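5. Install Hadoop
a) Create a hadoop directory under /usr/local, put the downloaded Hadoop package in it, and extract it: tar -zxvf <Hadoop archive name>
b) Edit the configuration files, located under /usr/local/hadoop/<Hadoop version>/etc/hadoop
The files that need to be configured are listed below.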
- 1. Edit the slaves file so that it contains:
linux01
linux02
linux03
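The same file can also be written from the shell. This is only a sketch, and write_slaves is a made-up name.

```shell
# Hypothetical helper: write the three worker hostnames, one per line,
# into the slaves file of the given Hadoop configuration directory.
write_slaves() {
  conf_dir="$1"
  printf '%s\n' linux01 linux02 linux03 > "$conf_dir/slaves"
}
```

With the environment from ~/.bashrc loaded, write_slaves "$HADOOP_CONF_DIR" overwrites the default slaves file.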
- 2. Edit yarn-env.sh and add the Java environment settings:
# some Java parameters
export HADOOP_PREFIX=/usr/local/hadoop/hadoop-2.6.4
export HADOOP_COMMON_HOME=${HADOOP_PREFIX}
export HADOOP_HDFS_HOME=${HADOOP_PREFIX}
export PATH=$PATH:$HADOOP_PREFIX/bin
export PATH=$PATH:$HADOOP_PREFIX/sbin
export HADOOP_MAPRED_HOME=${HADOOP_PREFIX}
export YARN_HOME=${HADOOP_PREFIX}
export HADOOP_CONF_HOME=${HADOOP_PREFIX}/etc/hadoop
export YARN_CONF_DIR=${HADOOP_PREFIX}/etc/hadoop
export JAVA_HOME=/usr/local/java/jdk1.8.0_101
- 3. Edit yarn-site.xml as follows:
<property>
<name>yarn.resourcemanager.hostname</name>
<value>linux01</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
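In every *-site.xml file the property blocks have to sit inside the single <configuration> root element. As a sketch, the complete yarn-site.xml could be written from the shell like this, assuming the ResourceManager runs on linux01. The function name write_yarn_site is made up.

```shell
# Hypothetical helper: write a complete yarn-site.xml into the given directory.
# Assumes linux01 hosts the ResourceManager.
write_yarn_site() {
  conf_dir="$1"
  cat > "$conf_dir/yarn-site.xml" <<'EOF'
<?xml version="1.0"?>
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>linux01</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
EOF
}
```

The other site files (mapred-site.xml, hdfs-site.xml, core-site.xml) follow the same layout: one <configuration> element wrapping all <property> blocks.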
- 4. Edit mapred-site.xml as follows:
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>mapreduce.jobtracker.address</name>
<value>linux01:9001</value>
<final>true</final>
</property>
<property>
<name>mapreduce.map.memory.mb</name>
<value>768</value>
</property>
<property>
<name>mapreduce.map.java.opts</name>
<value>-Xmx512M</value>
</property>
<property>
<name>mapreduce.reduce.memory.mb</name>
<value>1036</value>
</property>
<property>
<name>mapreduce.reduce.java.opts</name>
<value>-Xmx512M</value>
</property>
<property>
<name>mapreduce.task.io.sort.mb</name>
<value>512</value>
</property>
<property>
<name>mapreduce.task.io.sort.factor</name>
<value>100</value>
</property>
<property>
<name>mapreduce.reduce.shuffle.parallelcopies</name>
<value>50</value>
</property>
<property>
<name>mapred.system.dir</name>
<value>file:/usr/local/hadoop/workspace/mapred/system</value>
<final>true</final>
</property>
<property>
<name>mapred.local.dir</name>
<value>file:/usr/local/hadoop/workspace/mapred/local</value>
<final>true</final>
</property>
- 5. Edit hdfs-site.xml as follows:
<property>
<name>dfs.replication</name>
<value>3</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/usr/local/hadoop/workspace/hdfs/name</value>
<final>true</final>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/usr/local/hadoop/workspace/hdfs/data</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
- 6. Edit hadoop-env.sh
Set JAVA_HOME here, for example export JAVA_HOME=/usr/local/java/jdk1.8.0_101
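A possible way to script this edit is sketched below. The function name set_java_home is made up, GNU sed is assumed, and the JDK path is the one used in ~/.bashrc above.

```shell
# Hypothetical helper: point the JAVA_HOME line of an env script at a fixed JDK path.
# Assumes GNU sed (for -i); appends the line if the script has no JAVA_HOME export yet.
set_java_home() {
  env_file="$1"
  java_home="$2"
  if grep -q '^export JAVA_HOME=' "$env_file"; then
    sed -i "s|^export JAVA_HOME=.*|export JAVA_HOME=$java_home|" "$env_file"
  else
    echo "export JAVA_HOME=$java_home" >> "$env_file"
  fi
}
```

For example: set_java_home "$HADOOP_CONF_DIR/hadoop-env.sh" /usr/local/java/jdk1.8.0_101. Using an absolute path is the safer choice because the daemons are started over SSH, where variables from ~/.bashrc may not be loaded.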
- 7. Edit core-site.xml as follows:
<property>
<name>io.native.lib.available</name>
<value>true</value>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://linux01:9000</value>
<final>true</final>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop/workspace/tmp</value>
</property>
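Hand-edited XML is easy to get subtly wrong (a mistyped closing tag, an empty <name>). Before formatting HDFS, a rough sanity check such as the one below can catch the most common slips. It is a plain-text heuristic using grep, not a real XML validator, and check_site_xml is a made-up name.

```shell
# Hypothetical helper: basic text-level checks on a Hadoop *-site.xml file.
check_site_xml() {
  file="$1"
  # Count opening and closing property tags ("|| true" keeps a zero count from failing)
  opens="$(grep -c '<property>' "$file" || true)"
  closes="$(grep -c '</property>' "$file" || true)"
  if [ "$opens" -ne "$closes" ]; then
    echo "unbalanced <property> tags in $file" >&2
    return 1
  fi
  # An empty <name></name> element is almost always a leftover template
  if grep -q '<name>[[:space:]]*</name>' "$file"; then
    echo "empty <name> in $file" >&2
    return 1
  fi
  echo "ok: $file ($opens properties)"
}
```

It can be run over all four files in one loop, for example: for f in core-site.xml hdfs-site.xml mapred-site.xml yarn-site.xml; do check_site_xml "$HADOOP_CONF_DIR/$f"; done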
c) Format HDFS: hdfs namenode -format
Stop the firewall: service iptables stop
d) Distribute the files to linux02 and linux03:
scp -r /usr/local/hadoop root@linux02:/usr/local
scp -r /usr/local/hadoop root@linux03:/usr/local
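With more nodes, the copy is easier to manage as a loop. The sketch below only prints the scp commands so they can be checked first. The function name distribute_cmds is made up, and the worker names are the ones from the slaves file.

```shell
# Hypothetical helper: print one recursive scp command per worker node.
# -r is required because /usr/local/hadoop is a directory.
distribute_cmds() {
  for host in linux02 linux03; do
    echo "scp -r /usr/local/hadoop root@$host:/usr/local"
  done
}
```

Running distribute_cmds shows the commands, and distribute_cmds | sh executes them on linux01.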
e) Run start-all.sh, then use jps to check the running processes.
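Since linux01 is also listed in slaves, jps on linux01 would normally be expected to show NameNode, SecondaryNameNode, ResourceManager, DataNode, and NodeManager, while linux02 and linux03 should show DataNode and NodeManager. This is an expectation based on the configuration above, not output recorded from this cluster. A small helper (the name check_daemons is made up) can compare jps output against such a list:

```shell
# Hypothetical helper: read jps output on stdin and report which of the
# daemon names given as arguments are not running.
check_daemons() {
  jps_output="$(cat)"
  missing=0
  for daemon in "$@"; do
    # -w matches whole words, so "NameNode" does not match "SecondaryNameNode"
    if ! printf '%s\n' "$jps_output" | grep -qw "$daemon"; then
      echo "missing: $daemon"
      missing=1
    fi
  done
  return "$missing"
}
```

On linux01: jps | check_daemons NameNode SecondaryNameNode ResourceManager DataNode NodeManager. On a worker: jps | check_daemons DataNode NodeManager. A missing daemon usually means its log under $HADOOP_HOME/logs is the next place to look.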