
Hadoop Environment Setup



Configure a Static IP on Linux

1. Edit /etc/sysconfig/network-scripts/ifcfg-ens33
[root@192 ~]#  vi /etc/sysconfig/network-scripts/ifcfg-ens33

TYPE="Ethernet"
BOOTPROTO="static" # static IP
DEFROUTE="yes"
PEERDNS="yes"
PEERROUTES="yes"
IPV4_FAILURE_FATAL="no"
IPV6INIT="yes"
IPV6_AUTOCONF="yes"
IPV6_DEFROUTE="yes"
IPV6_PEERDNS="yes"
IPV6_PEERROUTES="yes"
IPV6_FAILURE_FATAL="no"
IPV6_ADDR_GEN_MODE="stable-privacy"
NAME="ens33"
UUID="a95ef74c-c9df-4d72-bae5-5820a50e6228"
DEVICE="ens33"
ONBOOT="yes"
IPADDR=192.168.95.128  # IP address
GATEWAY=192.168.95.2  # gateway
NETMASK=255.255.255.0  # netmask
DNS1=180.76.76.76   # DNS server
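
For the new static address to take effect, the network service has to be restarted. A minimal sketch, assuming CentOS 7 (which this guide appears to use, given the later systemctl and firewalld commands):

[root@192 ~]# systemctl restart network # re-read the ifcfg files and apply the static IP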
2. Test network connectivity
[root@192 ~]# ping www.baidu.com
PING www.baidu.com (14.215.177.38) 56(84) bytes of data.
64 bytes from 14.215.177.38 (14.215.177.38): icmp_seq=1 ttl=128 time=10.8 ms
64 bytes from 14.215.177.38 (14.215.177.38): icmp_seq=2 ttl=128 time=10.7 ms
64 bytes from 14.215.177.38 (14.215.177.38): icmp_seq=3 ttl=128 time=10.2 ms
64 bytes from 14.215.177.38 (14.215.177.38): icmp_seq=4 ttl=128 time=9.48 ms
64 bytes from 14.215.177.38 (14.215.177.38): icmp_seq=5 ttl=128 time=9.93 ms
^C
--- www.baidu.com ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4008ms
rtt min/avg/max/mdev = 9.485/10.249/10.850/0.512 ms

Change the Hostname

1. hostname
[root@192 ~]# hostname
192.168.95.128
[root@192 ~]# hostnamectl set-hostname hadoop-senior01.ibeifeng.com # change the hostname
2. Configure the hosts file on the local Windows machine

C:\Windows\System32\drivers\etc\hosts

## hadoop-senior
192.168.95.128   hadoop-senior01.ibeifeng.com hadoop-senior01
3. Configure the network mapping
[root@192 ~]# vi /etc/hosts

# 127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
# ::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.95.128   hadoop-senior01.ibeifeng.com hadoop-senior01
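
As a quick check that the mapping works (a sketch, not from the original), the short name should now resolve to 192.168.95.128:

[root@192 ~]# ping -c 3 hadoop-senior01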
4. Reboot

init 6

Create a Regular User

[root@hadoop-senior01 ~]# useradd beifeng
[root@hadoop-senior01 ~]# echo '123456' | passwd --stdin beifeng
[root@hadoop-senior01 ~]# su - beifeng

Configure sudo Privileges

[root@hadoop-senior01 ~]# visudo
## Allow root to run any commands anywhere

root    ALL=(ALL)       ALL    # find this line
beifeng ALL=(ALL)       ALL    # add this line below it

Method 2

su -
echo 'beifeng ALL=(ALL) ALL' >> /etc/sudoers

Set Up the Environment

1. Plan the directory layout

[beifeng@hadoop-senior01 opt]$ sudo rm -rf ./* # remove everything under /opt
[beifeng@hadoop-senior01 opt]$ sudo mkdir software # create the needed directories
[beifeng@hadoop-senior01 opt]$ sudo mkdir modules
[beifeng@hadoop-senior01 opt]$ sudo mkdir datas
[beifeng@hadoop-senior01 opt]$ sudo mkdir tools
[beifeng@hadoop-senior01 opt]$ ll
total 0
drwxr-xr-x. 2 root root 6 Nov 10 07:25 datas
drwxr-xr-x. 2 root root 6 Nov 10 07:25 modules
drwxr-xr-x. 2 root root 6 Nov 10 07:24 software
drwxr-xr-x. 2 root root 6 Nov 10 07:25 tools
[beifeng@hadoop-senior01 opt]$ sudo chown -R beifeng:beifeng * # change ownership
[beifeng@hadoop-senior01 opt]$ ll
total 0
drwxr-xr-x. 2 beifeng beifeng 6 Nov 10 07:25 datas
drwxr-xr-x. 2 beifeng beifeng 6 Nov 10 07:25 modules
drwxr-xr-x. 2 beifeng beifeng 6 Nov 10 07:24 software
drwxr-xr-x. 2 beifeng beifeng 6 Nov 10 07:25 tools

2. Install the rz tool (to upload local files)

[beifeng@hadoop-senior01 opt]$ sudo yum -y install lrzsz
[beifeng@hadoop-senior01 ~]$ cd /opt/software/ # go to the software directory
[beifeng@hadoop-senior01 software]$ rz # upload a local file to this directory
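
If rz is not an option (it needs a ZMODEM-capable terminal such as SecureCRT or Xshell), scp from the local machine works as well. A sketch, assuming the two archives used below sit in the current directory on the local machine:

scp jdk-7u67-linux-x64.tar.gz hadoop-2.5.0.tar.gz beifeng@192.168.95.128:/opt/software/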

3. Extract the archives

[beifeng@hadoop-senior01 software]$ tar -zxf jdk-7u67-linux-x64.tar.gz -C /opt/modules/
[root@hadoop-senior01 software]# tar -zxf hadoop-2.5.0.tar.gz -C /opt/modules/

4. Configure the JAVA environment variables

[beifeng@hadoop-senior01 ~]$ sudo vi /etc/profile
#JAVA_HOME
export JAVA_HOME=/opt/modules/jdk1.7.0_67
export PATH=$JAVA_HOME/bin:$PATH

Method 2

su -
echo '#JAVA_HOME
export JAVA_HOME=/opt/modules/jdk1.7.0_67
export PATH=$JAVA_HOME/bin:$PATH' >> /etc/profile
[beifeng@hadoop-senior01 jdk1.7.0_67]$ su -
[root@hadoop-senior01 ~]# source /etc/profile # reload the profile
[root@hadoop-senior01 ~]# exit
[beifeng@hadoop-senior01 ~]$ java -version # check the version
java version "1.7.0_67"
Java(TM) SE Runtime Environment (build 1.7.0_67-b01)
Java HotSpot(TM) 64-Bit Server VM (build 24.65-b04, mixed mode)

5. Delete the doc directory (English documentation only, not needed here)

[beifeng@hadoop-senior01 hadoop-2.5.0]$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3        38G  7.1G   31G  19% /
devtmpfs        474M     0  474M   0% /dev
tmpfs           489M   84K  489M   1% /dev/shm
tmpfs           489M  7.2M  482M   2% /run
tmpfs           489M     0  489M   0% /sys/fs/cgroup
/dev/sda1       297M  152M  146M  51% /boot
tmpfs            98M   16K   98M   1% /run/user/42
tmpfs            98M     0   98M   0% /run/user/0
[beifeng@hadoop-senior01 share]$ rm -rf doc/
[beifeng@hadoop-senior01 share]$ df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3        38G  5.6G   33G  15% /
devtmpfs        474M     0  474M   0% /dev
tmpfs           489M   84K  489M   1% /dev/shm
tmpfs           489M  7.2M  482M   2% /run
tmpfs           489M     0  489M   0% /sys/fs/cgroup
/dev/sda1       297M  152M  146M  51% /boot
tmpfs            98M   16K   98M   1% /run/user/42
tmpfs            98M     0   98M   0% /run/user/0

View File Paths

[root@hadoop-senior01 etc]# pwd # print the current directory
/opt/modules/hadoop-2.5.0/etc
[root@hadoop-senior01 hadoop-2.5.0]# ls | sed "s:^:`pwd`/:" # prefix every entry with the absolute path
/opt/modules/hadoop-2.5.0/bin
/opt/modules/hadoop-2.5.0/etc
/opt/modules/hadoop-2.5.0/include
/opt/modules/hadoop-2.5.0/lib
/opt/modules/hadoop-2.5.0/libexec
/opt/modules/hadoop-2.5.0/sbin
/opt/modules/hadoop-2.5.0/share
[root@hadoop-senior01 modules]# find /opt/modules/hadoop-2.5.0/etc/hadoop/ # find lists the absolute paths of everything under the directory
/opt/modules/hadoop-2.5.0/etc/hadoop/
/opt/modules/hadoop-2.5.0/etc/hadoop/capacity-scheduler.xml
/opt/modules/hadoop-2.5.0/etc/hadoop/configuration.xsl
/opt/modules/hadoop-2.5.0/etc/hadoop/container-executor.cfg
/opt/modules/hadoop-2.5.0/etc/hadoop/core-site.xml
/opt/modules/hadoop-2.5.0/etc/hadoop/hadoop-env.cmd
/opt/modules/hadoop-2.5.0/etc/hadoop/hadoop-env.sh
/opt/modules/hadoop-2.5.0/etc/hadoop/hadoop-metrics.properties
/opt/modules/hadoop-2.5.0/etc/hadoop/hadoop-metrics2.properties
/opt/modules/hadoop-2.5.0/etc/hadoop/hadoop-policy.xml
/opt/modules/hadoop-2.5.0/etc/hadoop/hdfs-site.xml
/opt/modules/hadoop-2.5.0/etc/hadoop/httpfs-env.sh
/opt/modules/hadoop-2.5.0/etc/hadoop/httpfs-log4j.properties
/opt/modules/hadoop-2.5.0/etc/hadoop/httpfs-signature.secret
/opt/modules/hadoop-2.5.0/etc/hadoop/httpfs-site.xml
/opt/modules/hadoop-2.5.0/etc/hadoop/log4j.properties
/opt/modules/hadoop-2.5.0/etc/hadoop/mapred-env.cmd
/opt/modules/hadoop-2.5.0/etc/hadoop/mapred-env.sh
/opt/modules/hadoop-2.5.0/etc/hadoop/mapred-queues.xml.template
/opt/modules/hadoop-2.5.0/etc/hadoop/mapred-site.xml.template
/opt/modules/hadoop-2.5.0/etc/hadoop/slaves
/opt/modules/hadoop-2.5.0/etc/hadoop/ssl-client.xml.example
/opt/modules/hadoop-2.5.0/etc/hadoop/ssl-server.xml.example
/opt/modules/hadoop-2.5.0/etc/hadoop/yarn-env.cmd
/opt/modules/hadoop-2.5.0/etc/hadoop/yarn-env.sh
/opt/modules/hadoop-2.5.0/etc/hadoop/yarn-site.xml

Configure HDFS, Start It, and Test Reading/Writing Files

1. Set the Java installation directory

Note: set the Java installation path for the Hadoop, YARN, and MapReduce modules in:
etc/hadoop/hadoop-env.sh
etc/hadoop/mapred-env.sh
etc/hadoop/yarn-env.sh

[root@hadoop-senior01 ~]# echo ${JAVA_HOME} # check the location
/opt/modules/jdk1.7.0_67
# The java implementation to use.
export JAVA_HOME=${JAVA_HOME} # replace this line with the actual installation path
export JAVA_HOME=/opt/modules/jdk1.7.0_67
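
A quick way to confirm all three env scripts picked up the change (a sketch; run from the hadoop-2.5.0 directory):

[root@hadoop-senior01 hadoop-2.5.0]# grep -n "^export JAVA_HOME" etc/hadoop/hadoop-env.sh etc/hadoop/mapred-env.sh etc/hadoop/yarn-env.sh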
[root@hadoop-senior01 hadoop-2.5.0]# bin/hadoop # show the hadoop script usage
Usage: hadoop [--config confdir] COMMAND
       where COMMAND is one of:
  fs                   run a generic filesystem user client
  version              print the version
  jar <jar>            run a jar file
  checknative [-a|-h]  check native hadoop and compression libraries availability
  distcp <srcurl> <desturl> copy file or directories recursively
  archive -archiveName NAME -p <parent path> <src>* <dest> create a hadoop archive
  classpath            prints the class path needed to get the
                       Hadoop jar and the required libraries
  daemonlog            get/set the log level for each daemon
 or
  CLASSNAME            run the class named CLASSNAME

2. Configure the HDFS-related XML properties

core-site.xml
Note: configures the NameNode address and its RPC port.
fs.defaultFS is the default filesystem URI.
etc/hadoop/core-site.xml

<property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop-senior01.ibeifeng.com:8020</value>
</property>

Specify where Hadoop stores its runtime files:

[beifeng@hadoop-senior01 hadoop-2.5.0]$ mkdir -p data/tmp # create the temp data directory
<property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/modules/hadoop-2.5.0/data/tmp</value>
</property>

slaves

hadoop-senior01.ibeifeng.com

hdfs-site.xml
Number of replicas per block; a pseudo-distributed setup has only one DataNode, so the replication factor is 1.

<property>
    <name>dfs.replication</name>
    <value>1</value>
</property>

3. Format the HDFS filesystem

[beifeng@hadoop-senior01 hadoop-2.5.0]$ bin/hdfs namenode -format # format the NameNode
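
If the format succeeded, the NameNode metadata directory should now exist under the hadoop.tmp.dir configured above. A quick check, assuming the paths used in this guide:

[beifeng@hadoop-senior01 hadoop-2.5.0]$ ls data/tmp/dfs/name/current/ # should list VERSION and an fsimage file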

4. Start HDFS and test reading/writing files

[beifeng@hadoop-senior01 hadoop-2.5.0]$ jps
47713 Jps
[beifeng@hadoop-senior01 hadoop-2.5.0]$ sbin/hadoop-daemon.sh start namenode
starting namenode, logging to /opt/modules/hadoop-2.5.0/logs/hadoop-beifeng-namenode-hadoop-senior01.ibeifeng.com.out
[beifeng@hadoop-senior01 hadoop-2.5.0]$ sbin/hadoop-daemon.sh start datanode
starting datanode, logging to /opt/modules/hadoop-2.5.0/logs/hadoop-beifeng-datanode-hadoop-senior01.ibeifeng.com.out
[beifeng@hadoop-senior01 hadoop-2.5.0]$ jps # verify the daemons started
47811 DataNode
47875 Jps
47737 NameNode

5. Disable the firewall and open the web UI

[root@hadoop-senior01 ~]# firewall-cmd --state # check the firewall status
running
[root@hadoop-senior01 ~]# systemctl stop firewalld.service # stop the firewall
[root@hadoop-senior01 ~]# firewall-cmd --state
not running
[root@hadoop-senior01 ~]# systemctl disable firewalld.service # keep firewalld from starting at boot
Removed symlink /etc/systemd/system/dbus-org.fedoraproject.FirewallD1.service.
Removed symlink /etc/systemd/system/basic.target.wants/firewalld.service.

On versions before CentOS 7:
service iptables stop # stop
chkconfig iptables off # disable at boot

HDFS web UI: http://hadoop-senior01.ibeifeng.com:50070

Create directories
[beifeng@hadoop-senior01 hadoop-2.5.0]$ bin/hdfs dfs -mkdir -p temp/conf # relative path (no leading /), created under the HDFS home directory
[beifeng@hadoop-senior01 hadoop-2.5.0]$ bin/hdfs dfs -mkdir /text
/user/beifeng/temp/conf # visible in the web UI
/text # visible in the web UI
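
The same structure can be verified from the command line instead of the web UI (a sketch):

[beifeng@hadoop-senior01 hadoop-2.5.0]$ bin/hdfs dfs -ls -R /user/beifeng # recursive listing of the home directory
[beifeng@hadoop-senior01 hadoop-2.5.0]$ bin/hdfs dfs -ls / # listing of the root, shows /text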


Upload files

[beifeng@hadoop-senior01 hadoop-2.5.0]$ bin/hdfs dfs -put etc/hadoop/ # upload etc/hadoop/ to the HDFS home directory
Web UI:
/user/beifeng/hadoop

Read a file
[beifeng@hadoop-senior01 hadoop-2.5.0]$ bin/hdfs dfs -cat /user/beifeng/hadoop/hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>

</configuration>
Download files
[beifeng@hadoop-senior01 hadoop-2.5.0]$ bin/hdfs dfs -get /user/beifeng/hadoop/hdfs-site.xml /home/beifeng/Downloads
[beifeng@hadoop-senior01 hadoop-2.5.0]$ bin/hdfs dfs -get /user/beifeng/hadoop/hdfs-site.xml /home/beifeng/Downloads/get-hdfs-site.xml
[beifeng@hadoop-senior01 ~]$ cd Downloads/
[beifeng@hadoop-senior01 Downloads]$ ls
hdfs-site.xml
[beifeng@hadoop-senior01 Downloads]$ ls
get-hdfs-site.xml  hdfs-site.xml

Configure YARN, Start It, and Run MapReduce on YARN

1. Configure etc/hadoop/mapred-site.xml:

Rename mapred-site.xml.template to mapred-site.xml (a command sketch follows the property snippet below), then specify that MapReduce runs on YARN:

    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
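
The rename mentioned above can be done like this (a minimal sketch; run from the hadoop-2.5.0 directory, keeping the template as a backup by copying rather than moving):

[beifeng@hadoop-senior01 hadoop-2.5.0]$ cp etc/hadoop/mapred-site.xml.template etc/hadoop/mapred-site.xml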

2. Configure etc/hadoop/yarn-site.xml:

How reducers fetch data (the shuffle auxiliary service):

    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>

Specify the ResourceManager host:

    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hadoop-senior01.ibeifeng.com</value>
    </property>

3. Start YARN

[beifeng@hadoop-senior01 hadoop-2.5.0]$ sbin/yarn-daemon.sh start resourcemanager
[beifeng@hadoop-senior01 hadoop-2.5.0]$ sbin/yarn-daemon.sh start nodemanager
[beifeng@hadoop-senior01 hadoop-2.5.0]$ jps
2690 NameNode
8402 Jps
8309 NodeManager
2749 DataNode
8061 ResourceManager
[beifeng@hadoop-senior01 hadoop-2.5.0]$ sudo find /tmp/ -name '*.pid' # find the pid files
/tmp/hadoop-beifeng-namenode.pid
/tmp/hadoop-beifeng-datanode.pid
/tmp/yarn-beifeng-resourcemanager.pid
/tmp/yarn-beifeng-nodemanager.pid
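
Because /tmp may be cleaned up periodically, daemons can lose their pid files and then fail to stop cleanly. A common remedy (an assumption, not part of the original setup) is to relocate them via the PID-dir variables; the path below is hypothetical and must be created first:

export HADOOP_PID_DIR=/opt/modules/hadoop-2.5.0/pids # HDFS daemons, set in etc/hadoop/hadoop-env.sh
export YARN_PID_DIR=/opt/modules/hadoop-2.5.0/pids # YARN daemons, set in etc/hadoop/yarn-env.sh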

YARN web UI: port 8088
HDFS web UI: port 50070
http://hadoop-senior01.ibeifeng.com:8088/

Run the MapReduce WordCount Program on YARN

1. Create the input directory
[beifeng@hadoop-senior01 hadoop-2.5.0]$ bin/hdfs dfs -mkdir -p /user/beifeng/wordcount/input # create the input path
2. Upload the file to be processed to input

[beifeng@hadoop-senior01 hadoop-2.5.0]$ bin/hdfs dfs -put /opt/datas/test1.input /user/beifeng/wordcount/input # upload the local file to HDFS

List the example programs this jar provides:

[beifeng@hadoop-senior01 hadoop-2.5.0]$ bin/yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0.jar 
An example program must be given as the first argument.
Valid program names are:
  aggregatewordcount: An Aggregate based map/reduce program that counts the words in the input files.
  aggregatewordhist: An Aggregate based map/reduce program that computes the histogram of the words in the input files.
  bbp: A map/reduce program that uses Bailey-Borwein-Plouffe to compute exact digits of Pi.
  dbcount: An example job that count the pageview counts from a database.
  distbbp: A map/reduce program that uses a BBP-type formula to compute exact bits of Pi.
  grep: A map/reduce program that counts the matches of a regex in the input.
  join: A job that effects a join over sorted, equally partitioned datasets
  multifilewc: A job that counts words from several files.
  pentomino: A map/reduce tile laying program to find solutions to pentomino problems.
  pi: A map/reduce program that estimates Pi using a quasi-Monte Carlo method.
  randomtextwriter: A map/reduce program that writes 10GB of random textual data per node.
  randomwriter: A map/reduce program that writes 10GB of random data per node.
  secondarysort: An example defining a secondary sort to the reduce.
  sort: A map/reduce program that sorts the data written by the random writer.
  sudoku: A sudoku solver.
  teragen: Generate data for the terasort
  terasort: Run the terasort
  teravalidate: Checking results of terasort
  wordcount: A map/reduce program that counts the words in the input files.
  wordmean: A map/reduce program that counts the average length of the words in the input files.
  wordmedian: A map/reduce program that counts the median length of the words in the input files.
  wordstandarddeviation: A map/reduce program that counts the standard deviation of the length of the words in the input files.
3. Run the program (output path)
[beifeng@hadoop-senior01 hadoop-2.5.0]$ bin/yarn jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.0.jar wordcount /user/beifeng/wordcount/input /user/beifeng/wordcount/output # jar - program name - input path - output path
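
Note that the job fails if the output directory already exists; to rerun it, remove the directory first (a sketch):

[beifeng@hadoop-senior01 hadoop-2.5.0]$ bin/hdfs dfs -rm -r /user/beifeng/wordcount/output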
4. View the results
[beifeng@hadoop-senior01 hadoop-2.5.0]$ bin/hdfs dfs -text /user/beifeng/wordcount/output/part* # view the results

Authentication  1
Authorization   1
Availability    2
Browse  2
Building    1
Built   1
By  1
C   1
CHANGES.txt 3
CLI 1
Cache   2
Capacity    1
Centralized 1
Circuit 1
Cluster 5
Cluster.    2
Commands    2
Common  2
Compatibility   1
Compatibilty    1
Configuration   4
Configure   1
Copy    2
DataNode    1
Deploy  1
Deprecated  1
Dist    1
DistCp  1
Distributed 2
Download    3
Edits   1

Daemon Start/Stop Commands

sbin/hadoop-daemon.sh start datanode
sbin/hadoop-daemon.sh start namenode
sbin/yarn-daemon.sh start nodemanager
sbin/yarn-daemon.sh start resourcemanager
sbin/hadoop-daemon.sh stop datanode
sbin/hadoop-daemon.sh stop namenode
sbin/yarn-daemon.sh stop nodemanager
sbin/yarn-daemon.sh stop resourcemanager
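
To bring the whole pseudo-distributed cluster up with one command, the four start lines can be wrapped in a small script. A minimal sketch; the file name start-pseudo.sh is hypothetical, and the path assumes this install:

#!/bin/bash
# start-pseudo.sh - start all daemons of the pseudo-distributed cluster
HADOOP_HOME=/opt/modules/hadoop-2.5.0
$HADOOP_HOME/sbin/hadoop-daemon.sh start namenode
$HADOOP_HOME/sbin/hadoop-daemon.sh start datanode
$HADOOP_HOME/sbin/yarn-daemon.sh start resourcemanager
$HADOOP_HOME/sbin/yarn-daemon.sh start nodemanager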


Original post: http://www.cnblogs.com/xuwei1/p/7851182.html
