
Setting up Hadoop 2.5.2, HBase 0.98.7, and Sqoop 1.4.6



1. Preliminary work:

(1) Prepare three machines:

  Install Ubuntu 14.04 on each (preferably with the same username, hadoop, on all three, to simplify the file transfers later).

  Network mapping:

    Set the hostname on each machine (sudo vim /etc/hostname) to master, slave1, and slave2 respectively, and note their IP addresses; call them ip1, ip2, and ip3.

    Edit the network mapping: sudo vim /etc/hosts

           The 127.0.1.1 line can be commented out.

           Add:  ip1 master
                 ip2 slave1
                 ip3 slave2

(2) Install openssh-server: sudo apt-get install openssh-server

   Set up passwordless SSH login among the three machines: see http://www.cnblogs.com/xiaomila-study/p/4971385.html and the sketch below.
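A minimal sketch of the usual key-exchange procedure (assuming the hadoop user on every node and that ssh-copy-id is available, as it is on Ubuntu 14.04):

  # run on each of the three machines
  ssh-keygen -t rsa                # accept the defaults, empty passphrase
  ssh-copy-id hadoop@master        # push the public key to every node,
  ssh-copy-id hadoop@slave1        # including the local machine itself
  ssh-copy-id hadoop@slave2
  ssh hadoop@slave1                # verify: should log in without a password prompt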

(3) Install the JDK: download the JDK archive, extract it, and add the environment variables with  sudo vim /etc/profile

  # JAVA environment settings

  export JAVA_HOME=/usr/lib/jvm/jdk1.7.0_79

  export JRE_HOME=${JAVA_HOME}/jre  

  export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib  

  export PATH=${JAVA_HOME}/bin:$PATH

  Save, exit, and run source /etc/profile
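A quick check that the variables took effect (the exact version banner depends on your JDK build):

  java -version      # should report java version "1.7.0_79"
  echo $JAVA_HOME    # should print the JDK installation directory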

2. Setting up the Hadoop environment:

(1) Download the Hadoop 2.5.2 tar.gz package and extract it, as sketched below.
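A minimal sketch (the Apache archive URL and the target directory /home/hadoop/my_project are assumptions; any mirror and location work):

  cd /home/hadoop/my_project
  wget https://archive.apache.org/dist/hadoop/common/hadoop-2.5.2/hadoop-2.5.2.tar.gz
  tar -xzf hadoop-2.5.2.tar.gz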

(2) Add the environment variables, using the same method as step (3) in section 1:

  # hadoop environment
  export HADOOP_HOME=/home/hadoop/my_project/hadoop-2.5.2
  export HADOOP_COMMON_HOME=${HADOOP_HOME}
  export HADOOP_MAPRED_HOME=${HADOOP_HOME}
  export HADOOP_YARN_HOME=${HADOOP_HOME}
  export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
  export PATH=${HADOOP_HOME}/bin:$PATH
  export PATH=${HADOOP_HOME}/sbin:$PATH

(3) Edit the configuration files under etc/hadoop in the Hadoop installation directory: hadoop-env.sh, core-site.xml, mapred-site.xml (if only mapred-site.xml.template exists, copy it to mapred-site.xml), yarn-site.xml, yarn-env.sh, hdfs-site.xml, and slaves.

  hadoop-env.sh:

    # The java implementation to use.
    export JAVA_HOME=/home/hadoop/my_project/jdk1.7.0_79   # the JDK installation directory

  core-site.xml

<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://ip1:9000</value>
</property>

<!--property>
<name>fs.default.name</name>
<value>hdfs://master:9000</value>
</property-->

<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>

<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/tmp</value>
<description>Abase for other temporary directories.</description>
</property>

<property>
<name>hadoop.proxyuser.hadoop.hosts</name>
<value>*</value>
</property>

<property>
<name>hadoop.proxyuser.hadoop.groups</name>
<value>*</value>
</property>

</configuration>

  mapred-site.xml

<configuration>
<property>
<name>mapred.job.tracker</name>
<value>ip1:9001</value>
</property>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>

<!--property>
<name>mapreduce.jobhistory.address</name>
<value>ip1:10020</value>
</property>

<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>ip1:19888</value>
</property>
<property>
<name>mapred.child.java.opts</name>
<value> -Xmx4096m</value>
</property>
<property>
<name>mapreduce.admin.map.child.java.opts</name>
<value>-XX:-UseGCOverheadLimit</value>
</property-->
</configuration>

  hdfs-site.xml

<configuration>
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>master:9001</value>
</property>

<property>
<name>dfs.namenode.name.dir</name>
<value>/home/hadoop/dfs/name</value>
</property>

<property>
<name>dfs.datanode.data.dir</name>
<value>/home/hadoop/dfs/data</value>
</property>

<property>
<name>dfs.replication</name>
<value>3</value>
</property>

<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>

<property>
<name>dfs.datanode.max.transfer.threads</name>
<value>8192</value>
</property>
</configuration>

yarn-site.xml

<configuration>

<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>ip1:8032</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>ip1:8030</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>ip1:8031</value>
</property>
<property>
<name>yarn.resourcemanager.admin.address</name>
<value>ip1:8033</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address</name>
<value>ip1:8088</value>
</property>
<!-- property>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>2.9</value>
</property-->

</configuration>

yarn-env.sh

# some Java parameters
export JAVA_HOME=/home/hadoop/my_project/jdk1.7.0_79

slaves

master

slave1

slave2
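The same JDK, Hadoop directory, and /etc/profile entries must exist on all three machines before formatting. A minimal sketch of pushing the configured install from master to the slaves (paths as above, using the passwordless SSH set up in section 1):

  scp -r /home/hadoop/my_project/hadoop-2.5.2 hadoop@slave1:/home/hadoop/my_project/
  scp -r /home/hadoop/my_project/hadoop-2.5.2 hadoop@slave2:/home/hadoop/my_project/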

(4) Format the NameNode: hdfs namenode -format

(5) Start Hadoop: start-all.sh
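To check that the daemons came up, jps on each node should list them (master also runs a DataNode and NodeManager here because it appears in slaves):

  jps
  # on master: NameNode, SecondaryNameNode, ResourceManager, DataNode, NodeManager
  # on slave1/slave2: DataNode, NodeManager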

(6) Run the bundled wordcount example: see http://www.cnblogs.com/xiaomila-study/p/4973662.html

If it runs successfully, Hadoop is installed correctly; otherwise, revisit the parameters in the files from step (3). The invocation looks roughly like the sketch below.
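A sketch of the wordcount run (the input file and HDFS paths are arbitrary; the examples jar ships with the release):

  hdfs dfs -mkdir -p /input
  hdfs dfs -put some_local_text_file.txt /input
  hadoop jar ${HADOOP_HOME}/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.5.2.jar wordcount /input /output
  hdfs dfs -cat /output/part-r-00000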

3. HBase installation:

(1) Download the HBase 0.98.7 tar.gz file and extract it

(2) Configure the environment variables:

  # HBASE environment
  export HBASE_HOME=/home/hadoop/my_project/hbase-0.98.7
  export PATH=${HBASE_HOME}/bin:$PATH

(3) Edit the configuration files in the conf folder under the HBase installation directory: hbase-env.sh, hbase-site.xml, and regionservers

hbase-env.sh

export JAVA_HOME=/home/hadoop/my_project/jdk1.7.0_79

export HBASE_MANAGES_ZK=true   # use the ZooKeeper instance bundled with HBase

hbase-site.xml

<configuration>
<property>
<name>hbase.rootdir</name>
<value>hdfs://ip1:9000/hbase</value>
</property>
<property>
<name>hbase.master</name>
<value>ip1:60000</value>
</property>
<property>
<name>hbase.cluster.distributed</name>
<value>true</value>
</property>
<property>
<name>hbase.zookeeper.property.clientPort</name>
<value>2222</value>
</property>
<property>
<name>hbase.zookeeper.quorum</name>
<value>ip1,ip2,ip3</value>
</property>
<property>
<name>hbase.zookeeper.property.dataDir</name>
<value>/home/hadoop/zookeeper</value>
</property>
<property>
<name>hbase.regionserver.handler.count</name>
<value>100</value>
</property>
<property>
<name>zookeeper.session.timeout</name>
<value>90000</value>
</property>
<property>
<name>hbase.regionserver.restart.on.zk.expire</name>
<value>true</value>
<description>
A ZooKeeper session expiry forces the regionserver to exit;
enabling this makes the regionserver restart instead.
</description>
</property>
</configuration>

regionservers

master
slave1
slave2

(4) Start HBase: start-hbase.sh

(5) Enter the HBase shell: hbase shell

(6) Verify that it works: list

          create 'test','info'

If the table is created successfully, HBase is installed correctly. A slightly fuller smoke test is sketched below.
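Continuing in the shell with the table just created (the row key and values are arbitrary):

  put 'test','row1','info:name','hello'
  scan 'test'        # should show row1 with the value just inserted
  list               # 'test' should appear in the table list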

4. Sqoop installation:

(1) Download the Sqoop 1.4.6 tar.gz package and extract it

(2) Add the environment variables:

# sqoop environment
export SQOOP_HOME=/home/hadoop/my_project/sqoop-1.4.6
export PATH=${SQOOP_HOME}/bin:$PATH

(3) Edit the configuration file: sqoop-env.sh

#Set path to where bin/hadoop is available
export HADOOP_COMMON_HOME=/home/hadoop/my_project/hadoop-2.5.2

#Set path to where hadoop-*-core.jar is available
export HADOOP_MAPRED_HOME=/home/hadoop/my_project/hadoop-2.5.2

configure-sqoop (in the bin directory): comment out the checks below. Whether leaving the HBASE_HOME check uncommented would cause problems, I do not know; in my environment it was commented out.

## Moved to be a runtime check in sqoop.
#if [ ! -d "${HBASE_HOME}" ]; then
# echo "Warning: $HBASE_HOME does not exist! HBase imports will fail."
# echo 'Please set $HBASE_HOME to the root of your HBase installation.'
#fi

## Moved to be a runtime check in sqoop.
#if [ ! -d "${HCAT_HOME}" ]; then
# echo "Warning: $HCAT_HOME does not exist! HCatalog jobs will fail."
# echo 'Please set $HCAT_HOME to the root of your HCatalog installation.'
#fi

#if [ ! -d "${ACCUMULO_HOME}" ]; then
# echo "Warning: $ACCUMULO_HOME does not exist! Accumulo imports will fail."
# echo 'Please set $ACCUMULO_HOME to the root of your Accumulo installation.'
#fi
#if [ ! -d "${ZOOKEEPER_HOME}" ]; then
# echo "Warning: $ZOOKEEPER_HOME does not exist! Accumulo imports will fail."
# echo 'Please set $ZOOKEEPER_HOME to the root of your Zookeeper installation.'
#fi

(4) Copy the JDBC driver jars into the lib folder under the Sqoop installation path: mysql-connector-java-5.1.32-bin.jar, sqljdbc4.jar, and sqoop-sqlserver-1.0.jar
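At this point the installation can be sanity-checked; warnings about the variables commented out above are harmless:

  sqoop version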

(5) If testing against MySQL, configure the MySQL server with: grant all privileges on *.* to 'root'@'%' identified by '123' with grant option; (this allows remote access). Likewise, enable remote access on SQL Server if that is the source.
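A quick way to confirm remote access before involving Sqoop (run from master; ip stands for the MySQL server's address, as elsewhere in this guide):

  mysql -h ip -u root -p123 -e 'show databases;'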

(6) Tests:

Test the Sqoop connection to MySQL:
sqoop list-databases --connect jdbc:mysql://ip:3306/ --username root  --password 123

Test importing from SQL Server into HBase (a free-form query, then a whole-table import):

sqoop import --connect 'jdbc:sqlserver://ip;username=sa;password=123;database=WebHotPub'   --query 'select * from channelType where $CONDITIONS'  --split-by channelType.chnTypeID  --hbase-create-table --hbase-table chnType1 --column-family channelInfo --hbase-row-key chnTypeID -m 3

sqoop import --connect 'jdbc:sqlserver://ip;username=sa;password=123;database=WebHotPub' --table channelType  --hbase-create-table --hbase-table chnType --column-family channelInfo --hbase-row-key chnTypeID -m 1
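If the imports succeed, the rows should be visible from the HBase shell (table names as created by the commands above):

  hbase shell
  list               # should now include chnType and chnType1
  scan 'chnType'     # prints the imported rows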



Original article: http://www.cnblogs.com/xiaomila-study/p/4978906.html
