
(Original) Setting Up a Hadoop Distributed Development Environment

Posted: 2014-12-24 16:06:00


1. Install the Java environment
Add the Java environment variables:
vi /etc/profile
 
# add by tank
export JAVA_HOME=/data/soft/jdk/jdk1.7.0_71
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
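After saving, reload the profile and confirm the variables took effect. A quick sketch, using the JDK path from this article (adjust it to your install):

```shell
# Reload the settings in the current shell and verify JAVA_HOME is on PATH.
export JAVA_HOME=/data/soft/jdk/jdk1.7.0_71
export PATH=$PATH:$JAVA_HOME/bin
echo "$PATH" | grep -q "$JAVA_HOME/bin" && echo "JAVA_HOME is on the PATH"
# java -version   # should report 1.7.0_71 once the JDK is actually installed
```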
 
 
2. Raise the open-file limit
vi /etc/security/limits.conf
 
# add by tank
* soft nofile 65536
* hard nofile 65536
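The new limits only apply to new login sessions. You can check the current values with ulimit; after logging back in with the limits.conf change above, both should read 65536:

```shell
# Show the current soft and hard open-file limits for this shell.
echo "soft limit: $(ulimit -Sn)"
echo "hard limit: $(ulimit -Hn)"
```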
 
 
3. Set up passwordless SSH login

  See: http://www.cnblogs.com/tankaixiong/p/4172942.html
  Every host must be able to log in to every other host without a password.
  authorized_keys contains the public keys of all hosts; with many hosts, you can share a single authorized_keys file across machines via an NFS mount, so one change updates every node.
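The linked post covers the details; as a minimal sketch, the usual per-host setup looks like this (run once for the hadoop user on each machine):

```shell
# Generate a passphrase-less key pair (skipped if one already exists),
# append the public key to authorized_keys, and fix the permissions
# sshd requires (700 on ~/.ssh, 600 on authorized_keys).
mkdir -p ~/.ssh && chmod 700 ~/.ssh
[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa -q
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
```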
 
 
4. Configure the hosts file
  vi /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.183.130 tank1
192.168.183.131 tank2
192.168.183.132 tank3
192.168.183.133 tank4
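After saving, each short hostname should resolve through /etc/hosts; a quick check across all four nodes:

```shell
# Resolve each cluster hostname via the "files" nsswitch source.
# On a configured node this prints e.g. "192.168.183.130  tank1".
for h in tank1 tank2 tank3 tank4; do
  getent hosts "$h" || echo "$h not resolvable yet"
done
```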
 
5. Install the Hadoop environment

Hadoop 2.2.0 is used here.
Directory structure: (screenshot omitted from the original post)
Set the environment variables:

export HADOOP_HOME=/data/hadoop/hadoop-2.2.0
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
 
Note: make sure the files under $HADOOP_HOME/bin and $HADOOP_HOME/sbin are executable.
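If the scripts lost their execute bit (for example after unpacking as a different user), a sketch to restore it and confirm the PATH entries, using the install path from this article:

```shell
# Restore execute permission on the Hadoop scripts and check the PATH.
HADOOP_HOME=${HADOOP_HOME:-/data/hadoop/hadoop-2.2.0}
chmod +x "$HADOOP_HOME"/bin/* "$HADOOP_HOME"/sbin/* 2>/dev/null || true
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
echo "$PATH" | grep -q "$HADOOP_HOME/sbin" && echo "hadoop scripts on PATH"
```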
 
Edit the configuration files:
[tank@192 hadoop]$ vi core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
      <property>
        <name>hadoop.tmp.dir</name>
        <value>/usr/hadoop/tmp</value><description>(Note: create the tmp folder under /usr/hadoop first.) A base for other temporary directories.</description>
    </property>
  <property>
     <name>fs.default.name</name>
     <value>hdfs://192.168.149.128:9000</value>
  </property>
</configuration>
Note: if the hadoop.tmp.dir parameter is not set, the temporary directory defaults to /tmp/hadoop-${user.name}. That directory is wiped on every reboot, so the NameNode would have to be re-formatted each time, otherwise errors occur.
[tank@192 hadoop]$ vi hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
      <name>dfs.namenode.name.dir</name>
      <value>file:/data/soft/hadoop/hadoop-2.2.0/hdfs/name</value>
        <final>true</final>
   </property>
   <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/data/soft/hadoop/hadoop-2.2.0/hdfs/data</value>
   </property>
</configuration>

 

 
The name and data directories above must already exist before starting!
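A sketch that pre-creates the directories named in the two config files above (paths are the ones from this article; run as a user with write access to them):

```shell
# Create hadoop.tmp.dir plus the NameNode and DataNode directories up front.
BASE=${BASE:-/data/soft/hadoop/hadoop-2.2.0}
mkdir -p "$BASE/hdfs/name" "$BASE/hdfs/data" /usr/hadoop/tmp 2>/dev/null \
  || echo "need write access to $BASE and /usr/hadoop"
```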
 
[tank@192 hadoop]$ vi mapred-site.xml
<configuration>
   <property>
     <name>mapred.job.tracker</name>
     <value>192.168.149.128:9001</value>
   </property>
</configuration>

 

 
 
Note: be sure to use an IP address above, not localhost, otherwise Eclipse will not be able to connect!
 
Set up the master/slave relationship in the $HADOOP_HOME/etc/hadoop/ directory:
[hadoop@tank1 hadoop]$ vi masters 
192.168.183.130


// master only; the slaves do not need this file
[hadoop@tank1 hadoop]$ vi slaves

192.168.183.131
192.168.183.132
192.168.183.133
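Every node needs the same configuration files. A dry-run sketch of pushing the config directory from the master to the slaves listed above (uncomment the scp line on a real cluster):

```shell
# Sync the finished Hadoop config to each slave (hostnames from /etc/hosts).
for h in tank2 tank3 tank4; do
  echo "would sync config to $h"
  # scp -r "$HADOOP_HOME/etc/hadoop" "$h":"$HADOOP_HOME/etc/"
done
```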
 
[hadoop@tank1 hadoop]$ hadoop namenode -format   // only needed the first time
 
Start the cluster:
sbin/start-all.sh
 
Check the status on the master:
[tank@192 hadoop-2.2.0]$ jps
2751 ResourceManager
2628 SecondaryNameNode
2469 NameNode

Check the status on a slave:
[hadoop@tank2 sbin]$ jps
1745 NodeManager
1658 DataNode
 

 
In total there are five Hadoop daemon processes.
 
Open this address to view the HDFS status:
http://192.168.149.128:50070/dfshealth.jsp
 
 
 
 


Original article: http://www.cnblogs.com/tankaixiong/p/4182560.html
