
First Hadoop Program (Hadoop 2.4.0 Cluster + Eclipse Environment)

Posted: 2015-04-07 13:29:08


I. Eclipse Hadoop environment setup

1. Right-click My Computer -> Properties -> Advanced system settings -> Environment Variables, and configure the following variables:

        JAVA_HOME=D:\ProgramFiles\Java\jdk1.7.0_67

      HADOOP_HOME=D:\TEDP_Software\hadoop-2.4.0

      PATH=.;%JAVA_HOME%\bin;%HADOOP_HOME%\bin;

2. Install the hadoop-eclipse-kepler-plugin-2.2.0.jar plugin in Eclipse and configure the Hadoop server location.

II. The WordCount program

1. Prepare test files
[hadoop@master hadoop]$ mkdir file

[hadoop@master hadoop]$ cd file

[hadoop@master file]$ ls
[hadoop@master file]$ echo "Hello world" > file1.txt
[hadoop@master file]$ echo "Hello hadoop" > file2.txt

2. Create the input directory
Create the Hadoop user directory: hadoop fs -mkdir /user
Set permissions: hadoop fs -chmod -R 777 /user
Create the input directory: hadoop fs -mkdir /user/input
List the directories: hadoop fs -ls /
Upload the files to HDFS: hadoop fs -put ~/file/file*.txt /user/input
Error 1:
java.net.NoRouteToHostException: No route to host
(or, in Hive: could only be replicated to 0 nodes instead of minReplication (=1). There are 2 datanode(s) running and 2 node(s) are excluded in this operation.)
Cause: the firewall is still running. Switch to root on each host and run: service iptables stop
 
3. Create a new MapReduce project and copy the attached WordCount.java into it.
Right-click the WordCount class -> Run As -> Run Configurations, and enter the following program arguments:
hdfs://192.168.1.200:9000/user/input hdfs://192.168.1.200:9000/user/output
 
4. Run on Hadoop
(1) Exception 1: Exception in thread "main" java.lang.NullPointerException
Fix: according to posts found online, this is a bug in Hadoop on Windows and does not occur on Linux. Download hadoop-common-2.2.0-bin-master.zip, unzip it, and replace the files in .\hadoop-2.4.0\bin with the ones from its bin directory.

Then copy bin\hadoop.dll into C:\Windows\System32 and reboot.

(2) Exception 2: 14/12/02 21:01:01 ERROR util.Shell: Failed to locate the winutils binary in the hadoop binary path

java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.

Fix: set the local environment variable HADOOP_HOME=D:\Soft\Linux\hadoop-2.4.0 (a restart is required).

To avoid restarting, add this line to the code instead: System.setProperty("hadoop.home.dir", "D:\\Soft\\Linux\\hadoop-2.4.0");
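Both of the errors above come down to Hadoop failing to resolve a home directory on Windows. As a quick sanity check, the resolution order can be mimicked in plain Java (this is a simplified sketch, not Hadoop's actual Shell code; the class and method names are mine):

```java
public class HadoopHomeCheck {
    // Simplified mimic of how Hadoop's Shell class picks the home directory:
    // the hadoop.home.dir system property takes precedence, then the
    // HADOOP_HOME environment variable.
    static String resolveHadoopHome() {
        String home = System.getProperty("hadoop.home.dir");
        if (home == null) {
            home = System.getenv("HADOOP_HOME");
        }
        // A null result here is what later surfaces as "null\bin\winutils.exe".
        return home;
    }

    public static void main(String[] args) {
        System.setProperty("hadoop.home.dir", "D:\\Soft\\Linux\\hadoop-2.4.0");
        System.out.println("Hadoop home resolves to: " + resolveHadoopHome());
    }
}
```

If this prints null before you set the property, the winutils error above is what you can expect from the real job.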
(3) Exception 3: Exception in thread "main" org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory hdfs://192.168.1.200:9000/user/output already exists

Fix: the output directory already exists. Either change the output path or delete the directory first (e.g. hadoop fs -rm -r /user/output).

(4) Problem 4: the run simply hangs with no output (this happened later, while building a second Hadoop project).

Fix: in Run Configurations -> Main, the main class turned out to be jline.ANSIBuffer; change it to WordCount and click "Run".

Note: if you launch via "Run As" -> "Run on Hadoop", type or select WordCount in the Select Type dialog.


5. OK. The result:

Hello 2

hadoop 1

world 1
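The result can be sanity-checked without a cluster: the job's tokenize-and-sum logic boils down to the plain-Java sketch below (the LocalWordCount class is mine, for illustration only, and is not part of the attached job):

```java
import java.util.*;

public class LocalWordCount {
    // The same logic the MR job runs: split each line on whitespace
    // (StringTokenizer, as in the Map class) and sum the per-word counts
    // (as in the Reduce class). TreeMap gives sorted output like HDFS does.
    public static Map<String, Integer> count(List<String> lines) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String line : lines) {
            StringTokenizer tok = new StringTokenizer(line);
            while (tok.hasMoreTokens()) {
                counts.merge(tok.nextToken(), 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        // The two test lines from file1.txt and file2.txt in step 1.
        Map<String, Integer> c = count(Arrays.asList("Hello world", "Hello hadoop"));
        c.forEach((word, n) -> System.out.println(word + "\t" + n));
    }
}
```

Running it on the two test lines prints the same three counts shown above.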

6. Attachment: WordCount.java
 
import java.io.IOException;
import java.util.*;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.util.*;

public class WordCount {

    // Mapper: emit (word, 1) for every whitespace-separated token in the line.
    public static class Map extends MapReduceBase implements
            Mapper<LongWritable, Text, Text, IntWritable> {
        private final static IntWritable one = new IntWritable(1);
        private Text word = new Text();

        public void map(LongWritable key, Text value,
                OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException {
            String line = value.toString();
            StringTokenizer tokenizer = new StringTokenizer(line);
            while (tokenizer.hasMoreTokens()) {
                word.set(tokenizer.nextToken());
                output.collect(word, one);
            }
        }
    }

    // Reducer (also used as the combiner): sum the counts for each word.
    public static class Reduce extends MapReduceBase implements
            Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterator<IntWritable> values,
                OutputCollector<Text, IntWritable> output, Reporter reporter)
                throws IOException {
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next().get();
            }
            output.collect(key, new IntWritable(sum));
        }
    }

    public static void main(String[] args) throws Exception {

        // Uncomment on Windows if HADOOP_HOME is not set (see exception 2 above):
        // System.setProperty("hadoop.home.dir", "D:\\Soft\\Linux\\hadoop-2.4.0");

        JobConf conf = new JobConf(WordCount.class);
        conf.setJobName("wordcount");

        conf.setOutputKeyClass(Text.class);
        conf.setOutputValueClass(IntWritable.class);

        conf.setMapperClass(Map.class);
        conf.setCombinerClass(Reduce.class);
        conf.setReducerClass(Reduce.class);

        conf.setInputFormat(TextInputFormat.class);
        conf.setOutputFormat(TextOutputFormat.class);

        // args[0] and args[1] are the HDFS input and output paths
        // entered in Run Configurations.
        FileInputFormat.setInputPaths(conf, new Path(args[0]));
        FileOutputFormat.setOutputPath(conf, new Path(args[1]));

        JobClient.runJob(conf);
    }
}

Reference: http://www.cnblogs.com/xia520pi/archive/2012/05/16/2504205.html

(The End)
 

Original post: http://www.cnblogs.com/zhaohz/p/4397953.html
