
The wordcount application from Hadoop's built-in examples



Hadoop ships with a large number of example programs. How do you actually use them? In this post we test the wordcount program in Hadoop's examples jar:

1. First, start Hadoop
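If the cluster is not already running, bring up HDFS and YARN first. A minimal sketch, assuming the standard sbin scripts of a Hadoop 2.x installation are on the PATH (your install path and startup procedure may differ):

# start HDFS (NameNode/DataNodes) and YARN (ResourceManager/NodeManagers)
start-dfs.sh
start-yarn.sh
# verify the daemons are running
jps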

2. Prepare the data: create a file named words with vim and write the following lines (a non-interactive alternative is sketched after the list):

hello tom
hello jerry
hello kitty
hello tom
hello bbb
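If you prefer not to open an editor, the same file can be created from the shell; a minimal sketch using a here-document:

cat > words << 'EOF'
hello tom
hello jerry
hello kitty
hello tom
hello bbb
EOF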

3. Upload the data to HDFS

hadoop fs -put words /user/guest/words.txt
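To confirm the upload, you can list the target directory and print the file back out. A quick check, assuming the /user/guest directory already exists on HDFS:

# list the directory and dump the uploaded file
hadoop fs -ls /user/guest
hadoop fs -cat /user/guest/words.txt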

4. Run the wordcount program from the built-in examples jar

guest@master:/usr/hadoop/share/hadoop/mapreduce$ hadoop jar hadoop-mapreduce-examples-2.4.0.jar 

An example program must be given as the first argument.

Valid program names are:

  aggregatewordcount: An Aggregate based map/reduce program that counts the words in the input files.

  aggregatewordhist: An Aggregate based map/reduce program that computes the histogram of the words in the input files.

  bbp: A map/reduce program that uses Bailey-Borwein-Plouffe to compute exact digits of Pi.

  dbcount: An example job that count the pageview counts from a database.

  distbbp: A map/reduce program that uses a BBP-type formula to compute exact bits of Pi.

  grep: A map/reduce program that counts the matches of a regex in the input.

  join: A job that effects a join over sorted, equally partitioned datasets

  multifilewc: A job that counts words from several files.

  pentomino: A map/reduce tile laying program to find solutions to pentomino problems.

  pi: A map/reduce program that estimates Pi using a quasi-Monte Carlo method.

  randomtextwriter: A map/reduce program that writes 10GB of random textual data per node.

  randomwriter: A map/reduce program that writes 10GB of random data per node.

  secondarysort: An example defining a secondary sort to the reduce.

  sort: A map/reduce program that sorts the data written by the random writer.

  sudoku: A sudoku solver.

  teragen: Generate data for the terasort

  terasort: Run the terasort

  teravalidate: Checking results of terasort

  wordcount: A map/reduce program that counts the words in the input files.

  wordmean: A map/reduce program that counts the average length of the words in the input files.

  wordmedian: A map/reduce program that counts the median length of the words in the input files.

  wordstandarddeviation: A map/reduce program that counts the standard deviation of the length of the words in the input files.

The wordcount program appears in this list. Now run it against the input file:

hadoop jar hadoop-mapreduce-examples-2.4.0.jar wordcount /user/guest/words.txt /user/guest/wordcount
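Note that the output path (/user/guest/wordcount here) must not exist before the job starts; MapReduce aborts with an "output directory already exists" error otherwise. If you need to rerun the job, remove the old output first; a small sketch (the -rm -r syntax assumes Hadoop 2.x):

# delete the previous output directory before rerunning
hadoop fs -rm -r /user/guest/wordcount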

Check the results. For the input above, wordcount produces:

hello 5
jerry 1
kitty 1
tom 2
bbb 1
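The job writes its output to HDFS under the directory given as the second argument; to view it, cat the part files. A minimal sketch (the exact part file name, part-r-00000 here, depends on the number of reducers used):

# print the reducer output written by the job
hadoop fs -cat /user/guest/wordcount/part-r-00000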



Original post: http://www.cnblogs.com/ljbguanli/p/7248355.html
