A small Mahout case study: running the k-means algorithm.
Environment: OS: CentOS 6.5 x64; software: Hadoop 1.2.1, Mahout 0.9
1. Download the test data
[huser@master hadoop]$ wget http://archive.ics.uci.edu/ml/databases/synthetic_control/synthetic_control.data
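Before copying the file anywhere, it can help to confirm that the download is complete. The sketch below is not part of the original walkthrough: `data_shape` is a hypothetical helper that counts the rows and the columns per row of a whitespace-separated file. The UCI description of this data set lists 600 series of 60 values each, so a complete download should report `600 60 60`.

```shell
#!/bin/sh
# data_shape FILE
# Hypothetical helper: prints "<rows> <min_columns> <max_columns>" for a
# whitespace-separated data file, ignoring blank lines.
data_shape() {
    awk '
        NF > 0 {
            rows++
            if (min == "" || NF < min) min = NF
            if (NF > max) max = NF
        }
        END { printf "%d %d %d\n", rows, min, max }
    ' "$1"
}

# Only inspect the file if it is present in the current directory.
if [ -f ./synthetic_control.data ]; then
    data_shape ./synthetic_control.data
fi
```

If the minimum and maximum column counts differ, some rows are truncated or malformed and the file is worth downloading again.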
2. Copy the data to HDFS
[huser@master hadoop]$ hadoop-1.2.1/bin/hadoop fs -mkdir ./testdata
Warning: $HADOOP_HOME is deprecated.
[huser@master hadoop]$ hadoop-1.2.1/bin/hadoop fs -put ./synthetic_control.data ./testdata
Warning: $HADOOP_HOME is deprecated.
[huser@master hadoop]$ hadoop-1.2.1/bin/hadoop fs -ls ./testdata
Warning: $HADOOP_HOME is deprecated.
Found 1 items
-rw-r--r-- 1 huser supergroup 288374 2014-04-17 14:02 /user/huser/testdata/synthetic_control.data
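The size column of the listing (288374 bytes here) can be compared with the local file to confirm that the upload is complete. The following is a minimal sketch, assuming the eight-column `fs -ls` layout shown above; `hdfs_listing_size` is a hypothetical helper name, not a Hadoop command.

```shell
#!/bin/sh
# hdfs_listing_size NAME
# Hypothetical helper: reads an "hadoop fs -ls" listing on stdin and prints
# the size (5th column) of the regular file whose base name equals NAME.
hdfs_listing_size() {
    awk -v name="$1" '
        $1 ~ /^-/ {
            n = split($NF, parts, "/")
            if (parts[n] == name) print $5
        }
    '
}

# Demo on the listing captured in this walkthrough.
upload_listing='Found 1 items
-rw-r--r-- 1 huser supergroup 288374 2014-04-17 14:02 /user/huser/testdata/synthetic_control.data'
printf '%s\n' "$upload_listing" | hdfs_listing_size synthetic_control.data
# prints 288374
```

On the cluster, the same function could read the live listing, for example `hadoop-1.2.1/bin/hadoop fs -ls ./testdata | hdfs_listing_size synthetic_control.data`, and the result could be compared with `wc -c < ./synthetic_control.data`.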
3. Run a k-means clustering test (the example job reads ./testdata and writes its results to ./output)
[huser@master hadoop]$ mahout org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
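The job above runs k-means as a series of Hadoop passes over the data. To make the algorithm itself concrete, here is a toy one-dimensional version in awk. It is only an illustration under simplifying assumptions: it is not Mahout's implementation, `kmeans_1d` is a hypothetical name, the first k points serve as initial centers, and distance is plain squared difference.

```shell
#!/bin/sh
# kmeans_1d K MAX_ITER
# Toy 1-D k-means: reads one number per line on stdin and prints the final
# centers, one per line, in cluster-index order.
kmeans_1d() {
    awk -v k="$1" -v max_iter="$2" '
        NF > 0 { x[++n] = $1 }
        END {
            if (n < k) { print "need at least k points" > "/dev/stderr"; exit 1 }
            # Seed step: take the first k points as the initial centers.
            for (c = 1; c <= k; c++) center[c] = x[c]
            for (iter = 1; iter <= max_iter; iter++) {
                for (c = 1; c <= k; c++) { sum[c] = 0; cnt[c] = 0 }
                # Assignment step: attach each point to its nearest center.
                for (i = 1; i <= n; i++) {
                    best = 1
                    bestd = (x[i] - center[1]) ^ 2
                    for (c = 2; c <= k; c++) {
                        d = (x[i] - center[c]) ^ 2
                        if (d < bestd) { bestd = d; best = c }
                    }
                    sum[best] += x[i]
                    cnt[best]++
                }
                # Update step: move each center to the mean of its points.
                moved = 0
                for (c = 1; c <= k; c++) {
                    if (cnt[c] == 0) continue
                    updated = sum[c] / cnt[c]
                    if (updated != center[c]) moved = 1
                    center[c] = updated
                }
                # Stop early once no center changes.
                if (!moved) break
            }
            for (c = 1; c <= k; c++) print center[c]
        }
    '
}

# Two obvious groups: {1, 2, 3} and {10, 11, 12}.
printf '1\n2\n3\n10\n11\n12\n' | kmeans_1d 2 10
# prints 2 and 11 on separate lines
```

Each pass of the outer loop plays the role of one numbered `clusters-N` directory in the job's output: an assignment of points to centers followed by a recomputation of the centers.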
4. Inspect the output
[huser@master hadoop]$ hadoop-1.2.1/bin/hadoop fs -ls ./output
Warning: $HADOOP_HOME is deprecated.
Found 15 items
-rw-r--r-- 1 huser supergroup 194 2014-04-17 14:18 /user/huser/output/_policy
drwxr-xr-x - huser supergroup 0 2014-04-17 14:19 /user/huser/output/clusteredPoints
drwxr-xr-x - huser supergroup 0 2014-04-17 14:10 /user/huser/output/clusters-0
drwxr-xr-x - huser supergroup 0 2014-04-17 14:13 /user/huser/output/clusters-1
drwxr-xr-x - huser supergroup 0 2014-04-17 14:18 /user/huser/output/clusters-10-final
drwxr-xr-x - huser supergroup 0 2014-04-17 14:14 /user/huser/output/clusters-2
drwxr-xr-x - huser supergroup 0 2014-04-17 14:14 /user/huser/output/clusters-3
drwxr-xr-x - huser supergroup 0 2014-04-17 14:15 /user/huser/output/clusters-4
drwxr-xr-x - huser supergroup 0 2014-04-17 14:15 /user/huser/output/clusters-5
drwxr-xr-x - huser supergroup 0 2014-04-17 14:16 /user/huser/output/clusters-6
drwxr-xr-x - huser supergroup 0 2014-04-17 14:17 /user/huser/output/clusters-7
drwxr-xr-x - huser supergroup 0 2014-04-17 14:17 /user/huser/output/clusters-8
drwxr-xr-x - huser supergroup 0 2014-04-17 14:18 /user/huser/output/clusters-9
drwxr-xr-x - huser supergroup 0 2014-04-17 14:10 /user/huser/output/data
drwxr-xr-x - huser supergroup 0 2014-04-17 14:10 /user/huser/output/random-seeds
[huser@master hadoop]$ hadoop-1.2.1/bin/hadoop fs -ls ./output/data
Warning: $HADOOP_HOME is deprecated.
Found 3 items
-rw-r--r-- 1 huser supergroup 0 2014-04-17 14:10 /user/huser/output/data/_SUCCESS
drwxr-xr-x - huser supergroup 0 2014-04-17 14:07 /user/huser/output/data/_logs
-rw-r--r-- 1 huser supergroup 335470 2014-04-17 14:10 /user/huser/output/data/part-m-00000
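In the listing of ./output, the directory whose name ends in -final (clusters-10-final in this run) holds the result of the last iteration. Because the iteration count can differ between runs, a script should look the name up rather than hard-code it. The sketch below assumes the same `fs -ls` column layout as above; `final_clusters_dir` is a hypothetical helper.

```shell
#!/bin/sh
# final_clusters_dir
# Hypothetical helper: reads an "hadoop fs -ls" listing on stdin and prints
# the path of the directory named clusters-<N>-final.
final_clusters_dir() {
    awk '$1 ~ /^d/ && $NF ~ /\/clusters-[0-9]+-final$/ { print $NF }'
}

# Demo on three lines taken from the listing in this walkthrough.
output_listing='drwxr-xr-x - huser supergroup 0 2014-04-17 14:18 /user/huser/output/clusters-9
drwxr-xr-x - huser supergroup 0 2014-04-17 14:18 /user/huser/output/clusters-10-final
drwxr-xr-x - huser supergroup 0 2014-04-17 14:10 /user/huser/output/data'
printf '%s\n' "$output_listing" | final_clusters_dir
# prints /user/huser/output/clusters-10-final
```

On the cluster this could be fed from `hadoop-1.2.1/bin/hadoop fs -ls ./output`. The resulting path is a plausible input for Mahout's `clusterdump` utility (for example with `-i` for the cluster directory and `-o` for a local text file), which was not run in the original walkthrough.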
Original article: http://www.cnblogs.com/guarder/p/3705357.html