
Spark Word Count

Posted: 2018-10-09 00:50:17


import org.apache.spark.{SparkConf, SparkContext}

object WordCount {

  def main(args:Array[String]): Unit = {
    val conf = new SparkConf().setAppName("WordCount")
    val sc = new SparkContext(conf)
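    // args(0): input path; each RDD element is one line of text
    val lines = sc.textFile(args(0))
    // Split each line on single spaces, pair every word with 1, then sum the 1s per word
    val wordCount = lines.flatMap(_.split(" ")).map(x => (x,1)).reduceByKey(_ + _)
    // Swap to (count, word), sort by count in descending order, then swap back to (word, count)
    val wordSort = wordCount.map(x => (x._2,x._1)).sortByKey(false).map(x => (x._2,x._1))
    // args(1): output directory; it must not already exist
    wordSort.saveAsTextFile(args(1))
    sc.stop()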
  }

}
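
To see what each transformation in the job contributes, the same pipeline can be sketched on ordinary Scala collections, with no Spark dependency. This is only an illustration: `LocalWordCount`, `wordCount` and the sample sentences are made up for this sketch, `groupBy` plus `sum` stands in for `reduceByKey`, and ties are broken alphabetically so the result is deterministic (the Spark job above does not define an order for equal counts).

```scala
object LocalWordCount {
  // Hypothetical helper: returns (word, count) pairs sorted by count, descending.
  def wordCount(lines: Seq[String]): Seq[(String, Int)] =
    lines
      .flatMap(_.split(" "))      // one element per word, as in the Spark job
      .map(word => (word, 1))     // pair every word with 1
      .groupBy(_._1)              // local stand-in for the grouping done by reduceByKey
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) }
      .toSeq
      .sortBy { case (word, count) => (-count, word) } // highest count first, ties alphabetical

  def main(args: Array[String]): Unit = {
    // Made-up sample input
    val sample = Seq("spark makes word count easy", "word count with spark", "spark")
    wordCount(sample).foreach(println)
  }
}
```

Because the split is on a single space, as in the job above, runs of consecutive spaces would produce empty strings that are counted as words; splitting on `"\\s+"` is one way to avoid that if the input is not cleanly delimited.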

spark-submit --class WordCount \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 10 \
  --executor-memory 6G \
  --executor-cores 4 \
  --driver-memory 1G \
  /tmp/spark_practice/sparkPrj.jar \
  /tmp/spark_practice/ghEmployees.txt \
  /tmp/spark_practice/output
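
The executor settings from the command line can also be expressed as Spark configuration properties. The sketch below is an assumption-laden variant of the job, not part of the original post: the object name `WordCountWithConf` is hypothetical, and it sets the properties that correspond to `--num-executors`, `--executor-memory` and `--executor-cores`. Driver memory is left on the command line, because the driver JVM is already running by the time this code executes.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical variant of the job that carries its executor settings in code.
object WordCountWithConf {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("WordCount")
      .set("spark.executor.instances", "10") // same as --num-executors 10
      .set("spark.executor.memory", "6g")    // same as --executor-memory 6G
      .set("spark.executor.cores", "4")      // same as --executor-cores 4
    val sc = new SparkContext(conf)

    // Same pipeline as above: count words, then sort by count in descending order
    sc.textFile(args(0))
      .flatMap(_.split(" "))
      .map(word => (word, 1))
      .reduceByKey(_ + _)
      .sortBy(_._2, ascending = false)
      .saveAsTextFile(args(1))

    sc.stop()
  }
}
```

Values set directly on `SparkConf` take precedence over the matching `spark-submit` flags, so hard-coding them like this is mainly useful for a job whose resource needs are fixed.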

spark-submit parameter configuration: https://www.cnblogs.com/haoyy/p/6893943.html


Original article: https://www.cnblogs.com/xdlaoliu/p/9757734.html
