
Running a Spark program with java -jar: fixing the error where your own classes cannot be found

Posted: 2014-11-23 23:24:10

Tags: spark, scala

Error message:

.....

14/11/23 06:04:10 ERROR TaskSetManager: Task 2.0:1 failed 4 times; aborting job
14/11/23 06:04:10 INFO DAGScheduler: Failed to run sortByKey at Main.scala:29
Exception in thread "main" org.apache.spark.SparkException: Job aborted: Task 2.0:1 failed 4 times (most recent failure: Exception failure: java.lang.ClassNotFoundException: youling.studio.Main$$anonfun$2)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1020)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1018)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$abortStage(DAGScheduler.scala:1018)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:604)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:604)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:604)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:190)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
at akka.actor.ActorCell.invoke(ActorCell.scala:456)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
at akka.dispatch.Mailbox.run(Mailbox.scala:219)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
14/11/23 06:04:10 INFO TaskSetManager: Loss was due to java.lang.ClassNotFoundException: youling.studio.Main$$anonfun$2 [duplicate 7]

....

youling.studio.Main is a class I wrote myself.

The cause: the workers cannot find this class. The driver compiles the job's closures into anonymous classes (such as Main$$anonfun$2) and ships serialized instances of them to the executors, but an executor JVM can only deserialize them if the application jar is on its classpath. Calling setJars tells Spark to distribute the jar to the workers:

val conf = new SparkConf()
conf.setMaster("spark://single:8081")
  .setSparkHome("/cloud/spark-0.9.1-bin-hadoop2")
  .setAppName("word count")
  .setJars(jars) // this line was missing; adding it fixes the error
  .set("spark.executor.memory", "200m")

The full code:

package youling.studio

import org.apache.spark.SparkContext._
import org.apache.spark.{SparkConf, SparkContext}

import scala.collection.mutable.ListBuffer

/**
 * Created by Administrator on 2014/11/23.
 */
object Main {
  def main (args: Array[String]) {
    if (args.length != 3) {
      println("CMD: java -jar *.jar jars input output")
      System.exit(0)
    }

    // args(0): comma-separated list of jars to ship to the workers
    val jars = ListBuffer[String]()
    args(0).split(',').foreach(jars += _)

    val conf = new SparkConf()
    conf.setMaster("spark://single:8081")
      .setSparkHome("/cloud/spark-0.9.1-bin-hadoop2")
      .setAppName("word count")
      .setJars(jars) // without this line the workers cannot load our classes
      .set("spark.executor.memory", "200m")

    val sc = new SparkContext(conf)
    val data = sc.textFile(args(1))

    data.cache

    println(data.count)

    // word count, sorted by frequency in descending order
    data.flatMap(_.split(" "))
      .map((_, 1))
      .reduceByKey(_ + _)
      .map(x => (x._2, x._1)) // swap to (count, word) so sortByKey sorts by count
      .sortByKey(false)
      .map(x => (x._2, x._1)) // swap back to (word, count)
      .saveAsTextFile(args(2))
  }
}
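
For reference, a hypothetical invocation of the packaged program; the jar name and HDFS paths below are made-up examples, and the first argument (the comma-separated jar list) must include the application jar itself so that setJars can ship it:

java -jar wordcount.jar /cloud/wordcount.jar hdfs://single:9000/input hdfs://single:9000/output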



Original article: http://blog.csdn.net/u010670689/article/details/41420727
