Tags: spark scala
Error message:
.....
14/11/23 06:04:10 ERROR TaskSetManager: Task 2.0:1 failed 4 times; aborting job
14/11/23 06:04:10 INFO DAGScheduler: Failed to run sortByKey at Main.scala:29
Exception in thread "main" org.apache.spark.SparkException: Job aborted: Task 2.0:1 failed 4 times (most recent failure: Exception failure:
java.lang.ClassNotFoundException: youling.studio.Main$$anonfun$2)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1020)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$org$apache$spark$scheduler$DAGScheduler$$abortStage$1.apply(DAGScheduler.scala:1018)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$abortStage(DAGScheduler.scala:1018)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:604)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$processEvent$10.apply(DAGScheduler.scala:604)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:604)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:190)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
at akka.actor.ActorCell.invoke(ActorCell.scala:456)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
at akka.dispatch.Mailbox.run(Mailbox.scala:219)
at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
14/11/23 06:04:10 INFO TaskSetManager: Loss was due to java.lang.ClassNotFoundException: youling.studio.Main$$anonfun$2 [duplicate 7]
....
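Here youling.studio.Main is a class I wrote myself.
Cause: the worker cannot find this class. The application jar was never shipped to the workers, so the executors cannot load the anonymous function class youling.studio.Main$$anonfun$2 that the job refers to.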
val conf = new SparkConf()
conf.setMaster("spark://single:8081")
  .setSparkHome("/cloud/spark-0.9.1-bin-hadoop2")
  .setAppName("word count")
  .setJars(jars) // this line was missing; adding it fixed the error
  .set("spark.executor.memory", "200m")
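Instead of typing the jar path by hand, the program can ask the JVM where one of its own classes was loaded from. The sketch below is an illustration and not part of the original program: `JarLocator` and `locationOf` are hypothetical names, and it relies only on the standard JVM calls `getProtectionDomain` and `getCodeSource`. It is only useful when the program really runs from a jar, as it does with java -jar.

```scala
import java.io.File
import scala.util.Try

// Hypothetical helper, not part of the original post.
object JarLocator {
  /** Path of the jar (or class directory) that `cls` was loaded from, if the JVM knows it. */
  def locationOf(cls: Class[_]): Option[String] =
    for {
      domain <- Option(cls.getProtectionDomain)
      source <- Option(domain.getCodeSource)   // null for JDK bootstrap classes
      url    <- Option(source.getLocation)
      path   <- Try(new File(url.toURI).getPath).toOption
    } yield path

  def main(args: Array[String]): Unit = {
    // Keep the result only if it is a jar; a bare class directory cannot be shipped as-is.
    val jars = locationOf(getClass).filter(_.endsWith(".jar")).toSeq
    println("jars to pass to setJars: " + jars.mkString(","))
  }
}
```

The resulting `jars` sequence could then be handed to `.setJars(jars)` in place of a command-line argument. If the list comes back empty (for example when running from an IDE), the path still has to be supplied explicitly.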
Full code:
package youling.studio

import org.apache.spark.SparkContext._
import org.apache.spark.{SparkConf, SparkContext}

/**
 * Created by Administrator on 2014/11/23.
 */
object Main {
  def main(args: Array[String]) {
    // Expected arguments: comma-separated jar list, input path, output path
    if (args.length != 3) {
      println("CMD:java -jar *.jar jars input output")
      System.exit(1)
    }
    // Jars to ship to the workers so they can load this application's classes
    val jars = args(0).split(',').toSeq
    val conf = new SparkConf()
    conf.setMaster("spark://single:8081")
      .setSparkHome("/cloud/spark-0.9.1-bin-hadoop2")
      .setAppName("word count")
      .setJars(jars)
      .set("spark.executor.memory", "200m")
    val sc = new SparkContext(conf)
    val data = sc.textFile(args(1))
    data.cache
    println(data.count)
    // Count words, then sort by count in descending order
    data.flatMap(_.split(" "))
      .map((_, 1))
      .reduceByKey(_ + _)
      .map(x => (x._2, x._1))
      .sortByKey(false)
      .map(x => (x._2, x._1))
      .saveAsTextFile(args(2))
  }
}
Fixing the error where a Spark program run with java -jar cannot find a user-written class
Original article: http://blog.csdn.net/u010670689/article/details/41420727