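In the two articles on task scheduling, "Spark Source Code Analysis, Part 5: Task Scheduling (1)" and "Spark Source Code Analysis, Part 6: Task Scheduling (2)", we covered the main scheduling logic. At the very end of that logic, in the makeOffers() method of DriverEndpoint (an inner class of CoarseGrainedSchedulerBackend), a call to TaskSchedulerImpl's resourceOffers() method returns a sequence of sequences of TaskDescription, Seq[Seq[TaskDescription]]. The relevant code is: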
-
- launchTasks(scheduler.resourceOffers(workOffers))
- def resourceOffers(offers: Seq[WorkerOffer]): Seq[Seq[TaskDescription]] = synchronized {
TaskDescription is simple: it describes a Task that is handed to an executor and is about to run there, and it is normally created by TaskSetManager's resourceOffer() method. The code is:
- private[spark] class TaskDescription(
- val taskId: Long,
- val attemptNumber: Int,
- val executorId: String,
- val name: String,
- val index: Int,
- _serializedTask: ByteBuffer)
- extends Serializable {
-
-
-
- private val buffer = new SerializableBuffer(_serializedTask)
-
-
- def serializedTask: ByteBuffer = buffer.value
-
-
- override def toString: String = "TaskDescription(TID=%d, index=%d)".format(taskId, index)
- }
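At this point we have the Seq[Seq[TaskDescription]], meaning each Task has been scheduled onto an executor, but only logically: nothing has actually been sent to an executor yet. The next step is to really dispatch each Task to its assigned executor for execution, which is the subject of this article: running a Task. It starts in the launchTasks() method of DriverEndpoint, the inner class of CoarseGrainedSchedulerBackend mentioned above: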
- private def launchTasks(tasks: Seq[Seq[TaskDescription]]) {
-
-
- for (task <- tasks.flatten) {
-
-
- val serializedTask = ser.serialize(task)
-
-
-
- if (serializedTask.limit >= akkaFrameSize - AkkaUtils.reservedSizeBytes) {
-
-
- scheduler.taskIdToTaskSetManager.get(task.taskId).foreach { taskSetMgr =>
- try {
- var msg = "Serialized task %s:%d was %d bytes, which exceeds max allowed: " +
- "spark.akka.frameSize (%d bytes) - reserved (%d bytes). Consider increasing " +
- "spark.akka.frameSize or using broadcast variables for large values."
- msg = msg.format(task.taskId, task.index, serializedTask.limit, akkaFrameSize,
- AkkaUtils.reservedSizeBytes)
-
-
- taskSetMgr.abort(msg)
- } catch {
- case e: Exception => logError("Exception in error callback", e)
- }
- }
- }
- else {
-
-
- val executorData = executorDataMap(task.executorId)
-
-
- executorData.freeCores -= scheduler.CPUS_PER_TASK
-
-
- executorData.executorEndpoint.send(LaunchTask(new SerializableBuffer(serializedTask)))
- }
- }
- }
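The logic of launchTasks() is simple. It flattens the incoming TaskDescription sequences and, for each Task, does the following:
1. Serialize the Task to get serializedTask.
2. Check the size of serializedTask:
2.1. If the size reaches or exceeds the limit, that is, the configured maximum Akka message size (akkaFrameSize) minus the bytes an Akka message reserves for everything other than the serialized task or task result (AkkaUtils.reservedSizeBytes), look up the TaskSetManager for the task's taskId in TaskSchedulerImpl's taskIdToTaskSetManager and call its abort() method, marking that TaskSetManager as failed.
2.2. If the size is below the limit:
2.2.1. Look up the executor's descriptor, executorData, in executorDataMap by task.executorId.
2.2.2. Decrease freeCores in executorData by scheduler.CPUS_PER_TASK.
2.2.3. Use executorData.executorEndpoint, the driver-side reference to the executor's communication endpoint, to send a LaunchTask message that carries the serialized task, handing the Task to the executor to run.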
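The branching above can be sketched in isolation. The following is a minimal illustration, not Spark's implementation: TaskDesc, ExecutorData, Outcome and the inbox buffer are hypothetical stand-ins for TaskDescription, the real ExecutorData, the abort()/send() side effects and the RPC endpoint, and the byte limits are passed in as plain parameters.

```scala
import scala.collection.mutable

object LaunchTasksSketch {
  // Hypothetical, simplified stand-ins for Spark's internal types.
  final case class TaskDesc(taskId: Long, executorId: String, payload: Array[Byte])

  final class ExecutorData(var freeCores: Int) {
    // Task ids "sent" to this executor; stands in for executorEndpoint.send(...).
    val inbox = mutable.ArrayBuffer.empty[Long]
  }

  sealed trait Outcome
  final case class Launched(taskId: Long) extends Outcome
  final case class Aborted(taskId: Long, reason: String) extends Outcome

  def launchTasks(
      tasks: Seq[Seq[TaskDesc]],
      executors: mutable.Map[String, ExecutorData],
      maxFrameBytes: Int,
      reservedBytes: Int,
      cpusPerTask: Int): Seq[Outcome] = {
    val limit = maxFrameBytes - reservedBytes
    tasks.flatten.map { task =>
      val size = task.payload.length
      if (size >= limit) {
        // Failure path: the real code aborts the owning TaskSetManager here.
        Aborted(task.taskId, s"serialized task was $size bytes, limit is $limit bytes")
      } else {
        // Normal path: reserve cores on the executor, then hand the task over.
        val data = executors(task.executorId)
        data.freeCores -= cpusPerTask
        data.inbox += task.taskId
        Launched(task.taskId)
      }
    }
  }
}
```

Note that the comparison is `>=`, as in the original code, so a task whose size equals the limit exactly is also rejected.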
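Next, let us look at this flow in detail.
We start with the failure path, where the serialized Task exceeds the size limit and the TaskSet is marked as failed. The entry point is TaskSetManager's abort() method: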
- def abort(message: String, exception: Option[Throwable] = None): Unit = sched.synchronized {
-
-
-
- sched.dagScheduler.taskSetFailed(taskSet, message, exception)
-
-
- isZombie = true
-
-
- maybeFinishTaskSet()
- }
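abort() does three things:
First, it calls DAGScheduler's taskSetFailed() method to mark the TaskSet as failed.
Second, it sets the isZombie flag to true.
Third, if certain conditions are met, it marks the TaskSet as finished.
Let us look at DAGScheduler's taskSetFailed() first: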
- def taskSetFailed(taskSet: TaskSet, reason: String, exception: Option[Throwable]): Unit = {
- eventProcessLoop.post(TaskSetFailed(taskSet, reason, exception))
- }
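As with the job scheduling model described in the second article, "Spark Source Code Analysis, Part 2: Job Scheduling Model and Run Feedback", everything is driven by the event queue eventProcessLoop. Here a TaskSetFailed event is posted to eventProcessLoop. The event-dispatch method doOnReceive(), which serves DAGScheduler's event loop, defines how each event is handled: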
- case TaskSetFailed(taskSet, reason, exception) =>
- dagScheduler.handleTaskSetFailed(taskSet, reason, exception)
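The post-then-dispatch pattern can be shown with a small self-contained sketch. This is only an illustration under simplifying assumptions: SchedulerEvent, Stop, EventLoop and runUntilStopped are hypothetical names, the TaskSetFailed here carries just a stage id and a reason, and the queue is drained on the calling thread rather than on a dedicated event thread.

```scala
import java.util.concurrent.LinkedBlockingQueue
import scala.collection.mutable

object EventLoopSketch {
  sealed trait SchedulerEvent
  final case class TaskSetFailed(stageId: Int, reason: String) extends SchedulerEvent
  case object Stop extends SchedulerEvent // hypothetical shutdown marker

  final class EventLoop {
    private val queue = new LinkedBlockingQueue[SchedulerEvent]()
    val handled = mutable.ArrayBuffer.empty[String]

    // Producers only enqueue; they never run the handler themselves.
    def post(event: SchedulerEvent): Unit = queue.put(event)

    // Take events one at a time and dispatch on their type until Stop arrives.
    def runUntilStopped(): Unit = {
      var running = true
      while (running) {
        queue.take() match {
          case TaskSetFailed(stageId, reason) =>
            handled += s"abort stage $stageId: $reason"
          case Stop =>
            running = false
        }
      }
    }
  }
}
```

Because every event goes through one queue and one consumer, handlers run sequentially and need no locking among themselves, which is the main appeal of this design.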
Now let us look at handleTaskSetFailed():
- private[scheduler] def handleTaskSetFailed(
- taskSet: TaskSet,
- reason: String,
- exception: Option[Throwable]): Unit = {
-
-
- stageIdToStage.get(taskSet.stageId).foreach { abortStage(_, reason, exception) }
-
-
- submitWaitingStages()
- }
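This is straightforward. It looks up the Stage by the taskSet's stageId; because the lookup returns an Option, foreach calls abortStage() only if the Stage is found, which aborts it. It then calls submitWaitingStages() to submit any waiting stages. Let us look at abortStage() first: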
- private[scheduler] def abortStage(
- failedStage: Stage,
- reason: String,
- exception: Option[Throwable]): Unit = {
-
-
- if (!stageIdToStage.contains(failedStage.id)) {
-
- return
- }
-
-
- val dependentJobs: Seq[ActiveJob] =
- activeJobs.filter(job => stageDependsOn(job.finalStage, failedStage)).toSeq
-
-
- failedStage.latestInfo.completionTime = Some(clock.getTimeMillis())
-
-
- for (job <- dependentJobs) {
- failJobAndIndependentStages(job, s"Job aborted due to stage failure: $reason", exception)
- }
- if (dependentJobs.isEmpty) {
- logInfo("Ignoring failure of " + failedStage + " because all jobs depending on it are done")
- }
- }
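This method has four main steps:
1. If stageIdToStage does not contain the stage, the stage has already been removed, so return immediately. This guards against an abnormal situation.
2. Iterate over activeJobs, calling stageDependsOn() for each ActiveJob, to find the jobs whose final stage depends on failedStage, that is, the jobs that have failedStage among their ancestor stages. These are the dependentJobs.
3. Record the completion time, completionTime, of failedStage.
4. Iterate over dependentJobs and call failJobAndIndependentStages() for each one. If there are none, only a log message notes that the failure is being ignored.
The rest is easy; the two methods worth a closer look are stageDependsOn() and failJobAndIndependentStages(). First, stageDependsOn():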
-
- private def stageDependsOn(stage: Stage, target: Stage): Boolean = {
-
-
- if (stage == target) {
- return true
- }
-
-
- val visitedRdds = new HashSet[RDD[_]]
-
-
-
-
- val waitingForVisit = new Stack[RDD[_]]
-
-
- def visit(rdd: RDD[_]) {
-
- if (!visitedRdds(rdd)) {
-
- visitedRdds += rdd
-
-
- for (dep <- rdd.dependencies) {
- dep match {
-
- case shufDep: ShuffleDependency[_, _, _] =>
-
-
- val mapStage = getShuffleMapStage(shufDep, stage.firstJobId)
- if (!mapStage.isAvailable) {
- waitingForVisit.push(mapStage.rdd)
- }
-
- case narrowDep: NarrowDependency[_] =>
- waitingForVisit.push(narrowDep.rdd)
- }
- }
- }
- }
-
-
- waitingForVisit.push(stage.rdd)
-
-
- while (waitingForVisit.nonEmpty) {
- visit(waitingForVisit.pop())
- }
-
-
- visitedRdds.contains(target.rdd)
- }
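This method checks whether the stage argument depends on the target argument, in other words whether target is stage itself or one of its ancestor stages. Its style matches the code for stage division and submission discussed in the two earlier articles, so I will not repeat that here. Starting from the stage's rdd, it walks up the chain of RDD dependencies, adding every RDD it visits to visitedRdds, and finally answers by checking whether visitedRdds contains target's rdd. One detail worth noting: for a NarrowDependency the parent RDD is pushed onto waitingForVisit directly, while for a ShuffleDependency the method first checks isAvailable on the corresponding map stage and pushes that stage's RDD only if it is false. I covered isAvailable in detail in "Spark Source Code Analysis, Part 4: Stage Submission".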
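The traversal can be reproduced on a toy graph. The sketch below is an assumption-laden miniature: Node, Narrow and Shuffle are hypothetical stand-ins for RDDs and their dependencies, stages are collapsed into their RDD nodes, and the `available` flag plays the role of the map stage's isAvailable.

```scala
import scala.collection.mutable

object StageDependsOnSketch {
  // Hypothetical miniature of an RDD lineage graph.
  sealed trait Dep { def parent: Node }
  final case class Narrow(parent: Node) extends Dep
  // `available` mimics a map stage whose shuffle outputs are all present.
  final case class Shuffle(parent: Node, available: Boolean) extends Dep
  final class Node(val name: String, val deps: List[Dep])

  def dependsOn(start: Node, target: Node): Boolean =
    if (start eq target) true
    else {
      val visited = mutable.HashSet.empty[Node]
      var waiting = List(start) // explicit stack, so deep lineages cannot overflow the call stack
      while (waiting.nonEmpty) {
        val node = waiting.head
        waiting = waiting.tail
        if (visited.add(node)) { // add returns false when the node was already visited
          node.deps.foreach {
            case Narrow(p) => waiting = p :: waiting
            case Shuffle(p, available) => if (!available) waiting = p :: waiting
          }
        }
      }
      visited.contains(target)
    }
}
```

With nodes a, b (narrow on a) and c (unavailable shuffle on b), `dependsOn(c, a)` follows c, b, a and succeeds; if the shuffle is marked available, the walk stops at the shuffle boundary and a is never reached.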
Next, failJobAndIndependentStages(). Its main job is to fail a Job together with all stages that are used only by that Job, and to clean up the related state. The code is:
-
- private def failJobAndIndependentStages(
- job: ActiveJob,
- failureReason: String,
- exception: Option[Throwable] = None): Unit = {
-
-
- val error = new SparkException(failureReason, exception.getOrElse(null))
-
-
- var ableToCancelStages = true
-
-
- val shouldInterruptThread =
- if (job.properties == null) false
- else job.properties.getProperty(SparkContext.SPARK_JOB_INTERRUPT_ON_CANCEL, "false").toBoolean
-
-
-
-
-
- val stages = jobIdToStageIds(job.jobId)
-
-
- if (stages.isEmpty) {
- logError("No stages registered for job " + job.jobId)
- }
-
-
- stages.foreach { stageId =>
-
-
- val jobsForStage: Option[HashSet[Int]] = stageIdToStage.get(stageId).map(_.jobIds)
-
-
- if (jobsForStage.isEmpty || !jobsForStage.get.contains(job.jobId)) {
- logError(
- "Job %d not registered for stage %d even though that stage was registered for the job"
- .format(job.jobId, stageId))
- } else if (jobsForStage.get.size == 1) {
-
- if (!stageIdToStage.contains(stageId)) {
- logError(s"Missing Stage for stage with id $stageId")
- } else {
-
-
- val stage = stageIdToStage(stageId)
- if (runningStages.contains(stage)) {
- try {
-
-
- taskScheduler.cancelTasks(stageId, shouldInterruptThread)
-
-
- markStageAsFinished(stage, Some(failureReason))
- } catch {
- case e: UnsupportedOperationException =>
- logInfo(s"Could not cancel tasks for stage $stageId", e)
- ableToCancelStages = false
- }
- }
- }
- }
- }
-
- if (ableToCancelStages) {
-
-
- job.listener.jobFailed(error)
-
-
- cleanupStateForJobAndIndependentStages(job)
-
-
- listenerBus.post(SparkListenerJobEnd(job.jobId, clock.getTimeMillis(), JobFailed(error)))
- }
- }
The processing is simple enough that readers can work through it from the source above, so I will skip the walkthrough here.
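The central idea, selecting the stages that belong to the failing job and to no other job, can be sketched on plain maps. This is an illustration only: stagesOnlyUsedBy and the two immutable maps are hypothetical, and the real method additionally cancels tasks only for stages that are currently running and logs the inconsistent cases.

```scala
object IndependentStagesSketch {
  /** Returns the stage ids that are registered for `jobId` and for no other job. */
  def stagesOnlyUsedBy(
      jobId: Int,
      jobIdToStageIds: Map[Int, Set[Int]],
      stageIdToJobIds: Map[Int, Set[Int]]): Set[Int] =
    jobIdToStageIds.getOrElse(jobId, Set.empty[Int]).filter { stageId =>
      // A stage shared with another job must keep running for that job.
      stageIdToJobIds.get(stageId).exists(jobs => jobs == Set(jobId))
    }
}
```

For example, if job 0 uses stages 1 and 2 while job 1 uses stages 2 and 3, failing job 0 may cancel stage 1 only, since stage 2 is still needed by job 1.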
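Now for the normal path, where the serialized Task is within the size limit: how the LaunchTask message is sent and how the executor responds. We return to launchTasks() in DriverEndpoint, the inner class of CoarseGrainedSchedulerBackend. The normal path has three parts:
1. Look up the executor's descriptor, executorData, in executorDataMap by task.executorId.
2. Decrease freeCores in executorData accordingly.
3. Use executorData.executorEndpoint, the driver-side reference to the executor's communication endpoint, to send a LaunchTask message that carries the serialized task, handing the Task to the executor to run.
We focus on step 3: the driver uses the executorEndpoint held in executorData to send LaunchTask to the executor. So how does the executor receive the LaunchTask message? The answer is in CoarseGrainedExecutorBackend.
First, a word on CoarseGrainedExecutorBackend itself. The class is defined as follows: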
- private[spark] class CoarseGrainedExecutorBackend(
- override val rpcEnv: RpcEnv,
- driverUrl: String,
- executorId: String,
- hostPort: String,
- cores: Int,
- userClassPath: Seq[URL],
- env: SparkEnv)
- extends ThreadSafeRpcEndpoint with ExecutorBackend with Logging {
From the code above we can see that it mixes in two traits of interest, ThreadSafeRpcEndpoint and ExecutorBackend (along with Logging). ExecutorBackend is defined as:
- private[spark] trait ExecutorBackend {
-
-
-
- def statusUpdate(taskId: Long, state: TaskState, data: ByteBuffer)
- }
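So it naturally has two main jobs: first, as an endpoint, it provides communication between the driver and the executor; second, it reports status while the executor runs tasks.
What exactly is CoarseGrainedExecutorBackend? We will not dig into it now and leave that for later. For now it is enough to know that it is a backend helper process for the Executor, in a one-to-one relationship with it, providing two basic functions: communicating with the driver and reporting task status.
Next, how does CoarseGrainedExecutorBackend handle the LaunchTask message? As an RpcEndpoint, its receive() method, which handles all kinds of events and messages, contains this case: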
- case LaunchTask(data) =>
- if (executor == null) {
- logError("Received LaunchTask command but executor was null")
- System.exit(1)
- } else {
-
-
- val taskDesc = ser.deserialize[TaskDescription](data.value)
- logInfo("Got assigned task " + taskDesc.taskId)
-
-
- executor.launchTask(this, taskId = taskDesc.taskId, attemptNumber = taskDesc.attemptNumber,
- taskDesc.name, taskDesc.serializedTask)
- }
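It first checks whether the executor is null. If so, it logs an error and exits. Otherwise it proceeds as follows:
1. Deserialize the task to get taskDesc.
2. Call the executor's launchTask() method to launch the task.
So the focus shifts to Executor's launchTask() method: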
- def launchTask(
- context: ExecutorBackend,
- taskId: Long,
- attemptNumber: Int,
- taskName: String,
- serializedTask: ByteBuffer): Unit = {
-
-
- val tr = new TaskRunner(context, taskId = taskId, attemptNumber = attemptNumber, taskName,
- serializedTask)
-
-
- runningTasks.put(taskId, tr)
-
-
- threadPool.execute(tr)
- }
Very simple: create a TaskRunner, store the taskId-to-TaskRunner mapping in runningTasks, and hand the TaskRunner to the thread pool to execute.
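The register-then-submit pattern looks roughly like this in isolation. This is a minimal sketch rather than Spark's Executor: MiniExecutor, the `body` closure and the shutdown() helper are hypothetical, and a plain cached thread pool is assumed.

```scala
import java.util.concurrent.{ConcurrentHashMap, Executors, TimeUnit}

object ExecutorSketch {
  final class MiniExecutor {
    private val threadPool = Executors.newCachedThreadPool()
    // Tracks in-flight tasks so they can be looked up (for example, to kill them).
    val runningTasks = new ConcurrentHashMap[Long, Runnable]()

    def launchTask(taskId: Long, body: () => Unit): Unit = {
      val runner = new Runnable {
        override def run(): Unit =
          try body()
          finally runningTasks.remove(taskId) // always deregister, even on failure
      }
      // Register before submitting, so the task is visible as soon as it can run.
      runningTasks.put(taskId, runner)
      threadPool.execute(runner)
    }

    /** Stops accepting work and waits briefly; true if all tasks finished in time. */
    def shutdown(): Boolean = {
      threadPool.shutdown()
      threadPool.awaitTermination(5, TimeUnit.SECONDS)
    }
  }
}
```

Registering in the map before calling execute() matters: if the order were reversed, a very short task could finish and try to remove itself before it had been added, leaving a stale entry behind.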
Let us look at the TaskRunner class, starting with the class definition and its member variables:
- class TaskRunner(
- execBackend: ExecutorBackend,
- val taskId: Long,
- val attemptNumber: Int,
- taskName: String,
- serializedTask: ByteBuffer)
- extends Runnable {
-
-
-
-
-
- @volatile private var killed = false
-
-
- @volatile var startGCTime: Long = _
-
-
- @volatile var task: Task[Any] = _
- }
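From the definition we can see that TaskRunner implements Runnable. It is not itself a thread but a unit of work that a thread can run, which is why it can be submitted to a thread pool. Its main members are:
1. execBackend: the Executor's backend helper, providing the two basic functions of communicating with the driver and reporting status. The instance actually passed in is a CoarseGrainedExecutorBackend.
2. taskId: the unique identifier of the Task.
3. attemptNumber: which attempt of the Task this is. Like MapReduce, Spark can launch backup copies of straggling tasks (speculative execution), so the same task may run more than once, and attemptNumber distinguishes those runs. (The run() method shown later passes taskId as taskAttemptId, so taskId itself already appears to be unique per attempt.)
4. serializedTask: a ByteBuffer holding the serialized Task. Deserializing it yields the Task, whose run() method is then called to execute it.
5. killed: a flag indicating whether the Task has been killed.
6. task: of type Task[Any], the Task to run. It is set in run() when the task bytes from the driver are deserialized, and does not change once set.
7. startGCTime: the JVM's cumulative garbage-collection time sampled when the task starts running (run() assigns it from computeTotalGcTime()). Subtracting it from a later sample gives the GC time spent while the task ran.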
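To make the role of startGCTime concrete, here is a small sketch of the sample-and-subtract idea built on the standard JMX garbage-collector beans. The body of computeTotalGcTime below is an assumption about how such a total can be obtained, not a copy of Spark's code, and withGcTime is a hypothetical helper.

```scala
import java.lang.management.ManagementFactory

object GcTimeSketch {
  /** Total milliseconds spent in GC so far, summed over all collectors. */
  def computeTotalGcTime(): Long = {
    var total = 0L
    val beans = ManagementFactory.getGarbageCollectorMXBeans.iterator()
    while (beans.hasNext) {
      // getCollectionTime returns -1 when a collector does not report a time.
      total += math.max(0L, beans.next().getCollectionTime)
    }
    total
  }

  /** Runs `body` and returns its result with the GC time that elapsed meanwhile. */
  def withGcTime[A](body: => A): (A, Long) = {
    val startGCTime = computeTotalGcTime() // sample at task start
    val result = body
    (result, computeTotalGcTime() - startGCTime) // delta attributed to the task
  }
}
```

The delta covers all GC activity in the JVM during that window, so when several tasks run concurrently each of them is charged for the same collections.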
Since TaskRunner is a Runnable, it must provide a run() method, which is what executes when a thread in the pool picks it up. Here is its definition:
- override def run(): Unit = {
-
-
-
-
- val taskMemoryManager = new TaskMemoryManager(env.memoryManager, taskId)
-
-
- val deserializeStartTime = System.currentTimeMillis()
-
-
- Thread.currentThread.setContextClassLoader(replClassLoader)
-
-
- val ser = env.closureSerializer.newInstance()
- logInfo(s"Running $taskName (TID $taskId)")
-
-
- execBackend.statusUpdate(taskId, TaskState.RUNNING, EMPTY_BYTE_BUFFER)
- var taskStart: Long = 0
-
-
- startGCTime = computeTotalGcTime()
-
- try {
-
- val (taskFiles, taskJars, taskBytes) = Task.deserializeWithDependencies(serializedTask)
- updateDependencies(taskFiles, taskJars)
-
-
- task = ser.deserialize[Task[Any]](taskBytes, Thread.currentThread.getContextClassLoader)
-
-
- task.setTaskMemoryManager(taskMemoryManager)
-
-
-
-
- if (killed) {
-
-
-
-
- throw new TaskKilledException
- }
-
- logDebug("Task " + taskId + "'s epoch is " + task.epoch)
-
- env.mapOutputTracker.updateEpoch(task.epoch)
-
-
-
-
-
-
-
- taskStart = System.currentTimeMillis()
-
-
- var threwException = true
-
-
- val (value, accumUpdates) = try {
-
-
- val res = task.run(
- taskAttemptId = taskId,
- attemptNumber = attemptNumber,
- metricsSystem = env.metricsSystem)
-
-
- threwException = false
-
-
-
- res
- } finally {
-
-
- val freedMemory = taskMemoryManager.cleanUpAllAllocatedMemory()
- if (freedMemory > 0) {
- val errMsg = s"Managed memory leak detected; size = $freedMemory bytes, TID = $taskId"
- if (conf.getBoolean("spark.unsafe.exceptionOnMemoryLeak", false) && !threwException) {
- throw new SparkException(errMsg)
- } else {
- logError(errMsg)
- }
- }
- }
-
-
- val taskFinish = System.currentTimeMillis()
-
-
-
- if (task.killed) {
- throw new TaskKilledException
- }
-
-
-
-
- val resultSer = env.serializer.newInstance()
-
-
- val beforeSerialization = System.currentTimeMillis()
-
-
- val valueBytes = resultSer.serialize(value)
-
-
- val afterSerialization = System.currentTimeMillis()
-
-
- for (m <- task.metrics) {
-
-
- m.setExecutorDeserializeTime(
- (taskStart - deserializeStartTime) + task.executorDeserializeTime)
-
- m.setExecutorRunTime((taskFinish - taskStart) - task.executorDeserializeTime)
- m.setJvmGCTime(computeTotalGcTime() - startGCTime)
- m.setResultSerializationTime(afterSerialization - beforeSerialization)
- m.updateAccumulators()
- }
-
-
- val directResult = new DirectTaskResult(valueBytes, accumUpdates, task.metrics.orNull)
-
-
- val serializedDirectResult = ser.serialize(directResult)
-
-
- val resultSize = serializedDirectResult.limit
-
-
-
- val serializedResult: ByteBuffer = {
-
-
-
- if (maxResultSize > 0 && resultSize > maxResultSize) {
- logWarning(s"Finished $taskName (TID $taskId). Result is larger than maxResultSize " +
- s"(${Utils.bytesToString(resultSize)} > ${Utils.bytesToString(maxResultSize)}), " +
- s"dropping it.")
- ser.serialize(new IndirectTaskResult[Any](TaskResultBlockId(taskId), resultSize))
- }
-
-
- else if (resultSize >= akkaFrameSize - AkkaUtils.reservedSizeBytes) {
-
- val blockId = TaskResultBlockId(taskId)
- env.blockManager.putBytes(
- blockId, serializedDirectResult, StorageLevel.MEMORY_AND_DISK_SER)
- logInfo(
- s"Finished $taskName (TID $taskId). $resultSize bytes result sent via BlockManager)")
- ser.serialize(new IndirectTaskResult[Any](blockId, resultSize))
- }
-
- else {
- logInfo(s"Finished $taskName (TID $taskId). $resultSize bytes result sent to driver")
- serializedDirectResult
- }
- }
-
-
- execBackend.statusUpdate(taskId, TaskState.FINISHED, serializedResult)
-
- } catch {
-
- case ffe: FetchFailedException =>
- val reason = ffe.toTaskEndReason
- execBackend.statusUpdate(taskId, TaskState.FAILED, ser.serialize(reason))
-
- case _: TaskKilledException | _: InterruptedException if task.killed =>
- logInfo(s"Executor killed $taskName (TID $taskId)")
- execBackend.statusUpdate(taskId, TaskState.KILLED, ser.serialize(TaskKilled))
-
- case cDE: CommitDeniedException =>
- val reason = cDE.toTaskEndReason
- execBackend.statusUpdate(taskId, TaskState.FAILED, ser.serialize(reason))
-
- case t: Throwable =>
-
-
-
- logError(s"Exception in $taskName (TID $taskId)", t)
-
- val metrics: Option[TaskMetrics] = Option(task).flatMap { task =>
- task.metrics.map { m =>
- m.setExecutorRunTime(System.currentTimeMillis() - taskStart)
- m.setJvmGCTime(computeTotalGcTime() - startGCTime)
- m.updateAccumulators()
- m
- }
- }
- val serializedTaskEndReason = {
- try {
- ser.serialize(new ExceptionFailure(t, metrics))
- } catch {
- case _: NotSerializableException =>
-
- ser.serialize(new ExceptionFailure(t, metrics, false))
- }
- }
-
-
- execBackend.statusUpdate(taskId, TaskState.FAILED, serializedTaskEndReason)
-
-
-
- if (Utils.isFatalError(t)) {
- SparkUncaughtExceptionHandler.uncaughtException(t)
- }
-
- } finally {
-
-
- runningTasks.remove(taskId)
- }
- }
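That is one long method! Looking at it as a whole, though, it boils down to three steps:
1. Step 1: construct the Task and the helper objects it needs at run time.
2. Step 2: run the Task.
3. Step 3: process the Task's result.
Yes, it is that simple! Given limits on time and space, this article covers the main flow and leaves the details for the next one.
Let us go through the steps one by one. Step 1, constructing the Task and its helper objects, consists of:
1.1. Construct the TaskMemoryManager, taskMemoryManager.
1.2. Record the deserialization start time.
1.3. Set the context class loader on the current thread.
1.4. Obtain the closure serializer ser from SparkEnv.
1.5. Have execBackend report the state TaskState.RUNNING.
1.6. Sample the cumulative GC time as startGCTime.
1.7. Call Task.deserializeWithDependencies() to split the serialized Task into the files it needs (taskFiles), the jars it needs (taskJars), and the Task's own bytes (taskBytes), then call updateDependencies(taskFiles, taskJars).
1.8. Deserialize taskBytes to get the task instance.
1.9. Set the Task's memory manager.
1.10. If the Task has already been killed, throw an exception to exit quickly.
Next is Step 2, running the Task:
2.1. Record the task start time.
2.2. Initialize the flag threwException to true. It indicates whether the Task threw during execution and is cleared only after a successful run.
2.3. Call the Task's run() method to actually execute it, obtaining the result value and the accumulator updates accumUpdates.
2.4. Set threwException to false.
2.5. In the finally block, have taskMemoryManager free all memory that is still allocated. Any memory freed here counts as a leak: it is thrown as a SparkException if spark.unsafe.exceptionOnMemoryLeak is true and the task itself did not throw, and logged as an error otherwise.
2.6. Record the task finish time.
2.7. If the task was killed, throw TaskKilledException.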
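The threwException flag in Step 2 exists so that the leak check in the finally block never masks the task's own exception. A minimal sketch of that pattern, with hypothetical names (runWithLeakCheck, LeakException, and a cleanUp callback standing in for cleanUpAllAllocatedMemory()):

```scala
object LeakCheckSketch {
  final class LeakException(msg: String) extends RuntimeException(msg)

  def runWithLeakCheck[A](failOnLeak: Boolean)(body: => A)(cleanUp: () => Long): A = {
    var threwException = true // assume failure until the body completes
    try {
      val res = body
      threwException = false
      res
    } finally {
      val freed = cleanUp() // bytes that were still allocated when the body ended
      // Only raise the leak error when the body succeeded; otherwise the
      // original exception keeps propagating untouched.
      if (freed > 0 && failOnLeak && !threwException) {
        throw new LeakException(s"Managed memory leak detected; size = $freed bytes")
      }
    }
  }
}
```

Throwing from a finally block replaces any exception already in flight, which is exactly why the flag is checked first: a failing task should report its real error, not a secondary leak warning.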
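The last step, Step 3, processes the Task's result:
3.1. Obtain the result serializer from SparkEnv.
3.2. Record the time before serialization.
3.3. Serialize the result value with the result serializer to get valueBytes.
3.4. Record the time after serialization.
3.5. Update the metrics (not covered here).
3.6. Build a DirectTaskResult containing valueBytes and the accumulator updates.
3.7. Serialize the DirectTaskResult to get serializedDirectResult.
3.8. Get the size of the serialized result, resultSize.
3.9. Decide how to return the result:
3.9.1. If maxResultSize is positive and resultSize exceeds it, log a warning that the result is being dropped and serialize only an IndirectTaskResult, which is a reference (block id and size) to a DirectTaskResult. In this branch the code shown does not write the result to the BlockManager.
3.9.2. Otherwise, if resultSize reaches or exceeds the maximum Akka message size minus the reserved bytes, write the result to the BlockManager and serialize an IndirectTaskResult that refers to it.
3.9.3. Otherwise the result is small enough to be returned directly in the message.
3.10. Have execBackend report the state TaskState.FINISHED together with the serialized result.
Finally, whether the run succeeded or failed, the task is removed from runningTasks.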
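The three-way decision in 3.9 depends only on sizes, so it can be isolated as a pure function. In this sketch Route and its three cases are hypothetical labels, and maxDirectSize stands for akkaFrameSize minus AkkaUtils.reservedSizeBytes.

```scala
object ResultRoutingSketch {
  sealed trait Route
  case object DroppedTooLarge extends Route // only a block id reference is sent
  case object ViaBlockManager extends Route // stored locally, reference sent to the driver
  case object DirectToDriver extends Route  // sent inline with the status update

  def route(resultSize: Long, maxResultSize: Long, maxDirectSize: Long): Route =
    if (maxResultSize > 0 && resultSize > maxResultSize) DroppedTooLarge
    else if (resultSize >= maxDirectSize) ViaBlockManager
    else DirectToDriver
}
```

Two boundary details follow from the original conditions: a maxResultSize of 0 disables the first check entirely, and a result whose size equals maxDirectSize exactly already goes through the BlockManager.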
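That completes the main flow of running a Task. The remaining details, including what happens inside Task's run() method, the task memory manager, the serializers, accumulator updates, some of the exception handling, and status reporting, are left for the next article!
I have work tomorrow, so off to bed!
Original blog post: http://blog.csdn.net/lipeng_bigdata/article/details/50726216
Spark Source Code Analysis, Part 7: Task Execution (1)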
Source: http://www.cnblogs.com/jirimutu01/p/5274461.html