标签:spark spark core 源码
本节主要讲解SparkContext的逻辑
首先看一个spark自带的最简单的例子:
object SparkPi { def main(args: Array[String]) { val conf = new SparkConf().setAppName("Spark Pi") val spark = new SparkContext(conf) val slices = if (args.length > 0) args(0).toInt else 2 val n = math.min(100000L * slices, Int.MaxValue).toInt // avoid overflow val count = spark.parallelize(1 until n, slices).map { i => val x = random * 2 - 1 val y = random * 2 - 1 if (x*x + y*y < 1) 1 else 0 }.reduce(_ + _) println("Pi is roughly " + 4.0 * count / n) spark.stop() } }我们一般写spark程序的流程与此类似。从这个简单的程序中,逐步分析内部的原理。个人觉得这才是spark最精髓的地方,至于之前的master,worker的启动流程与一般的分布式系统无太多差别。
首先创建SparkConf,加载一些spark的配置信息。
创建SparkContext,在创建SparkContext时可以指定preferredNodeLocationData,也可以不指定。
SparkContext创建的过程比较复杂,我们只介绍比较重要的对象及方法
1、listenerBus中可添加各种SparkListener监听器,当任何SparkListenerEvent事件到来时,向所有注册进来的监听器发送事件// An asynchronous listener bus for Spark events private[spark] val listenerBus = new LiveListenerBus2、persistentRdds用于缓存RDD在内存中// Keeps track of all persisted RDDs private[spark] val persistentRdds = new TimeStampedWeakValueHashMap[Int, RDD[_]]3、创建SparkEnv -> 调用createDriverEnv<pre name="code" class="java" style="font-size: 12pt; background-color: rgb(255, 255, 255);">// Create the Spark execution environment (cache, map output tracker, etc) _env = createSparkEnv(_conf, isLocal, listenerBus)流程:1)创建driver的ActorRef,并包装在rpcEnv中
2)创建mapOutputTracker,实际类型为MapOutputTrackerMaster,用于跟踪map output的信息。并将该对象注册到MapOutputTrackerMasterEndpoint中。说明一下注册的作用:注册返回mapOutputTracker.trackerEndpoint(ActorRef类型),之后向该ActorRef发送消息会回调mapOutputTracker中的相关方法。比如发送AkkaMessage消息,会回调MapOutputTrackerMasterEndpoint的receiveAndReply或者receive方法。
3)创建shuffleManager,默认是org.apache.spark.shuffle.hash.HashShuffleManager
4)创建shuffleMemoryManager
5)创建blockTransferService默认是netty,shuffle时读取块的服务
6)创建blockManagerMaster,负责记录下所有BlockIds存储在哪个Worker上
7)创建blockManager,提供真正的接口用于读写
8)创建cacheManager,它是依赖于blockManager的,RDD在进行计算的时候,通过CacheManager来获取数据,并通过CacheManager来存储计算结果
9)创建
broadcastManager
10)创建
Driver比较简单,spark-submit在提交的时候会指定所要依赖的jar文件从哪里读取;Executor由worker来启动,worker需要下载Executor启动时所需要的jar文件。为了解决Executor启动时依赖的Jar问题,Driver在启动的时候要启动HttpFileServer存储第三方jar包,然后由worker从HttpFileServer来获取。httpFileServer,Driver和Executor在运行的时候都有可能存在第三方包依赖,
11)创建
outputCommitCoordinator
12)创建
executorMemoryManager
将上面的对象共同包装成SparkEnv
4、创建_metadataCleaner,定期清理元数据信息
5、创建executorEnvs,Executor相关的配置
6、_heartbeatReceiver,用于接收Executor的心跳,同时,也会起一个定时器检测Executor是否过期
createTaskScheduler方法创建_taskScheduler和_schedulerBackend7、调用
1)根据master来区分运行的逻辑,我们以standalone模式(spark://开头)为例讲解
2)taskscheduler实际创建的是TaskSchedulerImpl,backend实际是SparkDeploySchedulerBackend,而SparkDeploySchedulerBackend本身拓展自CoarseGrainedSchedulerBackend。CoarseGrainedSchedulerBackend是一个基于Akka Actor实现的粗粒度的资源调度类,在整个SparkJob运行期间,CoarseGrainedSchedulerBackend会监听并持有注册给它的Executor资源,并且接收Executor注册,状态更新,响应Scheduler请求等,根据现有Executor资源发起任务调度流程。总之,两者是互相协作,分工合作,共同完成整个任务调度的流程。
case SPARK_REGEX(sparkUrl) => val scheduler = new TaskSchedulerImpl(sc)//任务相关的调度 val masterUrls = sparkUrl.split(",").map("spark://" + _) val backend = new SparkDeploySchedulerBackend(scheduler, sc, masterUrls) scheduler.initialize(backend) (backend, scheduler)3)scheduler的初始化这里需要说明一下Pool的作用:每个SparkContext可能同时存在多个可运行的没有依赖关系任务集,这些任务集之间如何调度,则是由pool来决定的,默认是FIFO,其他还有Fair调度器def initialize(backend: SchedulerBackend) { this.backend = backend // temporarily set rootPool name to empty rootPool = new Pool("", schedulingMode, 0, 0) schedulableBuilder = { schedulingMode match { case SchedulingMode.FIFO => new FIFOSchedulableBuilder(rootPool) case SchedulingMode.FAIR => new FairSchedulableBuilder(rootPool, conf) } } schedulableBuilder.buildPools() }8、创建_dagScheduler,它是根据我们的程序来划分stage,构建有依赖关系的任务集。DAGscheduler内部会开启事件循环器,轮询处理接收到的事件9、调用_taskScheduler.start() -> backend.start(),创建driverEndpoint,用于向外界的交互,构建运行Executor所需要的环境,包括Appname,每个Executor上需要的cores、memory,classpath,jar以及参数,指定运行的类为org.apache.spark.executor.CoarseGrainedExecutorBackend,封装成ApplicationDescription。并将ApplicationDescription以及masters等封装成AppClient,作为App向masters提交的入口。
override def start() { super.start() // ...略 // val command = Command("org.apache.spark.executor.CoarseGrainedExecutorBackend", args, sc.executorEnvs, classPathEntries ++ testingClassPath, libraryPathEntries, javaOpts) val appUIAddress = sc.ui.map(_.appUIAddress).getOrElse("") val coresPerExecutor = conf.getOption("spark.executor.cores").map(_.toInt) val appDesc = new ApplicationDescription(sc.appName, maxCores, sc.executorMemory, command, appUIAddress, sc.eventLogDir, sc.eventLogCodec, coresPerExecutor) client = new AppClient(sc.env.actorSystem, masters, appDesc, this, conf) client.start() waitForRegistration() }查看client.start()内部,创建基于ClientActor对象的ActorRef,继续查看preStart() -> registerWithMasterdef tryRegisterAllMasters() { for (masterAkkaUrl <- masterAkkaUrls) { logInfo("Connecting to master " + masterAkkaUrl + "...") val actor = context.actorSelection(masterAkkaUrl) actor ! RegisterApplication(appDescription) } }可以看到,其实只是向masters的actorRef的发送RegisterApplication消息。我们继续看master收到这个消息如何处理?
在主master收到后,保存app的详细信息,创建appId,持久化app,并回馈RegisteredApplication消息,之后执行调度。调度流程在《spark core源码分析2 master启动流程》一节中已经介绍过了。
case RegisterApplication(description) => { if (state == RecoveryState.STANDBY) { // ignore, don't send response } else { logInfo("Registering app " + description.name) val app = createApplication(description, sender) registerApplication(app)//将app中的详细信息保存在master的内存各种数据结构中 logInfo("Registered app " + description.name + " with ID " + app.id) persistenceEngine.addApplication(app)//持久化app,用于主备切换时重构 sender ! RegisteredApplication(app.id, masterUrl) schedule()//调度 } }AppClient收到RegisteredApplication消息后,确定主master,并设置app状态为已注册,设置master传回的AppIdcase RegisteredApplication(appId_, masterUrl) => appId = appId_ registered = true changeMaster(masterUrl) listener.connected(appId)在《spark core源码分析2 master启动流程》一节中,我们讲了调度的master端的处理,当时还没有app注册上来,所以也就没有向worker发送启动Executor的命令。而此时我们已经注册了一个App了,所以master调用launchExecutor(worker, exec),向worker发送LaunchExecutor消息。同时,也会向Appclient发送ExecutorAdded消息。 worker端收到后创建工作目录,创建ExecutorRunner,ExecutorRunner启动后单独开辟一个线程处理,会根据之前包装的command启动一个进程,mainclass其实就是CoarseGrainedExecutorBackend,这些运行的参数等信息都已经被包含在appDesc中,由driver经master传递过来。处理完成之后,向master反馈ExecutorStateChanged消息
case LaunchExecutor(masterUrl, appId, execId, appDesc, cores_, memory_) => if (masterUrl != activeMasterUrl) { logWarning("Invalid Master (" + masterUrl + ") attempted to launch executor.") } else { try { logInfo("Asked to launch executor %s/%d for %s".format(appId, execId, appDesc.name)) // Create the executor's working directory val executorDir = new File(workDir, appId + "/" + execId) if (!executorDir.mkdirs()) { throw new IOException("Failed to create directory " + executorDir) } // Create local dirs for the executor. These are passed to the executor via the // SPARK_EXECUTOR_DIRS environment variable, and deleted by the Worker when the // application finishes. val appLocalDirs = appDirectories.get(appId).getOrElse { Utils.getOrCreateLocalRootDirs(conf).map { dir => Utils.createDirectory(dir, namePrefix = "executor").getAbsolutePath() }.toSeq } appDirectories(appId) = appLocalDirs val manager = new ExecutorRunner( appId, execId, appDesc.copy(command = Worker.maybeUpdateSSLSettings(appDesc.command, conf)), cores_, memory_, self, workerId, host, webUi.boundPort, publicAddress, sparkHome, executorDir, akkaUrl, conf, appLocalDirs, ExecutorState.LOADING) executors(appId + "/" + execId) = manager manager.start() coresUsed += cores_ memoryUsed += memory_ master ! ExecutorStateChanged(appId, execId, manager.state, None, None) } catch { case e: Exception => { logError(s"Failed to launch executor $appId/$execId for ${appDesc.name}.", e) if (executors.contains(appId + "/" + execId)) { executors(appId + "/" + execId).kill() executors -= appId + "/" + execId } master ! ExecutorStateChanged(appId, execId, ExecutorState.FAILED, Some(e.toString), None) } } }master收到消息后会根据Executor的状态来区分。那哪些时候会收到这些消息呢?(1)当CoarseGrainedExecutorBackend进程退出后,会向master发送ExecutorStateChanged,状态为EXITED。
(2)当AppClient收到ExecutorAdded消息后,会向master发送ExecutorStateChanged,状态为RUNNING
(3)当ExecutorRunner启动进程失败时,会向master发送ExecutorStateChanged,状态为FAILED
关于CoarseGrainedExecutorBackend进程的启动,即Executor的启动,我们下节再讲。真正的任务是运行在Executor中的,只有Executor进程正常启动之后,才能运行被分配的任务。我们先介绍_taskScheduler.start()之后的逻辑。10、下面主要就是初始化blockManager
_applicationId = _taskScheduler.applicationId() _applicationAttemptId = taskScheduler.applicationAttemptId() _conf.set("spark.app.id", _applicationId) _env.blockManager.initialize(_applicationId)def initialize(appId: String): Unit = { blockTransferService.init(this)//读取block shuffleClient.init(appId)//跟ShuffleServie有关,如果开关不打开,这里不处理 blockManagerId = BlockManagerId( executorId, blockTransferService.hostName, blockTransferService.port)//blockManager元信息 shuffleServerId = if (externalShuffleServiceEnabled) {<span style="font-family: Menlo;">//跟ShuffleServie有关,暂时不介绍</span> BlockManagerId(executorId, blockTransferService.hostName, externalShuffleServicePort) } else { blockManagerId } //向driver注册自己,注册时携带了自身的ActorRef,Driver收到后会将blockManagerId及自身的ActorRef放入hashmap中保存起来。 master.registerBlockManager(blockManagerId, maxMemory, slaveEndpoint) // Register Executors' configuration with the local shuffle service, if one should exist. if (externalShuffleServiceEnabled && !blockManagerId.isDriver) { registerWithExternalShuffleServer() } }
版权声明:本文为博主原创文章,未经博主允许不得转载。
标签:spark spark core 源码
原文地址:http://blog.csdn.net/yueqian_zhu/article/details/47977965