Spark 2.1.1
After connecting to Spark Thrift Server with beeline, executing "use database" sometimes hangs. On the server side, "use database" corresponds to setCurrentDatabase.
Investigation showed that at the time of the hang, Spark Thrift Server was executing an insert:
org.apache.spark.sql.hive.execution.InsertIntoHiveTable
protected override def doExecute(): RDD[InternalRow] = {
  sqlContext.sparkContext.parallelize(sideEffectResult.asInstanceOf[Seq[InternalRow]], 1)
}
...
@transient private val externalCatalog = sqlContext.sharedState.externalCatalog

protected[sql] lazy val sideEffectResult: Seq[InternalRow] = {
  ...
  externalCatalog.loadDynamicPartitions(
  externalCatalog.getPartitionOption(
  externalCatalog.loadPartition(
  externalCatalog.loadTable(
As shown, an insert may call catalog methods such as loadDynamicPartitions, getPartitionOption, loadPartition, and loadTable:
org.apache.spark.sql.hive.client.HiveClientImpl
def loadTable(
    loadPath: String, // TODO URI
    tableName: String,
    replace: Boolean,
    holdDDLTime: Boolean): Unit = withHiveState {
  ...

def loadPartition(
    loadPath: String,
    dbName: String,
    tableName: String,
    partSpec: java.util.LinkedHashMap[String, String],
    replace: Boolean,
    holdDDLTime: Boolean,
    inheritTableSpecs: Boolean): Unit = withHiveState {
  ...

override def setCurrentDatabase(databaseName: String): Unit = withHiveState {
All of these HiveClientImpl methods run inside withHiveState, and withHiveState acquires a lock (a synchronized block). As a result, parts of the insert (for example loadPartition) and the "use database" call are serialized against each other: while a slow insert holds the lock, every other operation is blocked.
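This contention is easy to reproduce outside Spark. Below is a minimal sketch (hypothetical names, not Spark's actual implementation) of the withHiveState pattern described above: every client method funnels through one shared synchronized block, so a cheap setCurrentDatabase has to wait for a slow loadPartition to release the lock.

```scala
import java.util.concurrent.CountDownLatch
import scala.collection.mutable.ArrayBuffer

// Hypothetical stand-in for HiveClientImpl: all methods share one lock
// through withHiveState, mirroring the synchronized pattern in the real class.
object ClientSketch {
  private val lock = new Object
  val log = ArrayBuffer.empty[String]   // call ordering; only appended under the lock
  val entered = new CountDownLatch(1)   // signals that loadPartition holds the lock

  private def withHiveState[A](f: => A): A = lock.synchronized { f }

  def loadPartition(): Unit = withHiveState {
    log += "loadPartition-start"
    entered.countDown()
    Thread.sleep(100)                   // simulate a slow data load
    log += "loadPartition-end"
  }

  def setCurrentDatabase(db: String): Unit = withHiveState {
    log += s"use-$db"                   // what "use database" does server-side
  }
}

object Demo {
  def main(args: Array[String]): Unit = {
    // The "insert" thread takes the lock first; "use database" then has to wait.
    val insert = new Thread(new Runnable {
      def run(): Unit = ClientSketch.loadPartition()
    })
    insert.start()
    ClientSketch.entered.await()            // ensure the insert is inside the lock
    ClientSketch.setCurrentDatabase("test") // blocks until loadPartition returns
    insert.join()
    println(ClientSketch.log.mkString(", "))
    // prints: loadPartition-start, loadPartition-end, use-test
  }
}
```

The latch makes the interleaving deterministic: setCurrentDatabase is only attempted once the insert thread is inside the lock, so its log entry always comes last, which is exactly the blocking the beeline user observes.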
For details on how Spark Thrift Server works internally, see https://www.cnblogs.com/barneywill/p/10137672.html
[Original] Troubleshooting notes (18): beeline connections to Spark Thrift sometimes hang
Original article: https://www.cnblogs.com/barneywill/p/10145427.html