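Having run the word count example in spark-shell, we want to know what spark-shell itself actually does. The spark-shell script contains the following fragment, shown in Listing 1-1.
Listing 1-1 spark-shell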
    function main() {
      if $cygwin; then
        # Under Cygwin, put the terminal into raw, no-echo mode for JLine
        stty -icanon min 1 -echo > /dev/null 2>&1
        export SPARK_SUBMIT_OPTS="$SPARK_SUBMIT_OPTS -Djline.terminal=unix"
        "$FWDIR"/bin/spark-submit --class org.apache.spark.repl.Main "${SUBMISSION_OPTS[@]}" spark-shell "${APPLICATION_OPTS[@]}"
        # Restore the normal terminal settings once the shell exits
        stty icanon echo > /dev/null 2>&1
      else
        export SPARK_SUBMIT_OPTS
        "$FWDIR"/bin/spark-submit --class org.apache.spark.repl.Main "${SUBMISSION_OPTS[@]}" spark-shell "${APPLICATION_OPTS[@]}"
      fi
    }
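The spark-shell script invokes the spark-submit script in both branches, with org.apache.spark.repl.Main as the class to run. Opening spark-submit, we find that it contains the following line.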
    exec "$SPARK_HOME"/bin/spark-class org.apache.spark.deploy.SparkSubmit "${ORIG_ARGS[@]}"
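The forwarding step can be tried in isolation. The following is a minimal sketch, not Spark's code: `build_spark_class_args` is a hypothetical helper that puts the SparkSubmit class name in front of the caller's arguments and prints the resulting list one argument per line instead of exec-ing anything. It uses `"$@"`, which, like the quoted array expansion `"${ORIG_ARGS[@]}"`, keeps each argument as a single word even when it contains spaces.

```shell
# Hypothetical helper (not part of Spark): shows how a main-class name can be
# prepended to the original arguments without breaking their quoting.
build_spark_class_args() {
  # Rebuild the positional parameters with the class name in front.
  set -- org.apache.spark.deploy.SparkSubmit "$@"
  # Print one argument per line so word boundaries are visible.
  printf '%s\n' "$@"
}

# Demo: "my shell" stays a single argument on its own output line.
build_spark_class_args --name "my shell" spark-shell
```

An unquoted `$@` or `$*` would instead split `my shell` into two arguments, which is why the launcher scripts quote their expansions.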
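When spark-submit executes the spark-class script, it passes the class name org.apache.spark.deploy.SparkSubmit as an extra leading argument, followed by the original arguments. Opening spark-class, we find the following fragment, shown in Listing 1-2.
Listing 1-2 spark-class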
    # Prefer the JVM under JAVA_HOME; otherwise fall back to java on the PATH
    if [ -n "${JAVA_HOME}" ]; then
      RUNNER="${JAVA_HOME}/bin/java"
    else
      if [ `command -v java` ]; then
        RUNNER="java"
      else
        echo "JAVA_HOME is not set" >&2
        exit 1
      fi
    fi

    # Replace this shell process with the JVM; "$@" starts with the main class
    exec "$RUNNER" -cp "$CLASSPATH" $JAVA_OPTS "$@"
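At this point it should be clear that Spark starts a JVM process whose main class is SparkSubmit. The class org.apache.spark.repl.Main given by spark-shell is only an argument (--class) passed to SparkSubmit.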
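The whole chain can be modeled as a dry run. The sketch below is an illustration under simplifying assumptions, not the real scripts: the function names `spark_shell`, `spark_submit`, and `spark_class` are hypothetical stand-ins for the three scripts, all user arguments are treated as submission options, the `command -v java` check and `$JAVA_OPTS` are omitted, and the final command is printed rather than executed.

```shell
# Dry-run model of the launcher chain: spark-shell -> spark-submit -> spark-class.
# All three functions are illustrative stand-ins, not Spark's actual code.

spark_class() {
  # Pick the Java runner the same way the listing does (minus the PATH check).
  if [ -n "${JAVA_HOME:-}" ]; then
    runner="${JAVA_HOME}/bin/java"
  else
    runner="java"
  fi
  # Assumption: fall back to "." when CLASSPATH is unset, for display only.
  cp="${CLASSPATH:-.}"
  # Print the command that the real script would exec.
  printf '%s\n' "$runner -cp $cp $*"
}

spark_submit() {
  # spark-submit prepends the SparkSubmit main class.
  spark_class org.apache.spark.deploy.SparkSubmit "$@"
}

spark_shell() {
  # spark-shell asks SparkSubmit to run the REPL main class.
  spark_submit --class org.apache.spark.repl.Main "$@" spark-shell
}

# Demo: the printed command line starts the JVM with SparkSubmit as main class.
spark_shell --master "local[2]"
```

Reading the printed command from left to right shows that the first class name after the classpath is org.apache.spark.deploy.SparkSubmit, while org.apache.spark.repl.Main appears only as the value of `--class`.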
Note: the spark-shell script is a good example of Linux shell scripting best practices.
Original article: http://www.cnblogs.com/ducong/p/5263862.html