(This assumes Hadoop is already installed. Whether you have it installed or not, I do.)
1 Download a Spark package from http://spark.apache.org/downloads.html
Grab spark-1.6.1-bin-without-hadoop.tgz, the "Hadoop free" build.
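If you prefer to fetch it from the command line, something like the following works (URL assumed; old releases are served from archive.apache.org rather than the regular download mirrors):

# Assumed archive URL for the 1.6.1 "without hadoop" build
wget https://archive.apache.org/dist/spark/spark-1.6.1/spark-1.6.1-bin-without-hadoop.tgz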
2 Unpack the Spark tarball to a directory of your choice.
Then tell Spark where to find the Hadoop jars.
First run hadoop classpath and note down its output.
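For instance (the jar list printed will differ per installation; the /tmp path is just an example):

# Print Hadoop's jar search path and keep a copy for the next step
hadoop classpath | tee /tmp/hadoop-classpath.txt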
Then create spark-env.sh under Spark's conf directory
and put the following in it. (Note: replace hadoop classpath with the output of the command you just ran.)
### in conf/spark-env.sh ###
# If 'hadoop' binary is on your PATH
export SPARK_DIST_CLASSPATH=$(hadoop classpath)
# With explicit path to 'hadoop' binary
export SPARK_DIST_CLASSPATH=$(/path/to/hadoop/bin/hadoop classpath)
# Passing a Hadoop configuration directory
export SPARK_DIST_CLASSPATH=$(hadoop --config /path/to/configs classpath)
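One way to do that substitution is to let the shell expand the command once while writing the file. A minimal sketch, assuming Spark was unpacked to /opt/spark-1.6.1-bin-without-hadoop (adjust to your own directory):

# Double quotes make $(hadoop classpath) expand here, at write time,
# so the literal jar paths are baked into spark-env.sh
echo "export SPARK_DIST_CLASSPATH=$(hadoop classpath)" >> /opt/spark-1.6.1-bin-without-hadoop/conf/spark-env.sh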
3 Then you can launch pyspark or the like to get a shell.
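A quick smoke test (install directory assumed as above; sc is the SparkContext that pyspark creates for you):

cd /opt/spark-1.6.1-bin-without-hadoop
./bin/pyspark
>>> sc.parallelize(range(10)).sum()   # a tiny job; should print 45
45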
Original post: http://www.cnblogs.com/earendil/p/5582951.html