1. Install the JDK
2. Install Scala 2.10
Spark 1.0.2 depends on Scala 2.10, so we must install Scala 2.10.
Download scala-2.10.*.tgz and save it to the home directory (already present on sg206).
$ tar -zxvf scala-2.10.*.tgz
$ sudo mv scala-2.10.* /usr/lib    # move the extracted directory, not the tarball
$ sudo vim ~/.bash_profile
# add the following lines at the end
export SCALA_HOME=/usr/lib/scala-2.10.*
export PATH=$PATH:$SCALA_HOME/bin
# save and exit vim
# make the bash profile take effect immediately
source ~/.bash_profile
# test
$ scala -version
3. Build Spark
cd /home
tar -zxf spark-0.7.3-sources.gz
cd spark-0.7.3
sbt/sbt package (requires a git environment: yum install git)
Alternatively, download the prebuilt spark-1.0.2-bin-hadoop2.tgz
4. Configuration files
spark-env.sh
############
export SCALA_HOME=/usr/lib/scala-2.10.*
export SPARK_MASTER_IP=172.16.48.202
export SPARK_WORKER_MEMORY=10G
export JAVA_HOME=***
#############
slaves
Add the IP addresses of the worker (slave) nodes to the slaves configuration file.
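As a sketch, conf/slaves simply lists one worker host per line. The IPs below are illustrative placeholders for this cluster, not actual values from the original setup:
172.16.48.203
172.16.48.204
172.16.48.205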
5. Start and stop
bin/start-master.sh - Starts a master instance on the machine the script is executed on.
bin/start-slaves.sh - Starts a slave instance on each machine specified in the conf/slaves file.
bin/start-all.sh - Starts both a master and a number of slaves as described above.
bin/stop-master.sh - Stops the master that was started via the bin/start-master.sh script.
bin/stop-slaves.sh - Stops the slave instances that were started via bin/start-slaves.sh.
bin/stop-all.sh - Stops both the master and the slaves as described above.
6. Browse the master's web UI (default http://localhost:8080). Here you should see all the worker nodes, along with their CPU counts, memory, and other information.
7. Test:
Connect to Spark: spark-shell --master spark://192.168.148.42:7077
Enter the following commands:
val file = sc.textFile(" ")
val info = file.filter(line => line.contains("INFO"))
info.count()
Command-line test:
spark-submit --master spark://192.168.148.42:7077 examples/src/main/python/pi.py 10
Program test:
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.api.java.function.Function;

public class AdminOperation {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("atest").setMaster("spark://192.168.148.42:7077");
        JavaSparkContext sc = new JavaSparkContext(conf);
        JavaRDD<String> file = sc.textFile("hdfs://huangcun-hbase1:8020/test/test.txt");
        JavaRDD<String> errors = file.filter(new Function<String, Boolean>() {
            public Boolean call(String s) {
                return s.contains("ERROR");
            }
        });
        // Count all the errors
        errors.count();
    }
}
spark-submit --master spark://192.168.148.42:7077 --class AdminOperation spark-test.jar
(Note: options such as --class must come before the application jar; anything after the jar is passed to the application as arguments.)
Additional submit parameters for reference:
spark-submit --master spark://BJJR-FANWEIWE1.360buyAD.local:7077 --class AdminOperation --executor-memory 20G --total-executor-cores 100 --jars <dependency jar files> C:\Users\Administrator.BJXX-20140806JH\spark-test.jar
Original post: http://www.cnblogs.com/fanweiwei/p/4172136.html