Perform the following steps on every node (or complete them on one node and then scp the result to the other nodes; see the scp example at the end of step 2):
1. Unpack the Spark distribution into the program directory /bigdata/soft/spark-1.4.1; this directory is referred to as $SPARK_HOME below.
tar -zxvf spark-1.4.1-bin-hadoop2.6.tgz
2. Configure Spark
Edit the environment file: vi $SPARK_HOME/conf/spark-env.sh (if it does not exist yet, create it from the template first: cp spark-env.sh.template spark-env.sh)
### Add the following lines:
export JAVA_HOME=/bigdata/soft/jdk1.7.0_79
export SCALA_HOME=/bigdata/soft/scala-2.10.5
export HADOOP_CONF_DIR=/bigdata/soft/hadoop-2.6.0/etc/hadoop
export SPARK_MASTER_IP=cloud-001
#export SPARK_MASTER_PORT=7077
export SPARK_WORKER_MEMORY=1g
export SPARK_WORKER_CORES=1
export SPARK_WORKER_INSTANCES=1
export SPARK_CLASSPATH=$SPARK_CLASSPATH:/bigdata/soft/spark-1.4.1/lib/mysql-connector-java-5.1.31.jar
Edit the slaves file: vi $SPARK_HOME/conf/slaves
## List the worker (slave) nodes of your cluster:
cloud-002
cloud-003
Edit the defaults file: vi $SPARK_HOME/conf/spark-defaults.conf
## First create the Spark event-log directory on HDFS:
$HADOOP_HOME/bin/hadoop fs -mkdir /applogs
$HADOOP_HOME/bin/hadoop fs -mkdir /applogs/spark
## Copy the template to create the config file:
cp spark-defaults.conf.template spark-defaults.conf
## Uncomment and edit the following lines:
spark.master spark://cloud-001:7077
spark.eventLog.enabled true
spark.eventLog.dir hdfs://cloud-001:8020/applogs/spark
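To confirm the event-log directory referenced above actually exists on HDFS, a quick check (assuming $HADOOP_HOME is set as in the earlier commands) is:
$HADOOP_HOME/bin/hadoop fs -ls /applogs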
Edit the Hive config: vi $SPARK_HOME/conf/hive-site.xml
### The contents are essentially the same as Hive's own configuration; see below:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>javax.jdo.option.ConnectionURL</name>
    <value>jdbc:mysql://localhost:3306/hive_1_2_0?createDatabaseIfNotExist=true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionDriverName</name>
    <value>com.mysql.jdbc.Driver</value>
  </property>
  <property>
    <name>javax.jdo.PersistenceManagerFactoryClass</name>
    <value>org.datanucleus.api.jdo.JDOPersistenceManagerFactory</value>
  </property>
  <property>
    <name>javax.jdo.option.DetachAllOnCommit</name>
    <value>true</value>
  </property>
  <property>
    <name>javax.jdo.option.NonTransactionalRead</name>
    <value>true</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionUserName</name>
    <value>root</value>
  </property>
  <property>
    <name>javax.jdo.option.ConnectionPassword</name>
    <value>0p;/9ol.</value>
  </property>
  <property>
    <name>javax.jdo.option.Multithreaded</name>
    <value>true</value>
  </property>
  <property>
    <name>datanucleus.connectionPoolingType</name>
    <value>BoneCP</value>
  </property>
  <property>
    <name>hive.metastore.warehouse.dir</name>
    <value>/user/hive/warehouse</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://cloud-001:8020</value>
  </property>
  <property>
    <name>hive.server2.thrift.port</name>
    <value>10000</value>
  </property>
  <property>
    <name>hive.server2.thrift.bind.host</name>
    <value>cloud-001</value>
  </property>
</configuration>
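As a quick sanity check of the metastore settings above, you can verify that MySQL is reachable with the configured account (a sketch, assuming the mysql client is installed on cloud-001; the hive_1_2_0 database is created on first use thanks to createDatabaseIfNotExist=true):
mysql -h localhost -u root -p -e "SHOW DATABASES;"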
Copy a MySQL JDBC driver into $SPARK_HOME/lib, e.g.:
cp $HIVE_HOME/lib/mysql-connector-java-5.1.31.jar $SPARK_HOME/lib
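If you did all of the above on a single node, push the finished directory to the other nodes before starting the cluster. A sketch, assuming the same /bigdata/soft layout on every node (hostnames taken from the slaves file):
scp -r /bigdata/soft/spark-1.4.1 cloud-002:/bigdata/soft/
scp -r /bigdata/soft/spark-1.4.1 cloud-003:/bigdata/soft/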
3. Start the cluster in standalone mode
Start the master and the workers:
$SPARK_HOME/sbin/start-all.sh
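To verify the daemons came up, one simple check (a sketch) is to look for the Master/Worker JVMs on each node and open the master web UI at http://cloud-001:8080 (the standalone master's default UI port):
jps    # expect Master on cloud-001, Worker on cloud-002 and cloud-003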
Start Spark's Hive (Thrift JDBC/ODBC) server:
$SPARK_HOME/sbin/start-thriftserver.sh --master spark://cloud-001:7077 --driver-memory 1g --executor-memory 1g --total-executor-cores 2
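To confirm the Thrift server is listening on the port configured in hive-site.xml (10000), a quick check on cloud-001 might be the following; if it is not, look at the server log written under $SPARK_HOME/logs:
netstat -ntlp | grep 10000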
4. Test
Test spark-shell:
$SPARK_HOME/bin/spark-shell --master spark://cloud-001:7077 --driver-memory 1g --executor-memory 1g --total-executor-cores 2
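At the scala> prompt, a minimal smoke test is to run a small job against the cluster; as a sketch, the same one-liner can also be piped in non-interactively:
echo "sc.parallelize(1 to 1000).count()" | $SPARK_HOME/bin/spark-shell --master spark://cloud-001:7077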
Test spark-sql:
$SPARK_HOME/bin/spark-sql --master spark://cloud-001:7077 --driver-memory 1g --executor-memory 1g --total-executor-cores 2
or
$SPARK_HOME/bin/beeline -u jdbc:hive2://cloud-001:10000 -n hadoop
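Either client can also be exercised non-interactively; for example (a sketch; on a fresh metastore this simply lists the default database):
$SPARK_HOME/bin/spark-sql --master spark://cloud-001:7077 -e "show databases;"
$SPARK_HOME/bin/beeline -u jdbc:hive2://cloud-001:10000 -n hadoop -e "show databases;"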
Source: http://my.oschina.net/yyflyons/blog/491545