V. Comparing Hue and Zeppelin

Hostname | Running Processes
--- | ---
nbidc-agent-03 | NameNode, Spark Master
nbidc-agent-04 | SecondaryNameNode
nbidc-agent-11 | ResourceManager, DataNode, NodeManager, Spark Worker
nbidc-agent-12 | DataNode, NodeManager, Spark Worker
nbidc-agent-13 | DataNode, NodeManager, Spark Worker
nbidc-agent-14 | DataNode, NodeManager, Spark Worker
nbidc-agent-15 | DataNode, NodeManager, Spark Worker
nbidc-agent-18 | DataNode, NodeManager, Spark Worker
nbidc-agent-19 | DataNode, NodeManager, Spark Worker
nbidc-agent-20 | DataNode, NodeManager, Spark Worker
nbidc-agent-21 | DataNode, NodeManager, Spark Worker
nbidc-agent-22 | DataNode, NodeManager, Spark Worker
# Install build dependencies, remove the system git, then build git 2.8.1 from source
yum install curl-devel expat-devel gettext-devel openssl-devel zlib-devel
yum install gcc perl-ExtUtils-MakeMaker
yum remove git
cd /home/work/tools/
wget https://github.com/git/git/archive/v2.8.1.tar.gz
tar -zxvf v2.8.1.tar.gz
cd git-2.8.1
make prefix=/home/work/tools/git all
make prefix=/home/work/tools/git install
scp -r jdk1.7.0_75 nbidc-agent-04:/home/work/tools/
cd /home/work/tools/
wget ftp://mirror.reverse.net/pub/apache/maven/maven-3/3.3.9/binaries/apache-maven-3.3.9-bin.tar.gz
tar -zxvf apache-maven-3.3.9-bin.tar.gz
scp -r hadoop nbidc-agent-04:/home/work/tools/
scp -r spark nbidc-agent-04:/home/work/tools/
scp -r hive nbidc-agent-04:/home/work/tools/
cd /home/work/tools/
tar -jxvf phantomjs-2.1.1-linux-x86_64.tar.bz2
cd /home/work/tools/
git clone https://github.com/apache/incubator-zeppelin.git
vi /home/work/.bashrc
# Add the following lines
export PATH=.:$PATH:/home/work/tools/jdk1.7.0_75/bin:/home/work/tools/hadoop/bin:/home/work/tools/spark/bin:/home/work/tools/hive/bin:/home/work/tools/phantomjs-2.1.1-linux-x86_64/bin:/home/work/tools/incubator-zeppelin/bin
export JAVA_HOME=/home/work/tools/jdk1.7.0_75
export HADOOP_HOME=/home/work/tools/hadoop
export SPARK_HOME=/home/work/tools/spark
export HIVE_HOME=/home/work/tools/hive
export ZEPPELIN_HOME=/home/work/tools/incubator-zeppelin
# Save the file, then make the settings take effect
source /home/work/.bashrc
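Before relying on the new PATH entries, it can help to confirm that every tool directory referenced in .bashrc actually exists. This is a quick sanity check added here, not part of the original steps; the paths are the ones used above.

```shell
# Sketch: check each tool directory added to PATH in .bashrc.
# Prints OK or MISSING per directory; adjust the list if your layout differs.
for d in /home/work/tools/jdk1.7.0_75 /home/work/tools/hadoop \
         /home/work/tools/spark /home/work/tools/hive \
         /home/work/tools/phantomjs-2.1.1-linux-x86_64 \
         /home/work/tools/incubator-zeppelin; do
  if [ -d "$d" ]; then echo "OK: $d"; else echo "MISSING: $d"; fi
done
```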
cd /home/work/tools/incubator-zeppelin
mvn clean package -Pspark-1.6 -Dspark.version=1.6.0 -Dhadoop.version=2.7.0 -Phadoop-2.6 -Pyarn -DskipTests
cp /home/work/tools/incubator-zeppelin/conf/zeppelin-env.sh.template /home/work/tools/incubator-zeppelin/conf/zeppelin-env.sh
vi /home/work/tools/incubator-zeppelin/conf/zeppelin-env.sh
# Add the following lines
export JAVA_HOME=/home/work/tools/jdk1.7.0_75
export HADOOP_CONF_DIR=/home/work/tools/hadoop/etc/hadoop
export MASTER=spark://nbidc-agent-03:7077
cp /home/work/tools/incubator-zeppelin/conf/zeppelin-site.xml.template /home/work/tools/incubator-zeppelin/conf/zeppelin-site.xml
vi /home/work/tools/incubator-zeppelin/conf/zeppelin-site.xml
# Change the value in the block below to set Zeppelin's port to 9090
<property>
  <name>zeppelin.server.port</name>
  <value>9090</value>
  <description>Server port.</description>
</property>
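To confirm the edit took effect, the configured port can be pulled back out of the property block with grep. The sketch below is self-contained (it writes a temporary copy of the property shown above); in practice, point the grep at conf/zeppelin-site.xml instead.

```shell
# Self-contained sketch: extract the configured port from a property block
# shaped like the one above (in practice, grep conf/zeppelin-site.xml).
conf=$(mktemp)
cat > "$conf" <<'EOF'
<property>
  <name>zeppelin.server.port</name>
  <value>9090</value>
  <description>Server port.</description>
</property>
EOF
# -A1 keeps the line after the <name> match, i.e. the <value> line
port=$(grep -A1 '<name>zeppelin.server.port</name>' "$conf" | grep -o '[0-9]\+')
echo "Zeppelin port: $port"   # prints: Zeppelin port: 9090
rm -f "$conf"
```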
cd /home/work/tools/incubator-zeppelin
cp /home/work/tools/hive/conf/hive-site.xml .
zeppelin-daemon.sh start

(5) Test
%sql
select * from wxy.t1 where rate > ${r}

The first line sets the interpreter to SparkSQL; the second uses ${r} to declare a runtime parameter. When the paragraph runs, a text box appears on the page; enter a value and press Enter, and the query runs with that parameter. As shown in the figure, the query returns the records with rate > 100.
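As an aside not in the original steps: Zeppelin's dynamic-form syntax also accepts a default value after an equals sign, so the paragraph can run immediately with that value before the user types anything.

```sql
%sql
select * from wxy.t1 where rate > ${r=100}
```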
cd /home/work/tools/
git clone https://github.com/jiekechoo/zeppelin-interpreter-mysql
# build inside the cloned directory
cd zeppelin-interpreter-mysql
mvn clean package

(2) Deploy the binary package
mkdir /home/work/tools/incubator-zeppelin/interpreter/mysql
cp /home/work/tools/zeppelin-interpreter-mysql/target/zeppelin-mysql-0.5.0-incubating.jar /home/work/tools/incubator-zeppelin/interpreter/mysql/
# copy dependencies to mysql directory
cp commons-exec-1.1.jar mysql-connector-java-5.1.6.jar slf4j-log4j12-1.7.10.jar log4j-1.2.17.jar slf4j-api-1.7.10.jar /home/work/tools/incubator-zeppelin/interpreter/mysql/
vi /home/work/tools/incubator-zeppelin/conf/zeppelin-site.xml

Append ",org.apache.zeppelin.mysql.MysqlInterpreter" to the value of zeppelin.interpreters, as shown in the figure below.
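For reference, the edited property ends up shaped roughly like this. The existing interpreter list is abbreviated here with "..." since the full default list is not shown in the original; only the trailing MySQL entry is the addition.

```xml
<property>
  <name>zeppelin.interpreters</name>
  <!-- existing interpreters abbreviated; the MySQL entry is appended at the end -->
  <value>org.apache.zeppelin.spark.SparkInterpreter,...,org.apache.zeppelin.mysql.MysqlInterpreter</value>
</property>
```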
zeppelin-daemon.sh restart

(4) Load the MySQL Interpreter
Open the home page at http://nbidc-agent-04:9090/, click 'Interpreter' -> 'Create', fill in a page like the one shown below, then click 'Save'.
%mysql
select date_format(create_time,'%Y-%m-%d') d, count(*) c from information_schema.tables group by date_format(create_time,'%Y-%m-%d') order by d;
The table view of the query result is shown below.
The bar-chart view of the query result is shown below.
The pie-chart view of the query result is shown below.
The stacked-chart view of the query result is shown below.
The line-chart view of the query result is shown below.
The scatter-plot view of the query result is shown below.
The pie chart in report mode is shown below.
select date_format(create_time,'%Y-%m-%d') d, count(*) c from information_schema.tables where create_time > '2016-06-07' group by date_format(create_time,'%Y-%m-%d') order by d;

With the where clause added, running the query again gives the result shown below.
Data Warehouse Practice on the Hadoop Ecosystem: OLAP and Data Visualization (Part 5)
Original article: http://blog.csdn.net/wzy0623/article/details/52370045