Tags: tez
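Tez delivers a clear performance improvement for the MRR and MPJ computation models. The tests are as follows:
Test tables
1. users(id,name,password): 10 million records in total;
2. peoples(id,name,gender,address): 10 million records in total;
3. gender_summary(gender,count)
4. address_summary(address,count)
Test statement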
FROM (
    SELECT u.username, p.sex, p.address
    FROM users u
    JOIN peoples p ON u.userid = p.id
) subql
INSERT OVERWRITE TABLE gender_summary
    SELECT subql.sex, count(*)
    GROUP BY subql.sex
INSERT OVERWRITE TABLE address_summary
    SELECT subql.address, count(*)
    GROUP BY subql.address;
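To compare the two engines, the same statement has to be run once under each. A minimal sketch, assuming a Hive CLI or Beeline session on a cluster where Tez is installed. `hive.execution.engine` is a standard Hive property, but the post does not say how the author switched engines, so this is only one possible way:

```sql
-- Baseline: run on classic MapReduce
SET hive.execution.engine=mr;
-- (run the test statement here and note the "Time taken" line)

-- Comparison: run the same statement on Tez
SET hive.execution.engine=tez;
-- (run the test statement again and note the "Time taken" line)
```

The setting applies only to the current session, so each run can be repeated several times before switching.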
The DAG (directed acyclic graph) for this statement is as follows:
Results
Running on MapReduce
MapReduce Jobs Launched:
Stage-Stage-2: Map: 2  Reduce: 3   Cumulative CPU: 220.78 sec
Stage-Stage-3: Map: 1  Reduce: 1   Cumulative CPU: 4.23 sec
Stage-Stage-4: Map: 1  Reduce: 1   Cumulative CPU: 4.08 sec
Total MapReduce CPU Time Spent: 3 minutes 49 seconds 90 msec
Time taken: 186.853 seconds

Three runs took 186.853, 188.748, and 191.812 seconds, an average of 189.14 seconds.
Running on Tez
--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1 ..........   SUCCEEDED      5          5        0        0       0       0
Map 5 ..........   SUCCEEDED      6          6        0        0       0       0
Reducer 2 ......   SUCCEEDED      2          2        0        0       0       0
Reducer 3 ......   SUCCEEDED      1          1        0        0       0       0
Reducer 4 ......   SUCCEEDED      1          1        0        0       0       0
--------------------------------------------------------------------------------
VERTICES: 05/05  [==========================>>] 100%  ELAPSED TIME: 56.23 s
--------------------------------------------------------------------------------
Time taken: 60.348 seconds

Three runs took 60.348, 60.441, and 61.311 seconds, an average of 60.7 seconds.
In elapsed time, Tez was roughly 3.1 times as fast as MapReduce (about 189 s versus about 61 s on average).
Test tables
1. users(id,name,password): 10 million records in total;
2. peoples(id,name,gender,address): 10 million records in total;
3. permissions(userid,name)
Test statement
SELECT u.userid, p.name, q.name
FROM users u
JOIN peoples p ON u.userid = p.id
JOIN permissions q ON p.id = q.userId;
The DAG (directed acyclic graph) for this statement is as follows:
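One way to inspect the DAG that Hive plans for this query is to prefix it with `EXPLAIN` while Tez is the active engine. This is a sketch under the assumption that the tables above exist in the current database. The post does not say how the author produced the graph, and the exact plan text varies by Hive version:

```sql
-- Plan the query with Tez as the execution engine
SET hive.execution.engine=tez;

-- Print the execution plan instead of running the query
EXPLAIN
SELECT u.userid, p.name, q.name
FROM users u
JOIN peoples p ON u.userid = p.id
JOIN permissions q ON p.id = q.userId;
```

With Tez enabled, the plan output should list the vertices (such as Map 1 and Reducer 2) and the edges between them, which can be matched against the vertex names in the run log.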
Results
Running on MapReduce
MapReduce Jobs Launched:
Stage-Stage-1: Map: 3  Reduce: 3   Cumulative CPU: 177.33 sec
Total MapReduce CPU Time Spent: 2 minutes 57 seconds 330 msec
OK
Time taken: 104.208 seconds, Fetched: 5 row(s)

Three runs took 104.208, 102.146, and 103.537 seconds, an average of 103.297 seconds.
Running on Tez
--------------------------------------------------------------------------------
        VERTICES      STATUS  TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
--------------------------------------------------------------------------------
Map 1 ..........   SUCCEEDED      5          5        0        0       0       0
Map 3 ..........   SUCCEEDED      6          6        0        0       0       0
Map 4 ..........   SUCCEEDED      1          1        0        0       0       0
Reducer 2 ......   SUCCEEDED      2          2        0        0       0       0
--------------------------------------------------------------------------------
VERTICES: 04/04  [==========================>>] 100%  ELAPSED TIME: 47.50 s
--------------------------------------------------------------------------------
OK
Time taken: 49.143 seconds, Fetched: 5 row(s)

Three runs took 49.143, 47.284, and 48.578 seconds, an average of 48.335 seconds.
In elapsed time, Tez was roughly 2.1 times as fast as MapReduce (about 103 s versus about 48 s on average).
Original article: http://blog.csdn.net/javaman_chen/article/details/46980031