 | orc | orc (split 110M) | parquet + snappy | parquet + gzip |
---|---|---|---|---|
spark-sql 1.4 | 2 min 7 s | 1 min 40 s | Parquet does not support decimal | Parquet does not support decimal |
spark-sql 1.6 | 1 min 30 s | approx. 1 min 4 s | approx. 1 min 4 s | approx. 1 min 4 s |
hive | 20 min | 18.5 min | approx. 20 min | approx. 20 min |
Storage size (multiple of raw) | 1 | 1 | 1.6 | 1 |
Keeping the memory allocated to spark-sql 1.6 fixed at 600G, tests were run on different data volumes:
 | 200G | 550G | 1.1T |
---|---|---|---|
spark-sql 1.4 | 11-12 min | | |
spark-sql 1.6 | 7-8 min | 22 min | 51 min |
hive | 15 min | 50 min | not tested (would need nearly 5T of memory) |
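
To make the scaling in the table above easier to read, here is a rough sketch that converts the spark-sql 1.6 timings into throughput. It assumes 7.5 min as the midpoint of the reported 7-8 min range and treats 1.1T as 1100 GB; both are simplifications, not figures from the test itself.

```python
# spark-sql 1.6 runs from the table: (data volume in GB, run time in minutes).
# Assumptions: 7.5 min is the midpoint of "7-8 min"; 1.1T is taken as 1100 GB.
runs = [(200, 7.5), (550, 22.0), (1100, 51.0)]


def throughput_gb_per_min(size_gb, minutes):
    """Return the average amount of data processed per minute."""
    return size_gb / minutes


rates = [round(throughput_gb_per_min(size, mins), 1) for size, mins in runs]
print(rates)  # [26.7, 25.0, 21.6]
```

Under these assumptions, throughput drops as the data volume grows while memory stays at 600G.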
3) 听单
 | time |
---|---|
spark-sql 1.6 | 190s |
hive | 1117s |
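
As a quick check on the two timings above, the speedup can be computed directly. This is only a minimal sketch; the helper name `speedup` is made up for illustration.

```python
def speedup(baseline_s, candidate_s):
    """Return how many times faster the candidate is than the baseline."""
    return baseline_s / candidate_s


# hive took 1117 s and spark-sql 1.6 took 190 s on this job.
ratio = speedup(1117, 190)
print(f"{ratio:.2f}x")  # 5.88x
```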
4)
III. Summary
Original article: http://www.cnblogs.com/shoudi/p/5564119.html