Expects one of [none, minimal, more].
Some SELECT queries can be converted to a single FETCH task, minimizing latency.
Currently the query must be single-sourced, with no subquery, and must not contain
aggregations or DISTINCT (which incur an RS, i.e. a ReduceSink), lateral views, or joins.
0. none : disable hive.fetch.task.conversion
1. minimal : SELECT STAR, FILTER on partition columns, LIMIT only
2. more : SELECT, FILTER, LIMIT only (support TABLESAMPLE and virtual columns)
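
As a rough illustration of the three levels, the sketch below sets the property for a session and runs queries of the shapes listed above. The table page_views, its partition column dt, and the other column names are hypothetical. Whether a given query is actually converted can also depend on the Hive version and other settings, so treat this as a sketch rather than a guarantee.

```sql
-- Hypothetical table: page_views(user_id, url), partitioned by dt.

set hive.fetch.task.conversion=minimal;
-- SELECT *, a filter on a partition column, and LIMIT: the "minimal" shape.
SELECT * FROM page_views WHERE dt = '2015-01-01' LIMIT 10;

set hive.fetch.task.conversion=more;
-- Column projection plus a filter on a regular column: the "more" shape.
SELECT user_id, url FROM page_views WHERE url LIKE '%/home' LIMIT 10;

-- An aggregation needs a reduce stage, so per the description above
-- it is not a candidate for conversion at any level.
SELECT url, count(*) FROM page_views GROUP BY url;
```

Prefixing a query with EXPLAIN should show whether its plan consists of a fetch stage only.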
<description>How many tasks to run per JVM. If set to -1, there is
no limit.</description>
<description>If true, then multiple instances of some map tasks
may be executed in parallel.</description>
<description>If true, then multiple instances of some reduce tasks
may be executed in parallel.</description>
<description>Whether speculative execution for reducers should be turned on.</description>
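
The descriptions above do not name their properties. Assuming they correspond to the classic Hadoop MRv1 and Hive properties with matching descriptions (an assumption, not something these notes state), a session could set them as follows. The values are only examples.

```sql
-- Assumed property names; values are illustrative only.

-- Number of tasks to run per JVM (-1 means no limit).
set mapred.job.reuse.jvm.num.tasks=10;

-- Speculative execution for map tasks and for reduce tasks.
set mapred.map.tasks.speculative.execution=true;
set mapred.reduce.tasks.speculative.execution=true;

-- Hive-level switch for speculative execution of reducers.
set hive.mapred.reduce.tasks.speculative.execution=true;
```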
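-- Enable automatic conversion to map joins (defaults to true).
set hive.auto.convert.join = true;
-- Size threshold, in bytes, below which a table counts as the small table (about 25 MB).
set hive.mapjoin.smalltable.filesize=25000000;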
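When the option is set to true, the generated query plan contains two MR jobs. In the first MR job, the map output is distributed randomly across the reducers; each reducer performs a partial aggregation and emits its result. Because rows with the same GROUP BY key may be sent to different reducers, the load is balanced. The second MR job then distributes the pre-aggregated results to the reducers by GROUP BY key (which guarantees that rows with the same GROUP BY key reach the same reducer) and completes the final aggregation.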
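
The behavior described above matches Hive's hive.groupby.skewindata property. That mapping is an assumption, since these notes do not name the option. A minimal sketch, using a hypothetical page_views table with a skewed user_id column:

```sql
-- Assumed property name; page_views and user_id are hypothetical.
set hive.groupby.skewindata=true;

-- With the option on, this GROUP BY is planned as two MR jobs:
-- job 1 spreads rows randomly across reducers and pre-aggregates,
-- job 2 distributes the partial results by user_id and finishes the count.
SELECT user_id, count(*) AS views
FROM page_views
GROUP BY user_id;
```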