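Ant Forest case study: background
user_low_carbon.txt records each user's carbon-emission reductions by date
Sample data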
u_001 2017/1/1 10
u_001 2017/1/2 150
u_001 2017/1/2 110
plant_carbon.txt records the carbon-emission reduction required to claim each eco-friendly plant
Sample data
p001 梭梭树 17
p002 沙柳 19
p003 樟子树 146
p004 胡杨 215
Table: user_low_carbon
Fields
user_id: user ID
data_dt: date
low_carbon: carbon-emission reduction (g)
Table: plant_carbon
Fields
plant_id: plant ID
plant_name: plant name
low_carbon: carbon required to redeem the plant
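Create the tables
hive (default)> create table user_low_carbon(user_id String,
data_dt String,
low_carbon int
)
row format delimited fields terminated by '\t';
-- Assumed DDL for plant_carbon (not shown in the source), inferred from its field list
hive (default)> create table plant_carbon(plant_id String,
plant_name String,
low_carbon int
)
row format delimited fields terminated by '\t';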
Load the data
load data local inpath "/opt/module/data/user_low_carbon.txt" into table user_low_carbon;
load data local inpath "/opt/module/data/plant_carbon.txt" into table plant_carbon;
Enable local mode
hive (default)> set hive.exec.mode.local.auto=true;
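Problem: assume low-carbon data (user_low_carbon) has been recorded since January 1, 2017, and that every user who met the requirement before October 1, 2017 claimed one p004 (胡杨, Euphrates poplar) and spent all remaining energy on p002 (沙柳, sand willow). Find the top 10 users by cumulative number of p002 claimed as of October 1, and how many more p002 each of them claimed than the user ranked immediately below.
First, total each user's carbon reduction before October 1, 2017: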
hive (default)> select user_id, sum(low_carbon) sum_carbon
from user_low_carbon
where date_format(regexp_replace(data_dt, '/', '-'), 'yyyy-MM-dd') < '2017-10-01'
group by user_id;
Output:
user_id sum_carbon
u_001 475
u_002 659
u_003 620
u_004 640
u_005 1100
u_006 830
u_007 1470
u_008 1240
u_009 930
u_010 1080
u_011 960
u_012 250
u_013 1430
u_014 1060
u_015 290
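The date handling is the subtle part of this step: data_dt is stored as text such as 2017/1/2, and comparing unpadded date strings misorders months, which is presumably why the query zero-pads the date before comparing it with the cutoff. The following is a minimal Python sketch of the same filter-and-sum logic, not Hive code. The helper names normalize and sum_before are made up for illustration, and the rows are only the three sample rows shown earlier, not the full file.

```python
from collections import defaultdict
from datetime import datetime

CUTOFF = "2017-10-01"


def normalize(data_dt):
    """Turn an unpadded date such as '2017/1/2' into '2017-01-02'."""
    return datetime.strptime(data_dt, "%Y/%m/%d").strftime("%Y-%m-%d")


def sum_before(rows, cutoff=CUTOFF):
    """Sum low_carbon per user for rows dated strictly before the cutoff."""
    totals = defaultdict(int)
    for user_id, data_dt, low_carbon in rows:
        if normalize(data_dt) < cutoff:
            totals[user_id] += low_carbon
    return dict(totals)


# Only the three sample rows from the text (hypothetical subset of the file).
sample_rows = [
    ("u_001", "2017/1/1", 10),
    ("u_001", "2017/1/2", 150),
    ("u_001", "2017/1/2", 110),
]

# Without zero-padding, '2' > '1' makes February look later than October.
print("2017-2-1" < CUTOFF)             # False
print(normalize("2017/2/1") < CUTOFF)  # True
print(sum_before(sample_rows))         # {'u_001': 270}
```

Once every date is zero-padded to yyyy-MM-dd, plain string comparison agrees with chronological order, so no date type is needed for the filter.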
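Next, look up the carbon cost of p004 (胡杨) and p002 (沙柳); with the sample data these are 215 and 19:
hive (default)> select low_carbon from plant_carbon where plant_id='p004';
hive (default)> select low_carbon from plant_carbon where plant_id='p002';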
After one p004 is deducted, the remaining carbon divided by the cost of one p002, rounded down, gives the number of p002 each user claimed:
hive (default)> select user_id,
floor((t1.sum_carbon - t2.low_carbon) / t3.low_carbon) count_p002
from (
select user_id, sum(low_carbon) sum_carbon
from user_low_carbon
where date_format(regexp_replace(data_dt, '/', '-'), 'yyyy-MM-dd') < '2017-10-01'
group by user_id
) t1,
(
select low_carbon
from plant_carbon
where plant_id = 'p004'
) t2,
(
select low_carbon
from plant_carbon
where plant_id = 'p002'
) t3;
Output:
user_id count_p002
u_001 13
u_002 23
u_003 21
u_004 22
u_005 46
u_006 32
u_007 66
u_008 53
u_009 37
u_010 45
u_011 39
u_012 1
u_013 63
u_014 44
u_015 3
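The count_p002 column can be checked by hand. Below is a small Python sketch of the same arithmetic, assuming the costs from the plant_carbon sample (215 for p004, 19 for p002); the function name count_p002 simply mirrors the column alias and is not part of Hive. It also assumes every user has at least 215, which holds for the totals listed earlier; a smaller total would yield a negative count.

```python
P004_COST = 215  # carbon needed for one p004, from the plant_carbon sample
P002_COST = 19   # carbon needed for one p002, from the plant_carbon sample


def count_p002(sum_carbon):
    """Number of p002 claimable after paying for one p004, rounded down."""
    return (sum_carbon - P004_COST) // P002_COST


# A few totals taken from the per-user sums listed earlier.
for user_id, total in [("u_001", 475), ("u_007", 1470), ("u_012", 250)]:
    print(user_id, count_p002(total))
# u_001 13
# u_007 66
# u_012 1
```

For u_001, 475 - 215 = 260 and 260 / 19 is about 13.68, which floors to 13, matching the query output.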
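Find the top 10 users by cumulative p002 (沙柳, sand willow) claimed as of October 1, pairing each count with the next-ranked user's count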
hive (default)> select user_id,
count_p002,
lead(count_p002, 1) over (order by count_p002 desc) lead_1_p002
from (
select user_id,
floor((t1.sum_carbon - t2.low_carbon) / t3.low_carbon) count_p002
from (
select user_id, sum(low_carbon) sum_carbon
from user_low_carbon
where date_format(regexp_replace(data_dt, '/', '-'), 'yyyy-MM-dd') < '2017-10-01'
group by user_id
) t1,
(
select low_carbon
from plant_carbon
where plant_id = 'p004'
) t2,
(
select low_carbon
from plant_carbon
where plant_id = 'p002'
) t3
) t4
order by count_p002 desc
limit 10;
Output:
user_id count_p002 lead_1_p002
u_007 66 63
u_013 63 53
u_008 53 46
u_005 46 45
u_010 45 44
u_014 44 39
u_011 39 37
u_009 37 32
u_006 32 23
u_002 23 22
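To make the lead_1_p002 column concrete, here is a rough Python emulation of what lead(count_p002, 1) over (order by count_p002 desc) produces. It is a sketch of the idea rather than of Hive's implementation; the helper lead is written for this illustration, and the counts are copied from the count_p002 output above.

```python
def lead(values, offset=1):
    """For each position, return the value `offset` rows later, or None."""
    return [
        values[i + offset] if i + offset < len(values) else None
        for i in range(len(values))
    ]


# count_p002 per user, copied from the earlier query output.
counts = {
    "u_001": 13, "u_002": 23, "u_003": 21, "u_004": 22, "u_005": 46,
    "u_006": 32, "u_007": 66, "u_008": 53, "u_009": 37, "u_010": 45,
    "u_011": 39, "u_012": 1, "u_013": 63, "u_014": 44, "u_015": 3,
}

# Order by count descending, then attach the next row's count to each row.
ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
next_counts = lead([count for _, count in ranked])
top10 = [(user, count, nxt) for (user, count), nxt in zip(ranked, next_counts)][:10]

for row in top10:
    print(*row)
```

The first row printed is u_007 66 63 and the tenth is u_002 23 22, in line with the output above.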
Finally, subtract the next-ranked count to get how many more p002 each user claimed than the user ranked immediately below:
hive (default)> select user_id,
count_p002,
(count_p002 - lead_1_p002) diff_count
from (
select user_id,
count_p002,
lead(count_p002, 1) over (order by count_p002 desc) lead_1_p002
from (
select user_id,
floor((t1.sum_carbon - t2.low_carbon) / t3.low_carbon) count_p002
from (
select user_id, sum(low_carbon) sum_carbon
from user_low_carbon
where date_format(regexp_replace(data_dt, '/', '-'), 'yyyy-MM-dd') < '2017-10-01'
group by user_id
) t1,
(
select low_carbon
from plant_carbon
where plant_id = 'p004'
) t2,
(
select low_carbon
from plant_carbon
where plant_id = 'p002'
) t3
) t4
order by count_p002 desc
limit 10
) t5
order by count_p002 desc;
Output:
user_id count_p002 diff_count
u_007 66 3
u_013 63 10
u_008 53 7
u_005 46 1
u_010 45 1
u_014 44 5
u_011 39 2
u_009 37 5
u_006 32 9
u_002 23 1
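One detail worth noting in this result is that the tenth row (u_002) still has a difference, because the window function runs over all 15 users before the limit is applied, so the 11th user's count is available. The Python sketch below contrasts the two orders of operation; the function name diffs is made up for this example, None stands in for SQL NULL, and the counts are the count_p002 values from earlier, sorted in descending order.

```python
def diffs(counts_desc):
    """Subtract each count's successor; the last row has no successor."""
    return [a - b for a, b in zip(counts_desc, counts_desc[1:])] + [None]


# All 15 count_p002 values in descending order.
all_counts = [66, 63, 53, 46, 45, 44, 39, 37, 32, 23, 22, 21, 13, 3, 1]

window_then_limit = diffs(all_counts)[:10]  # the order used by the query
limit_then_window = diffs(all_counts[:10])  # truncating first drops row 11

print(window_then_limit)  # [3, 10, 7, 1, 1, 5, 2, 5, 9, 1]
print(limit_then_window)  # [3, 10, 7, 1, 1, 5, 2, 5, 9, None]
```

Computing the lead value first and truncating afterwards is what keeps the last row's difference from being empty.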
Original post: https://www.cnblogs.com/eugene0/p/13296706.html