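This post shares some experience with serving data online.
Typical scenario: every day in the early morning, the data team computes user behavior data on Hive from historical data and pushes the result to HBase with a tool; business teams then fetch the user behavior data through an RPC service.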
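/**
 * Single lookup: returns the value for the given table name, key, family, and column.
 *
 * @param tableName table name
 * @param key       the key to look up
 * @param family    the family the column belongs to (by convention: "dim")
 * @param column    column name
 * @return the value of the specified column, or null if the column does not exist
 * @throws IOException if the remote call fails or a network error occurs
 */
public String query(String tableName, String key, String family, String column) throws IOException;

Lessons learned and issues: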
TableName | Family | Key | Columns |
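In HBase, these concepts map one-to-one onto HBase's own.
In Redis/Memcache:
TableName + Family + Key are combined to form the physical storage key
Columns are serialized as JSON
This gives a single query interface across all backends: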
/**
 * Single lookup: returns the values for the given table name, family, key, and column list.
 *
 * @param tableName  table name
 * @param family     the family the columns belong to (default: "dim")
 * @param key        the key to look up
 * @param columnList list of column names
 * @return the requested columns, in JSON format
 * @throws IOException if the remote call fails or a network error occurs
 */
public String query(String tableName, String family, String key, List<String> columnList) throws IOException;
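The Redis/Memcache mapping described above can be sketched roughly as follows. This is a minimal illustration, not the service's actual code: the class `KvProfileStore`, its in-memory `HashMap` stand-in for Redis/Memcache, and the hand-rolled flat JSON helpers are all hypothetical. The Redis example later in this post shows the physical key as a MurmurHash of tablename+family+key, but does not state the variant, seed, or encoding, so MurmurHash64A with seed 0 over UTF-8 bytes and plain concatenation without a separator are assumed here. Returning null for a missing row is also an assumption.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class KvProfileStore {
    // In-memory stand-in for Redis/Memcache: physical key -> JSON string.
    private final Map<String, String> kv = new HashMap<>();

    // MurmurHash64A (64-bit MurmurHash2); variant and seed are assumptions.
    static long murmurHash64A(byte[] data, long seed) {
        final long m = 0xc6a4a7935bd1e995L;
        final int r = 47;
        ByteBuffer buf = ByteBuffer.wrap(data).order(ByteOrder.LITTLE_ENDIAN);
        long h = seed ^ (data.length * m);
        while (buf.remaining() >= 8) {            // mix full 8-byte blocks
            long k = buf.getLong();
            k *= m;
            k ^= k >>> r;
            k *= m;
            h ^= k;
            h *= m;
        }
        if (buf.remaining() > 0) {                // mix the zero-padded tail
            ByteBuffer tail = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN);
            tail.put(buf);
            tail.rewind();
            h ^= tail.getLong();
            h *= m;
        }
        h ^= h >>> r;
        h *= m;
        h ^= h >>> r;
        return h;
    }

    // Physical key: hash of tableName + family + key, concatenated with no separator (assumed).
    static String physicalKey(String tableName, String family, String key) {
        byte[] bytes = (tableName + family + key).getBytes(StandardCharsets.UTF_8);
        return Long.toUnsignedString(murmurHash64A(bytes, 0L));
    }

    // Serializes a flat column map as a JSON object with string values.
    static String toJson(Map<String, String> columns) {
        StringBuilder sb = new StringBuilder("{");
        for (Map.Entry<String, String> e : columns.entrySet()) {
            if (sb.length() > 1) sb.append(',');
            sb.append(quote(e.getKey())).append(':').append(quote(e.getValue()));
        }
        return sb.append('}').toString();
    }

    private static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Parses only the flat {"k":"v",...} objects that toJson produces.
    static Map<String, String> fromJson(String json) {
        List<String> strings = new ArrayList<>();
        StringBuilder cur = null;                 // non-null while inside a string
        for (int i = 0; i < json.length(); i++) {
            char c = json.charAt(i);
            if (cur == null) {
                if (c == '"') cur = new StringBuilder();
            } else if (c == '\\') {
                cur.append(json.charAt(++i));     // keep the escaped character
            } else if (c == '"') {
                strings.add(cur.toString());
                cur = null;
            } else {
                cur.append(c);
            }
        }
        Map<String, String> out = new LinkedHashMap<>();
        for (int i = 0; i + 1 < strings.size(); i += 2) out.put(strings.get(i), strings.get(i + 1));
        return out;
    }

    // Write path: one physical key per logical row, all columns in one JSON value.
    void put(String tableName, String family, String key, Map<String, String> columns) {
        kv.put(physicalKey(tableName, family, key), toJson(columns));
    }

    // Read path with the same parameter order as the unified interface.
    String query(String tableName, String family, String key, List<String> columnList) {
        String json = kv.get(physicalKey(tableName, family, key));
        if (json == null) return null;            // assumed behavior for a missing row
        Map<String, String> all = fromJson(json);
        Map<String, String> picked = new LinkedHashMap<>();
        for (String c : columnList) if (all.containsKey(c)) picked.put(c, all.get(c));
        return toJson(picked);
    }
}
```

With this layout a lookup costs one key-value read regardless of how many columns are requested, because all columns of a row live in a single JSON value. Concatenating without a separator means `"ab" + "c"` and `"a" + "bc"` yield the same physical key, so a real implementation would probably want a delimiter.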
Example: a table in Hive:
HIVE > select * from bi.dprpt_user_city_profile_service limit 1;
OK
1_1 2010-09-02 2014-05-02 2010-09-02 2014-05-02 2011-06-01 2013-09-27 2010-06-08 2013-09-16 2012-08-29 2013-07-08 2 101 842
Stored in HBase as:
hbase(main):001:0> get 'bi.dprpt_user_city_profile_service','1_1'
COLUMN CELL
dim:first_app_tg_date timestamp=1400025620039, value=2012-08-29
dim:first_app_visit_date timestamp=1400025620039, value=2010-09-02
dim:first_tg_date timestamp=1400025620039, value=2010-06-08
dim:first_tg_visit_date timestamp=1400025620039, value=2011-06-01
dim:first_visit_date timestamp=1400025620039, value=2010-09-02
dim:last_app_tg_date timestamp=1400025620039, value=2013-07-08
dim:last_app_visit_date timestamp=1400025620039, value=2014-05-02
dim:last_tg_date timestamp=1400025620039, value=2013-09-16
dim:last_tg_visit_date timestamp=1400025620039, value=2013-09-27
dim:last_visit_date timestamp=1400025620039, value=2014-05-02
dim:prefer_tg_cat0 timestamp=1400025620039, value=2
dim:prefer_tg_cat1 timestamp=1400025620039, value=101
dim:prefer_tg_region timestamp=1400025620039, value=842
Stored in Redis/Memcache as (7036447591228069586 is the MurmurHash of tablename+family+key):
Redis > get 7036447591228069586
"{"prefer_tg_cat0":"2","last_tg_date":"2013-09-16","last_tg_visit_date":"2013-09-27","last_app_visit_date":"2014-05-02","first_app_tg_date":"2012-08-29","first_visit_date":"2010-09-02","last_app_tg_date":"2013-07-08","prefer_tg_cat1":"101","prefer_tg_region":"842","first_tg_date":"2010-06-08","last_visit_date":"2014-05-02","first_app_visit_date":"2010-09-02","first_tg_visit_date":"2011-06-01"}"
Storage | AVG | 95th percentile | 99.9th percentile |
HBase | 2.2 | 7.8 | 62.6 |
Memcache | 0.7 | 1.0 | 3.6 |
Redis | 0.3 | 0.6 | 1.2 |
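For readers unfamiliar with the column names, the following sketch shows one common way to derive an average and the 95th / 99.9th percentile from a set of measurements. The post does not say how its numbers were computed or in which unit; the nearest-rank method and the class name `LatencyStats` are assumptions made for illustration.

```java
import java.util.Arrays;

class LatencyStats {
    // Arithmetic mean of all samples.
    static double average(double[] samples) {
        double sum = 0;
        for (double s : samples) sum += s;
        return sum / samples.length;
    }

    // Nearest-rank percentile: the smallest sample such that at least
    // p percent of all samples are less than or equal to it.
    static double percentile(double[] samples, double p) {
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }
}
```

Calling `LatencyStats.percentile(samples, 99.9)` on a large sample set picks out the slow tail that an average hides, which is why the table lists tail percentiles next to AVG.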
Original article: http://blog.csdn.net/yfkiss/article/details/25688153