A Brief Guide to Using Hive

Date: 2015-04-14 11:12:51

I. Usage:

hive: starts the Hive command-line interface

Every command must end with a semicolon, which tells Hive to execute it immediately. Keywords are not case-sensitive.

show tables; lists the existing tables

desc tablename; lists the columns of a table
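The same commands can also be run without entering the interactive prompt. The following is a minimal sketch, assuming the hive CLI is on the PATH and that the database udw and the table udw_ml_user_action used later in this note exist on your cluster. The variable name STATEMENTS is only illustrative.

```shell
#!/bin/sh
# Statements to run; each one ends with a semicolon.
# Assumption: database udw and table udw_ml_user_action exist on your cluster.
STATEMENTS='use udw; show tables; desc udw_ml_user_action;'

if command -v hive >/dev/null 2>&1; then
    # -e executes the quoted statements and then exits.
    hive -e "$STATEMENTS"
else
    echo "hive not found on PATH; would have run: $STATEMENTS"
fi
```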

Write the SQL statements (saved here as test.sql):

use udw;

select user_id,action_id

from udw_ml_user_action

where partition_date>=20150410

distribute by user_id

sort by user_id,action_id

limit 10;

Run the SQL file: hive -f test.sql

Export the query results by redirecting the output to a file: hive -f test.sql > a.txt
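Putting the steps of this section together, a minimal sketch might look like the script below. It writes the query to test.sql, runs it, and saves the rows to a.txt. It assumes the hive CLI is on the PATH. The -S (silent) flag does not appear in the original commands and is added here only to suppress progress messages.

```shell
#!/bin/sh
# Write the query to a file; the quoted 'EOF' keeps the shell from expanding anything.
cat > test.sql <<'EOF'
use udw;
select user_id, action_id
from udw_ml_user_action
where partition_date >= 20150410
distribute by user_id
sort by user_id, action_id
limit 10;
EOF

if command -v hive >/dev/null 2>&1; then
    # Query results go to stdout, so the redirect captures only the rows.
    hive -S -f test.sql > a.txt
else
    echo "hive not found on PATH; test.sql was written but not run"
fi
```

The result file is plain text, one row per line, so it can be inspected with ordinary tools such as head or wc.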

II. Some clauses for querying data

order by: produces a total ordering, but does so with a single reducer, so it is inefficient on large datasets

sort by: produces one sorted file per reducer; rows are ordered within each reducer, not across all of them

distribute by: controls which reducer a given row is sent to; for example, distribute by year guarantees that all rows with the same year end up in the same reducer partition

group by: groups rows by the given field(s)
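To compare these clauses side by side, the sketch below writes three queries to a script file and runs it. The table and columns are those from section I. The file name clauses.sql and the alias action_count are hypothetical, and the script assumes the hive CLI is on the PATH.

```shell
#!/bin/sh
# clauses.sql is a hypothetical file name; the table and columns come from section I.
cat > clauses.sql <<'EOF'
use udw;

-- order by: one globally sorted result, produced by a single reducer.
select user_id, action_id
from udw_ml_user_action
where partition_date >= 20150410
order by user_id
limit 10;

-- distribute by + sort by: rows with the same user_id go to the same reducer,
-- and each reducer sorts only its own rows.
select user_id, action_id
from udw_ml_user_action
where partition_date >= 20150410
distribute by user_id
sort by user_id, action_id;

-- group by: one output row per user_id, here with a count of its rows.
select user_id, count(*) as action_count
from udw_ml_user_action
where partition_date >= 20150410
group by user_id;
EOF

if command -v hive >/dev/null 2>&1; then
    hive -f clauses.sql
else
    echo "hive not found on PATH; clauses.sql was written but not run"
fi
```

When a single globally ordered output is not required, the distribute by plus sort by form is usually the better fit for large tables, because the sorting work is spread across reducers.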

Original article: http://blog.csdn.net/eliza1130/article/details/45038777
