Hive Basics: Creating Tables
CREATE [EXTERNAL] TABLE [IF NOT EXISTS] [db_name.]table_name
[(col_name data_type [COMMENT col_comment], ...)]
[PARTITIONED BY (col_name data_type [COMMENT col_comment], ...)]
[CLUSTERED BY (col_name, col_name2, ...) [SORTED BY (col_name [ASC|DESC], ...)] INTO num_buckets BUCKETS]
[
[ROW FORMAT row_format] [STORED AS file_format]
| STORED BY 'storage.handler.class.name' [WITH SERDEPROPERTIES (...)]
]
[LOCATION hdfs_path]
[AS select_statement]
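EXTERNAL: creates an external table; without this keyword the table is an internal (managed) table. Dropping an external table removes only its metadata and leaves the data files in place.
PARTITIONED BY: partitions the table by the listed columns.
CLUSTERED BY: distributes rows into buckets by the listed columns, so rows with the same values in those columns land in the same bucket (and are handled by the same reducer when the table is written).
BUCKETS: the number of buckets; each row is assigned to a bucket by the hash of its CLUSTERED BY columns.
LOCATION: the HDFS path where the table's files are stored.
AS: creates the table and fills it with the result of the query.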
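The clauses above can be combined in a single statement. The following is only a sketch: the table names (page_views, page_views_20200101), the columns, the bucket count, and the HDFS path are hypothetical and chosen purely for illustration.

```sql
-- Hypothetical external table: partitioned by date, bucketed by user id.
CREATE EXTERNAL TABLE IF NOT EXISTS page_views (
  user_id BIGINT COMMENT 'visitor id',
  url     STRING COMMENT 'requested page'
)
PARTITIONED BY (dt STRING COMMENT 'date of the visit')
CLUSTERED BY (user_id) SORTED BY (user_id ASC) INTO 8 BUCKETS
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/data/page_views';

-- AS: create a managed table directly from a query result.
CREATE TABLE page_views_20200101 AS
SELECT user_id, url
FROM page_views
WHERE dt = '2020-01-01';
```

Because page_views is EXTERNAL, dropping it should leave the files under its LOCATION untouched, while dropping page_views_20200101 removes its data as well.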
Example:
CREATE TABLE IF NOT EXISTS employees (
  name         string,
  salary       float,
  subordinates array<string>,
  deductions   map<string,float>,
  address      struct<street:string,city:string,state:string,zip:int>
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'
  COLLECTION ITEMS TERMINATED BY ','
  MAP KEYS TERMINATED BY ':'
  LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/data/';
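To see how the delimiters map onto the complex column types, here is a rough sketch of one input row and a query against the employees table. The sample values and the local file path /tmp/employees.txt are made up for illustration; <TAB> stands for a literal tab character.

```sql
-- One input line (fields split by tab, collection items by ',', map keys by ':'):
-- John Doe<TAB>100000.0<TAB>Mary Smith,Todd Jones<TAB>Federal Taxes:0.2,State Taxes:0.05<TAB>1 Michigan Ave.,Chicago,IL,60600

-- Hypothetical local file; files already placed under the table's LOCATION are read without this step.
LOAD DATA LOCAL INPATH '/tmp/employees.txt' INTO TABLE employees;

SELECT name,
       subordinates[0]             AS first_subordinate,  -- array element by index
       size(subordinates)          AS num_subordinates,   -- array length
       deductions['Federal Taxes'] AS federal_tax,        -- map value by key
       address.city                AS city                -- struct field by name
FROM employees;
```

Array elements are read by index, map values by key, and struct fields with dot notation, so the single text line above is split into nested values by the three delimiters declared in the table definition.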