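This post covers three Hive file formats: TextFile, SequenceFile, and RCFile.
TextFile is the default format, used when a table is created without specifying one. Loading data into a TextFile table simply copies the data file to HDFS without processing it.
Tables stored as SequenceFile or RCFile cannot be loaded directly from a local file, because the load step does not convert the data. The data must first be loaded into a TextFile table,
and then written from that TextFile table into the SequenceFile or RCFile table with an insert, as in the following example: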
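-- Staging table in the default TextFile format, fields separated by '|'
create table zone0000tf(ra int, dec int, mag int) row format delimited fields terminated by '|';
-- Target table with the same columns, stored as RCFile
create table zone0000rc(ra int, dec int, mag int) row format delimited fields terminated by '|' stored as rcfile;
-- Copy the local file into the TextFile table
load data local inpath '/home/cq/usnoa/zone0000.asc' into table zone0000tf;
-- Rewrite the rows into RCFile format (this begins a job)
insert overwrite table zone0000rc select * from zone0000tf;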
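
The same two-step approach applies to SequenceFile tables. The sketch below is illustrative only: the table name zone0000seq is hypothetical, and it assumes the TextFile table zone0000tf has already been created and loaded.

```sql
-- Hypothetical table name; columns mirror zone0000tf.
create table zone0000seq(ra int, dec int, mag int) stored as sequencefile;

-- Rewrite the text rows into SequenceFile format (this also begins a job).
insert overwrite table zone0000seq select * from zone0000tf;

-- Print the table metadata, including its InputFormat and OutputFormat,
-- to confirm how the table is stored.
describe formatted zone0000seq;
```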
| File Format | TextFile | SequenceFile | RCFile |
|---|---|---|---|
| Data type | Text only | Text/Binary | Text/Binary |
| Internal storage order | Row-based | Row-based | Column-based |
| Compression | File-based | Block-based | Block-based |
| Splittable | Yes | Yes | Yes |
| Splittable after compression | No | Yes | Yes |
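
As a rough illustration of the compression row, output compression can be switched on for the session before the insert that populates the RCFile table. This is a sketch under assumptions: the property names shown are the older Hadoop ones and may differ in your Hadoop version, and the Gzip codec is just an example choice.

```sql
-- Ask Hive to compress the output of the queries that follow.
set hive.exec.compress.output=true;
-- Example codec; assumed to be available on the cluster.
set mapred.output.compression.codec=org.apache.hadoop.io.compress.GzipCodec;

-- Repopulate the RCFile table so its files are written compressed.
insert overwrite table zone0000rc select * from zone0000tf;
```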
Original source: http://www.cnblogs.com/liutoutou/p/3732148.html