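A reverse repo, put simply, works like this: you (A) lend money to someone else (B), and at maturity B repays you (A) the principal plus interest at the agreed rate. A reverse repo is itself regarded as essentially risk-free, much like a bank savings deposit. Yu'e Bao (余额宝), the Alibaba Finance product that is getting so much attention right now, pays a return roughly on par with reverse repos. We can guess that Yu'e Bao's funds are also placed in reverse repos, which keeps them liquid while still earning steady interest.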
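To make the "principal plus interest" arithmetic concrete, here is a minimal sketch. It assumes simple interest with no fees, and it takes the day-count basis as a parameter because conventions differ between markets and over time. The class name, method name, and sample figures are illustrative assumptions, not taken from any exchange rulebook.

```java
final class ReverseRepoInterest {

    /**
     * Amount repaid at maturity under simple interest.
     * Assumption: no fees; daysInYear is whatever day-count basis applies.
     */
    static double repayment(double principal, double annualRatePercent,
                            int days, int daysInYear) {
        double interest = principal * (annualRatePercent / 100.0) * days / daysInYear;
        return principal + interest;
    }

    public static void main(String[] args) {
        // Hypothetical deal: lend 100,000 for 1 day at 4.05% annualized, 365-day basis.
        double total = repayment(100_000.0, 4.05, 1, 365);
        System.out.printf("Repayment at maturity: %.2f%n", total);
    }
}
```

With these sample numbers the one-day interest is about 11.10 on a principal of 100,000, which shows why short-term reverse repos are mostly a cash-management tool rather than a high-yield investment.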






  tradedate: trade date

  tradetime: trade time

  stockid: stock ID

  buyprice: buy price

  buysize: buy quantity

  sellprice: sell price

  sellsize: sell quantity
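The field list above maps naturally onto a small value class. The sketch below parses one comma-separated line into those seven fields. It assumes the columns appear in the order listed and that a blank or non-numeric value should become null, which mirrors how Hive reads a field it cannot parse into the declared type as NULL. The class name StockTick and the sample line are hypothetical.

```java
final class StockTick {
    final String tradeDate;
    final String tradeTime;
    final String stockId;
    final Double buyPrice;   // null when the field is blank or not numeric
    final Integer buySize;
    final Double sellPrice;
    final Integer sellSize;

    private StockTick(String[] f) {
        tradeDate = f[0].trim();
        tradeTime = f[1].trim();
        stockId = f[2].trim();
        buyPrice = toDouble(f[3]);
        buySize = toInt(f[4]);
        sellPrice = toDouble(f[5]);
        sellSize = toInt(f[6]);
    }

    /** Parses one CSV line; assumes exactly seven comma-separated fields. */
    static StockTick parse(String line) {
        String[] fields = line.split(",", -1); // -1 keeps trailing empty fields
        if (fields.length != 7) {
            throw new IllegalArgumentException("expected 7 fields, got " + fields.length);
        }
        return new StockTick(fields);
    }

    private static Double toDouble(String s) {
        try {
            return Double.parseDouble(s.trim());
        } catch (NumberFormatException e) {
            return null;
        }
    }

    private static Integer toInt(String s) {
        try {
            return Integer.parseInt(s.trim());
        } catch (NumberFormatException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        // Made-up sample line with an empty sellprice field.
        StockTick t = StockTick.parse("20130722,095100,204001,4.05,100,,200");
        System.out.println(t.stockId + " buy=" + t.buyPrice + " sell=" + t.sellPrice);
    }
}
```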









        In Hive, create the stock table.

hive> create table if not exists stock(tradedate string, tradetime string, stockid string, buyprice double, buysize int, sellprice string, sellsize int) row format delimited fields terminated by ',' stored as textfile;
Time taken: 0.207 seconds
hive> desc stock;
tradedate               string                                      
tradetime               string                                      
stockid                 string                                      
buyprice                double                                      
buysize                 int                                         
sellprice               string                                      
sellsize                int                                         
Time taken: 0.147 seconds, Fetched: 7 row(s)



[hadoop@master bin]$ cd /home/hadoop/test/
[hadoop@master test]$ sudo rz
hive> load data local inpath '/home/hadoop/test/stock.csv' into table stock;


        Create the partitioned table stock_partition, using the trade date as the partition key.

hive> create table if not exists stock_partition(tradetime string, stockid string, buyprice double, buysize int, sellprice string, sellsize int) partitioned by (tradedate string) row format delimited fields terminated by ',';
Time taken: 0.112 seconds
hive> desc stock_partition;
tradetime               string                                      
stockid                 string                                      
buyprice                double                                      
buysize                 int                                         
sellprice               string                                      
sellsize                int                                         
tradedate               string                                      
# Partition Information          
# col_name                data_type               comment             
tradedate               string



hive> set hive.exec.dynamic.partition.mode=nonstrict;



hive> insert overwrite table stock_partition partition(tradedate) select tradetime, stockid, buyprice, buysize, sellprice, sellsize, tradedate from stock distribute by tradedate;
Query ID = hadoop_20180524122020_f7a1b61a-84ed-4487-a37e-64ef9c3abc5f
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks not specified. Estimated from input data size: 1
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1527103938304_0002, Tracking URL = http://master:8088/proxy/application_1527103938304_0002/
Kill Command = /opt/modules/hadoop-2.6.0/bin/hadoop job  -kill job_1527103938304_0002
Hadoop job information for Stage-1: number of mappers: 1; number of reducers: 1
2018-05-24 12:20:13,931 Stage-1 map = 0%,  reduce = 0%
2018-05-24 12:20:21,434 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 2.19 sec
2018-05-24 12:20:40,367 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 5.87 sec
MapReduce Total cumulative CPU time: 5 seconds 870 msec
Ended Job = job_1527103938304_0002
Loading data to table default.stock_partition partition (tradedate=null)
     Time taken for load dynamic partitions : 492
    Loading partition {tradedate=20130726}
    Loading partition {tradedate=20130725}
    Loading partition {tradedate=20130724}
    Loading partition {tradedate=20130723}
    Loading partition {tradedate=20130722}
     Time taken for adding to write entity : 6
Partition default.stock_partition{tradedate=20130722} stats: [numFiles=1, numRows=25882, totalSize=918169, rawDataSize=892287]
Partition default.stock_partition{tradedate=20130723} stats: [numFiles=1, numRows=26516, totalSize=938928, rawDataSize=912412]
Partition default.stock_partition{tradedate=20130724} stats: [numFiles=1, numRows=25700, totalSize=907048, rawDataSize=881348]
Partition default.stock_partition{tradedate=20130725} stats: [numFiles=1, numRows=20910, totalSize=740877, rawDataSize=719967]
Partition default.stock_partition{tradedate=20130726} stats: [numFiles=1, numRows=24574, totalSize=862861, rawDataSize=838287]
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 1  Reduce: 1   Cumulative CPU: 5.87 sec   HDFS Read: 5974664 HDFS Write: 4368260 SUCCESS
Total MapReduce CPU Time Spent: 5 seconds 870 msec
Time taken: 39.826 seconds
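Conceptually, the dynamic-partition insert above routes every row to a bucket chosen by its tradedate value and stores the remaining columns there. The sketch below imitates that routing in plain Java. It is only an illustration of the idea, not how Hive implements it. The class name and the sample rows are made up, and the "tradedate=<value>" key follows Hive's col=value naming for partition directories.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class DynamicPartitionSketch {

    /**
     * Groups CSV lines by their first field (tradedate) and keeps the
     * remaining fields as the stored row, like a partition column that
     * lives in the directory name instead of in the data file.
     */
    static Map<String, List<String>> partitionByTradeDate(List<String> csvLines) {
        Map<String, List<String>> partitions = new TreeMap<>();
        for (String line : csvLines) {
            int comma = line.indexOf(',');
            if (comma < 0) {
                continue; // skip lines without a delimiter
            }
            String key = "tradedate=" + line.substring(0, comma);
            String rest = line.substring(comma + 1);
            partitions.computeIfAbsent(key, k -> new ArrayList<>()).add(rest);
        }
        return partitions;
    }

    public static void main(String[] args) {
        // Hypothetical rows spanning two trade dates.
        List<String> rows = Arrays.asList(
                "20130722,095100,204001,4.05,100,4.06,200",
                "20130723,095100,204001,4.40,100,4.48,200",
                "20130722,095200,204001,4.00,300,4.02,100");
        partitionByTradeDate(rows).forEach(
                (partition, lines) -> System.out.println(partition + " -> " + lines.size() + " row(s)"));
    }
}
```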



        A custom Hive UDF, Max, that returns the larger of two values.

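package zimo.hadoop.hive;

import org.apache.hadoop.hive.ql.exec.UDF;

/**
 * @function custom UDF that returns the larger of two values
 * @author Zimo
 */
public class Max extends UDF {

    // A null argument is treated as 0.0 before comparing.
    public Double evaluate(Double a, Double b) {
        if (a == null)
            a = 0.0;
        if (b == null)
            b = 0.0;
        if (a >= b) {
            return a;
        } else {
            return b;
        }
    }
}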


        A custom Hive UDF, Min, that returns the smaller of two values.

package zimo.hadoop.hive;

import org.apache.hadoop.hive.ql.exec.UDF;

/**
 * @function custom UDF that returns the smaller of two values
 * @author Zimo
 */
public class Min extends UDF {

    // A null argument is treated as 0.0 before comparing.
    public Double evaluate(Double a, Double b) {
        if (a == null)
            a = 0.0;
        if (b == null)
            b = 0.0;
        if (a >= b) {
            return b;
        } else {
            return a;
        }
    }
}
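The null rule in these UDFs can be tried outside Hive. The sketch below copies the "null becomes 0.0" rule into a plain static method with no Hive dependency, and next to it shows an alternative that ignores a null argument instead. The alternative is an assumption offered for comparison, not the approach used in this post, and the class and method names are hypothetical.

```java
final class NullHandlingSketch {

    /** Same null rule as the UDFs in this post: a null argument counts as 0.0. */
    static Double minNullAsZero(Double a, Double b) {
        double x = (a == null) ? 0.0 : a;
        double y = (b == null) ? 0.0 : b;
        return Math.min(x, y);
    }

    /** Alternative for comparison: a null argument is ignored rather than zeroed. */
    static Double minIgnoringNull(Double a, Double b) {
        if (a == null) {
            return b; // may itself be null when both sides are missing
        }
        if (b == null) {
            return a;
        }
        return Math.min(a, b);
    }

    public static void main(String[] args) {
        // One side missing: the two rules disagree.
        System.out.println("null as zero:  " + minNullAsZero(null, 4.05));
        System.out.println("ignoring null: " + minIgnoringNull(null, 4.05));
    }
}
```

The design consequence is worth keeping in mind when reading aggregated results: with the null-as-zero rule, a single row with a missing price is enough to pull a minimum down to 0.0.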


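        Package the custom Max and Min classes as maxUDF.jar and minUDF.jar, upload them to the jar/ directory under $HIVE_HOME (/opt/modules/hive1.0.0/jar in the session below), and register them in Hive as custom UDFs.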

[hadoop@master ~]$ cd $HIVE_HOME
[hadoop@master hive1.0.0]$ sudo mkdir jar/
[hadoop@master hive1.0.0]$ ll
total 408
drwxr-xr-x 4 hadoop hadoop   4096 May 24 06:15 bin
drwxr-xr-x 2 hadoop hadoop   4096 May 24 05:53 conf
drwxr-xr-x 4 hadoop hadoop   4096 May 14 23:28 examples
drwxr-xr-x 7 hadoop hadoop   4096 May 14 23:28 hcatalog
drwxrwxr-x 3 hadoop hadoop   4096 May 24 11:50 iotmp
drwxr-xr-x 2 root   root     4096 May 24 12:34 jar
drwxr-xr-x 4 hadoop hadoop   4096 May 14 23:41 lib
-rw-r--r-- 1 hadoop hadoop  23828 Jan 30  2015 LICENSE
drwxr-xr-x 2 hadoop hadoop   4096 May 24 03:36 logs
-rw-r--r-- 1 hadoop hadoop    397 Jan 30  2015 NOTICE
-rw-r--r-- 1 hadoop hadoop   4044 Jan 30  2015 README.txt
-rw-r--r-- 1 hadoop hadoop 345744 Jan 30  2015 RELEASE_NOTES.txt
drwxr-xr-x 3 hadoop hadoop   4096 May 14 23:28 scripts
[hadoop@master hive1.0.0]$ cd jar/
[hadoop@master jar]$ sudo rz
[hadoop@master jar]$ ll
total 8
-rw-r--r-- 1 root root 714 May 24  2018 maxUDF.jar
-rw-r--r-- 1 root root 713 May 24  2018 minUDF.jar
hive> add jar /opt/modules/hive1.0.0/jar/maxUDF.jar;
Added [/opt/modules/hive1.0.0/jar/maxUDF.jar] to class path
Added resources: [/opt/modules/hive1.0.0/jar/maxUDF.jar]
hive> add jar /opt/modules/hive1.0.0/jar/minUDF.jar;
Added [/opt/modules/hive1.0.0/jar/minUDF.jar] to class path
Added resources: [/opt/modules/hive1.0.0/jar/minUDF.jar]



hive> create temporary function maxprice as 'zimo.hadoop.hive.Max';
Time taken: 0.009 seconds
hive> create temporary function minprice as 'zimo.hadoop.hive.Min';
Time taken: 0.004 seconds



hive> select stockid, tradedate, max(maxprice(buyprice,sellprice)), min(minprice(buyprice,sellprice)) from stock_partition where stockid='204001' group by stockid, tradedate;
                    204001  20130722        4.05    0.0
                    204001  20130723        4.48    2.2
                    204001  20130724        4.65    2.205
                    204001  20130725        11.9    8.7
                    204001  20130726        12.3    5.2




hive> select stockid, tradedate, substring(tradetime,0,4), sum(buyprice+sellprice)/(count(*)*2) from stock_partition where stockid='204001' group by stockid, tradedate, substring(tradetime,0,4);
                    204001  20130725        0951    9.94375
                    204001  20130725        0952    9.959999999999999
                    204001  20130725        0953    10.046666666666667
                    204001  20130725        0954    10.111041666666667
                    204001  20130725        0955    10.132500000000002
                    204001  20130725        0956    10.181458333333333
                    204001  20130725        0957    10.180625
                    204001  20130725        0958    10.20340909090909
                    204001  20130725        0959    10.287291666666667
                    204001  20130725        1000    10.331041666666668
                    204001  20130725        1001    10.342500000000001
                    204001  20130725        1002    10.344375
                    204001  20130725        1003    10.385
                    204001  20130725        1004    10.532083333333333
                    204001  20130725        1005    10.621041666666667
                    204001  20130725        1006    10.697291666666667
                    204001  20130725        1007    10.702916666666667
                    204001  20130725        1008    10.78
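The per-minute query above buckets rows by the first four characters of tradetime and computes sum(buyprice+sellprice)/(count(*)*2), that is, the mean of all buy and sell prices in that minute. The sketch below reproduces the same arithmetic in plain Java as a way to check the formula by hand. The class name and the input rows are made up for illustration and are not taken from the data set, and unlike Hive the sketch does no NULL handling.

```java
import java.util.Map;
import java.util.TreeMap;

final class MinuteAverage {

    /**
     * rows: each element is {tradetime, buyprice, sellprice} as strings.
     * Returns minute (first 4 chars of tradetime) -> sum(buy+sell)/(count*2).
     */
    static Map<String, Double> averageByMinute(String[][] rows) {
        Map<String, double[]> acc = new TreeMap<>(); // minute -> {sum, count}
        for (String[] r : rows) {
            String minute = r[0].substring(0, 4);
            double[] a = acc.computeIfAbsent(minute, k -> new double[2]);
            a[0] += Double.parseDouble(r[1]) + Double.parseDouble(r[2]);
            a[1] += 1;
        }
        Map<String, Double> result = new TreeMap<>();
        for (Map.Entry<String, double[]> e : acc.entrySet()) {
            double sum = e.getValue()[0];
            double count = e.getValue()[1];
            result.put(e.getKey(), sum / (count * 2));
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical ticks: two in minute 0951, one in minute 0952.
        String[][] rows = {
                {"095101", "10.0", "10.2"},
                {"095130", "10.1", "10.3"},
                {"095205", "10.4", "10.6"},
        };
        averageByMinute(rows).forEach(
                (minute, avg) -> System.out.println(minute + " " + avg));
    }
}
```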





