Time Series

Date: 2017-08-20 17:06:12

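Base time series frequencies

Alias                      Offset type     Description
D                          Day             Every calendar day
B                          BusinessDay     Every business day
H                          Hour            Every hour
T or min                   Minute          Every minute
S                          Second          Every second
L or ms                    Milli           Every millisecond
U                          Micro           Every microsecond
M                          MonthEnd        Last calendar day of each month
BM                         BMonthEnd       Last business day of each month
W-MON, W-TUE, ...          Week            Weekly, on the given day of the week (MON, TUE, ...)
WOM-1MON, WOM-2MON, ...    WeekOfMonth     The n-th given weekday of each month; WOM-2MON is the second Monday of each month
Q-JAN, Q-FEB, ...          QuarterEnd      Quarterly, last calendar day of the last month of each quarter, for a year ending in the given month
BQ-JAN, BQ-FEB, ...        BQuarterEnd     Quarterly, last business day of the last month of each quarter, for a year ending in the given month
QS-JAN, QS-FEB, ...        QuarterBegin    Quarterly, first calendar day of each quarter, with one quarter beginning in the given month
BQS-JAN, BQS-FEB, ...      BQuarterBegin   Quarterly, first business day of each quarter, with one quarter beginning in the given month
A-JAN, A-FEB, ...          YearEnd         Annually, last calendar day of the given month
BA-JAN, BA-FEB, ...        BYearEnd        Annually, last business day of the given month
AS-JAN, AS-FEB, ...        YearBegin       Annually, first calendar day of the given month
BAS-JAN, BAS-FEB, ...      BYearBegin      Annually, first business day of the given month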
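
As a rough illustration of the table, the sketch below turns a few aliases into offset objects and uses them to generate dates. It sticks to aliases that are spelled the same in old and new pandas releases; several others in the table (for example M, H, T and A) have been renamed in recent pandas versions, so treat the exact spelling as version-dependent. The variable names are only illustrative.

```python
import pandas as pd
from pandas.tseries.frequencies import to_offset

# Each alias string maps to an offset object.
day = to_offset("D")
bday = to_offset("B")
wom = to_offset("WOM-2MON")

# Second Monday of each month in the first quarter of 2016.
second_mondays = pd.date_range("2016-01-01", "2016-03-31", freq="WOM-2MON")

# Business days skip the weekend: 2016-01-01 is a Friday.
bdays = pd.date_range("2016-01-01", periods=3, freq="B")

print(second_mondays)
print(bdays)
```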

 

Time Series Basics

Basic imports

In [1]: import numpy as np
In [2]: import pandas as pd
In [3]: import matplotlib.pyplot as plt
In [4]: import datetime as dt
In [5]: from pandas import Series,DataFrame
In [6]: from datetime import datetime

 

1. Time series basics

In [9]: dates = [datetime(2011,1,2),datetime(2011,1,5),datetime(2011,1,7),
   ...: datetime(2011,1,8),datetime(2011,1,10),datetime(2011,1,12)]

In [10]: ts = Series(np.random.randn(6),index = dates) # a time series object
In [11]: ts
Out[11]:
2011-01-02   -0.535451
2011-01-05    2.177724
2011-01-07    1.894591
2011-01-08    0.163426
2011-01-10    1.171710
2011-01-12    0.131111
dtype: float64

In [12]: type(ts)   
Out[12]: pandas.core.series.Series

In [13]: ts.index
Out[13]:
DatetimeIndex(['2011-01-02', '2011-01-05', '2011-01-07', '2011-01-08',
               '2011-01-10', '2011-01-12'],
              dtype='datetime64[ns]', freq=None)

In [14]: ts + ts[::2]   # arithmetic aligns on the dates; unmatched dates become NaN
Out[14]:
2011-01-02   -1.070903
2011-01-05         NaN
2011-01-07    3.789183
2011-01-08         NaN
2011-01-10    2.343421
2011-01-12         NaN
dtype: float64

In [15]: ts.index.dtype
Out[15]: dtype('<M8[ns]')

In [16]: ts.index[0]
Out[16]: Timestamp('2011-01-02 00:00:00')

In [17]: stamp = ts.index[2]      # a Timestamp object
In [18]: ts[stamp]
Out[18]: 1.8945912547163455

In [19]: ts['1/10/2011']   # indexing with a date string
Out[19]: 1.1717104597202732

In [24]: ts[datetime(2011,1,7):]   # slicing by date
Out[24]:
2011-01-07    1.894591
2011-01-08    0.163426
2011-01-10    1.171710
2011-01-12    0.131111
dtype: float64

In [25]: ts.truncate(after = '1/9/2011')
Out[25]:
2011-01-02   -0.535451
2011-01-05    2.177724
2011-01-07    1.894591
2011-01-08    0.163426
dtype: float64
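
The selection steps above can be sketched in one self-contained script. It uses fixed values instead of random ones so the results are easy to check, and it goes through .loc, which is one reasonable way to make label-based selection explicit; this is an illustration, not the only way to do it.

```python
from datetime import datetime

import numpy as np
import pandas as pd

dates = [datetime(2011, 1, 2), datetime(2011, 1, 5), datetime(2011, 1, 7),
         datetime(2011, 1, 8), datetime(2011, 1, 10), datetime(2011, 1, 12)]
ts = pd.Series(np.arange(6, dtype=float), index=dates)

# Exact lookup with a Timestamp key.
exact = ts.loc[pd.Timestamp("2011-01-10")]

# Slice with date strings; the endpoints do not have to be in the index.
window = ts.loc["2011-01-06":"2011-01-11"]

# A partial date string selects the whole month.
january = ts.loc["2011-01"]

# truncate drops everything after the given date.
head = ts.truncate(after="2011-01-09")

print(exact, len(window), len(january), len(head))
```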

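2. Date ranges

Basic syntax for generating a date range:

dates = pd.date_range('1/1/2000',periods = 10,freq = 'W-WED')
dates = pd.date_range('1/1/2000','2/2/2000',freq = '10h')
dates = pd.date_range(start = '1/1/2000',periods = 10)

Selecting by date: a Series indexed by such dates can be selected with a partial date string, e.g. ts['2001-01'] for all of January 2001
freq sets the frequency (a date offset) between consecutive dates
In [43]: index = pd.date_range('4/1/2015','5/1/2015')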
In [44]: index
Out[44]:
DatetimeIndex(['2015-04-01', '2015-04-02', '2015-04-03', '2015-04-04',
               '2015-04-05', '2015-04-06', '2015-04-07', '2015-04-08',
               '2015-04-09', '2015-04-10', '2015-04-11', '2015-04-12',
               '2015-04-13', '2015-04-14', '2015-04-15', '2015-04-16',
               '2015-04-17', '2015-04-18', '2015-04-19', '2015-04-20',
               '2015-04-21', '2015-04-22', '2015-04-23', '2015-04-24',
               '2015-04-25', '2015-04-26', '2015-04-27', '2015-04-28',
               '2015-04-29', '2015-04-30', '2015-05-01'],
              dtype='datetime64[ns]', freq='D')

In [45]: pd.date_range('1/1/2016',periods = 31)
Out[45]:
DatetimeIndex(['2016-01-01', '2016-01-02', '2016-01-03', '2016-01-04',
               '2016-01-05', '2016-01-06', '2016-01-07', '2016-01-08',
               '2016-01-09', '2016-01-10', '2016-01-11', '2016-01-12',
               '2016-01-13', '2016-01-14', '2016-01-15', '2016-01-16',
               '2016-01-17', '2016-01-18', '2016-01-19', '2016-01-20',
               '2016-01-21', '2016-01-22', '2016-01-23', '2016-01-24',
               '2016-01-25', '2016-01-26', '2016-01-27', '2016-01-28',
               '2016-01-29', '2016-01-30', '2016-01-31'],
              dtype='datetime64[ns]', freq='D')

In [46]: pd.date_range('12/18/2015','1/1/2016',freq = 'BM')
Out[46]: DatetimeIndex(['2015-12-31'], dtype='datetime64[ns]', freq='BM')

In [47]: pd.date_range('5/2/2015 12:12:12',periods = 5)
Out[47]:
DatetimeIndex(['2015-05-02 12:12:12', '2015-05-03 12:12:12',
               '2015-05-04 12:12:12', '2015-05-05 12:12:12',
               '2015-05-06 12:12:12'],
              dtype='datetime64[ns]', freq='D')

In [48]: pd.date_range('5/2/2015 12:12:12',periods = 5,normalize = True)
Out[48]:
DatetimeIndex(['2015-05-02', '2015-05-03', '2015-05-04', '2015-05-05',
               '2015-05-06'],
              dtype='datetime64[ns]', freq='D')

3. Time series with duplicate indices

In [31]: dates = pd.DatetimeIndex(['1/1/2000','1/2/2000','1/2/2000','1/2/2000','1/3/2000'])
In [32]: dup_ts = Series(np.arange(5),index = dates)
In [33]: dup_ts
Out[33]:
2000-01-01    0
2000-01-02    1
2000-01-02    2
2000-01-02    3
2000-01-03    4
dtype: int32

In [34]: dup_ts.index.is_unique
Out[34]: False

In [35]: dup_ts['1/2/2000']
Out[35]:
2000-01-02    1
2000-01-02    2
2000-01-02    3
dtype: int32

In [36]: grouped = dup_ts.groupby(level = 0)

In [37]: grouped
Out[37]: <pandas.core.groupby.SeriesGroupBy object at 0x00000000082FA5C0>

In [38]: grouped.count()
Out[38]:
2000-01-01    1
2000-01-02    3
2000-01-03    1
dtype: int64
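
Grouping on level 0 is also a convenient way to collapse the duplicates. The sketch below shows two possible policies, averaging the repeated timestamps or keeping only the first observation; which one is appropriate depends on the data, and the variable names are only illustrative.

```python
import numpy as np
import pandas as pd

dates = pd.DatetimeIndex(["2000-01-01", "2000-01-02", "2000-01-02",
                          "2000-01-02", "2000-01-03"])
dup_ts = pd.Series(np.arange(5), index=dates)

# Policy 1: aggregate the rows that share a timestamp.
mean_per_day = dup_ts.groupby(level=0).mean()

# Policy 2: keep the first row for each timestamp and drop the rest.
first_per_day = dup_ts[~dup_ts.index.duplicated(keep="first")]

print(mean_per_day)
print(first_per_day)
```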

4. Resampling to a fixed frequency

In [39]: dates = [datetime(2011,1,2),datetime(2011,1,5),datetime(2011,1,7),
    ...: datetime(2011,1,8),datetime(2011,1,10),datetime(2011,1,12)]

In [40]: ts = Series(np.random.randn(6),index = dates)
In [42]: ts.resample('D')   # returns a deferred Resampler; call e.g. .mean() on it to get values
Out[42]: DatetimeIndexResampler [freq=<Day>, axis=0, closed=left, label=left, convention=start, base=0]
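
Because resample returns a deferred object, an aggregation or fill method has to be chained onto it to get data back. A minimal sketch, using fixed values in place of the random ones so the gaps are easy to see:

```python
from datetime import datetime

import numpy as np
import pandas as pd

dates = [datetime(2011, 1, 2), datetime(2011, 1, 5), datetime(2011, 1, 7),
         datetime(2011, 1, 8), datetime(2011, 1, 10), datetime(2011, 1, 12)]
ts = pd.Series(np.arange(6, dtype=float), index=dates)

# One row per calendar day; days without an observation become NaN.
daily = ts.resample("D").mean()

# Same daily grid, but gaps carry the last known value forward.
filled = ts.resample("D").ffill()

print(daily)
print(filled)
```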

5. Frequencies and date offsets

In [52]: from pandas.tseries.offsets import Hour,Minute

In [53]: hour = Hour()
In [54]: hour
Out[54]: <Hour>

In [55]: Hour(4)
Out[55]: <4 * Hours>

In [56]: pd.date_range('1/1/2016','1/2/2016',freq = '2h')
Out[56]:
DatetimeIndex(['2016-01-01 00:00:00', '2016-01-01 02:00:00',
               '2016-01-01 04:00:00', '2016-01-01 06:00:00',
               '2016-01-01 08:00:00', '2016-01-01 10:00:00',
               '2016-01-01 12:00:00', '2016-01-01 14:00:00',
               '2016-01-01 16:00:00', '2016-01-01 18:00:00',
               '2016-01-01 20:00:00', '2016-01-01 22:00:00',
               '2016-01-02 00:00:00'],
              dtype='datetime64[ns]', freq='2H')

In [57]: Hour(1) + Minute(30)
Out[57]: <90 * Minutes>

In [58]: pd.date_range('1/1/2016',periods = 5,freq = '1h30min')
Out[58]:
DatetimeIndex(['2016-01-01 00:00:00', '2016-01-01 01:30:00',
               '2016-01-01 03:00:00', '2016-01-01 04:30:00',
               '2016-01-01 06:00:00'],
              dtype='datetime64[ns]', freq='90T')

In [59]: pd.date_range('1/1/2016','3/1/2016',freq = 'WOM-3FRI')
Out[59]: DatetimeIndex(['2016-01-15', '2016-02-19'], dtype='datetime64[ns]', freq='WOM-3FRI')

In [60]: pd.date_range('1/1/2016','9/1/2016',freq = 'WOM-3FRI')
Out[60]:
DatetimeIndex(['2016-01-15', '2016-02-19', '2016-03-18', '2016-04-15',
               '2016-05-20', '2016-06-17', '2016-07-15', '2016-08-19'],
              dtype='datetime64[ns]', freq='WOM-3FRI')

              
In [73]: from pandas.tseries.offsets import Hour,Minute,Day,MonthEnd

# now is not defined in this excerpt; the outputs below are consistent with now = datetime(2011,11,29)
In [74]: now + Day(3)
Out[74]: Timestamp('2011-12-02 00:00:00')

In [75]: now + MonthEnd()
Out[75]: Timestamp('2011-11-30 00:00:00')

In [76]: now + MonthEnd(2)
Out[76]: Timestamp('2011-12-31 00:00:00')

In [77]: offset = MonthEnd()

In [78]: offset.rollforward(now)
Out[78]: Timestamp('2011-11-30 00:00:00')

In [79]: offset.rollback(now)
Out[79]: Timestamp('2011-10-31 00:00:00')

In [80]: ts = Series(np.random.randn(5),index = pd.date_range('1/15/2000',periods = 5,freq = '4d'))

In [81]: ts
Out[81]:
2000-01-15   -0.518612
2000-01-19    0.749769
2000-01-23   -1.020916
2000-01-27   -1.164565
2000-01-31    0.695788
Freq: 4D, dtype: float64

In [82]: ts.groupby(offset.rollforward)
Out[82]: <pandas.core.groupby.SeriesGroupBy object at 0x0000000008495D68>

In [83]: ts.groupby(offset.rollforward).mean()
Out[83]:
2000-01-31   -0.251707
dtype: float64

In [84]: ts.resample('M',how = 'mean')
C:/Anaconda2/Scripts/ipython-script.py:1: FutureWarning: how in .resample() is deprecated
the new syntax is .resample(...).mean()
  if __name__ == '__main__':
Out[84]:
2000-01-31   -0.251707
Freq: M, dtype: float64  
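
The offset arithmetic above can be reproduced in a standalone sketch. The value of now is an assumption (2011-11-29, which matches the outputs shown); the on_anchor variable is added only to illustrate how a date that already sits on a month end behaves.

```python
import numpy as np
import pandas as pd
from pandas.tseries.offsets import Day, MonthEnd

now = pd.Timestamp("2011-11-29")  # assumed value, inferred from the outputs above
offset = MonthEnd()

plus_three_days = now + Day(3)
next_month_end = now + offset
rolled_forward = offset.rollforward(now)
rolled_back = offset.rollback(now)

# A date already on a month end: rollforward leaves it alone,
# while adding the offset moves to the following month end.
on_anchor = pd.Timestamp("2011-11-30")
stays = offset.rollforward(on_anchor)
moves = on_anchor + offset

# Grouping by rollforward buckets every date into its month end.
ts = pd.Series(np.arange(5, dtype=float),
               index=pd.date_range("2000-01-15", periods=5, freq="4D"))
by_month_end = ts.groupby(offset.rollforward).mean()
print(by_month_end)
```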

6. Shifting data forward or backward along the time axis

In [61]: ts = Series(np.random.randn(4),index = pd.date_range('1/1/2016',periods = 4,freq = 'M'))

In [62]: ts
Out[62]:
2016-01-31   -2.002437
2016-02-29   -1.000022
2016-03-31    1.442409
2016-04-30   -0.578137
Freq: M, dtype: float64

In [63]: ts.shift(2)
Out[63]:
2016-01-31         NaN
2016-02-29         NaN
2016-03-31   -2.002437
2016-04-30   -1.000022
Freq: M, dtype: float64

In [64]: ts.shift(-2)
Out[64]:
2016-01-31    1.442409
2016-02-29   -0.578137
2016-03-31         NaN
2016-04-30         NaN
Freq: M, dtype: float64

In [65]: ts / ts.shift(1)
Out[65]:
2016-01-31         NaN
2016-02-29    0.499402
2016-03-31   -1.442377
2016-04-30   -0.400814
Freq: M, dtype: float64

In [66]:  ts / ts.shift(1)-1
Out[66]:
2016-01-31         NaN
2016-02-29   -0.500598
2016-03-31   -2.442377
2016-04-30   -1.400814
Freq: M, dtype: float64

In [67]: ts.shift(2,freq = 'M')
Out[67]:
2016-03-31   -2.002437
2016-04-30   -1.000022
2016-05-31    1.442409
2016-06-30   -0.578137
Freq: M, dtype: float64

In [68]: ts.shift(3,freq = 'D')
Out[68]:
2016-02-03   -2.002437
2016-03-03   -1.000022
2016-04-03    1.442409
2016-05-03   -0.578137
dtype: float64

In [69]: ts.shift(1,freq = '3D')
Out[69]:
2016-02-03   -2.002437
2016-03-03   -1.000022
2016-04-03    1.442409
2016-05-03   -0.578137
dtype: float64
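
The two kinds of shift behave differently: without freq the values move and the index stays, with freq the index moves and the values stay. The sketch below illustrates both with made-up prices (not from the original session) and shows that ts / ts.shift(1) - 1 is a period-over-period percent change.

```python
import numpy as np
import pandas as pd

# Month-end index built from an offset object to avoid version-specific aliases.
idx = pd.date_range("2016-01-31", periods=4, freq=pd.offsets.MonthEnd())
ts = pd.Series([100.0, 110.0, 99.0, 118.8], index=idx)

# Values shift, index stays: the first entry has nothing to compare against.
returns = ts / ts.shift(1) - 1
same_thing = ts.pct_change()

# Index shifts, values stay: no NaN is introduced.
three_days_later = ts.shift(1, freq="3D")
two_months_later = ts.shift(2, freq=pd.offsets.MonthEnd())

print(returns)
print(two_months_later)
```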

Time Zones and Periods

1. Localizing and converting times

In [86]: rng = pd.date_range('3/9/2012 9:30',periods = 6,freq = 'D')

In [87]: ts = Series(np.random.randn(len(rng)),index = rng)

In [88]: ts
Out[88]:
2012-03-09 09:30:00    0.611651
2012-03-10 09:30:00   -0.343742
2012-03-11 09:30:00    0.082115
2012-03-12 09:30:00    0.560457
2012-03-13 09:30:00   -2.086978
2012-03-14 09:30:00    0.395750
Freq: D, dtype: float64

In [89]: ts_utc = ts.tz_localize('US/Pacific')   # despite its name, ts_utc is localized to US/Pacific, not UTC

In [90]: ts_utc
Out[90]:
2012-03-09 09:30:00-08:00    0.611651
2012-03-10 09:30:00-08:00   -0.343742
2012-03-11 09:30:00-07:00    0.082115
2012-03-12 09:30:00-07:00    0.560457
2012-03-13 09:30:00-07:00   -2.086978
2012-03-14 09:30:00-07:00    0.395750
Freq: D, dtype: float64

In [91]: ts_utc.tz_convert('US/Eastern')
Out[91]:
2012-03-09 12:30:00-05:00    0.611651
2012-03-10 12:30:00-05:00   -0.343742
2012-03-11 12:30:00-04:00    0.082115
2012-03-12 12:30:00-04:00    0.560457
2012-03-13 12:30:00-04:00   -2.086978
2012-03-14 12:30:00-04:00    0.395750
Freq: D, dtype: float64

In [92]: ts1 = ts[:7].tz_localize('Europe/London')
In [93]: ts1
Out[93]:
2012-03-09 09:30:00+00:00    0.611651
2012-03-10 09:30:00+00:00   -0.343742
2012-03-11 09:30:00+00:00    0.082115
2012-03-12 09:30:00+00:00    0.560457
2012-03-13 09:30:00+00:00   -2.086978
2012-03-14 09:30:00+00:00    0.395750
Freq: D, dtype: float64


In [95]: ts2 = ts1[2:].tz_convert('Europe/Moscow')
In [96]: ts2
Out[96]:
2012-03-11 13:30:00+04:00    0.082115
2012-03-12 13:30:00+04:00    0.560457
2012-03-13 13:30:00+04:00   -2.086978
2012-03-14 13:30:00+04:00    0.395750
Freq: D, dtype: float64

In [97]: result = ts1 + ts2   # arithmetic between different time zones; the result is in UTC
In [98]: result
Out[98]:
2012-03-09 09:30:00+00:00         NaN
2012-03-10 09:30:00+00:00         NaN
2012-03-11 09:30:00+00:00    0.164230
2012-03-12 09:30:00+00:00    1.120913
2012-03-13 09:30:00+00:00   -4.173957
2012-03-14 09:30:00+00:00    0.791499
Freq: D, dtype: float64
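
When two series carry different time zones, pandas aligns them on the underlying instants and returns a UTC-indexed result. A small sketch of that behavior, with fixed values so the alignment is easy to verify (it assumes a time zone database is available to pandas):

```python
import numpy as np
import pandas as pd

rng = pd.date_range("2012-03-09 09:30", periods=6, freq="D")
ts = pd.Series(np.arange(6, dtype=float), index=rng)

# tz_localize attaches a zone to naive timestamps;
# tz_convert re-expresses the same instants in another zone.
london = ts.tz_localize("Europe/London")
moscow = london.iloc[2:].tz_convert("Europe/Moscow")

# Same instants, different wall-clock labels: they still line up.
result = london + moscow

print(result.index.tz)
print(result)
```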

2. Period arithmetic

In [9]: p = pd.Period(2016,freq = 'A-DEC')
In [10]: p
Out[10]: Period('2016', 'A-DEC')

In [11]: p+5
Out[11]: Period('2021', 'A-DEC')

In [12]: rng = pd.Period(2015,freq = 'A-DEC') - p
In [13]: rng
Out[13]: -1L

In [14]: rng1 = pd.period_range('1/1/2000','6/30/2000',freq = 'M')
In [15]: rng1
Out[15]: PeriodIndex(['2000-01', '2000-02', '2000-03', '2000-04', '2000-05', '2000-06'], dtype='int64', freq='M')

In [16]: type(rng1)
Out[16]: pandas.tseries.period.PeriodIndex

In [18]: Series(np.random.randn(6),index = rng1)
Out[18]:
2000-01   -0.147543
2000-02    1.232261
2000-03    0.703814
2000-04    1.717671
2000-05    0.478153
2000-06   -0.291470
Freq: M, dtype: float64

In [19]: values = ['2001Q3','2002Q2','2003Q1']

In [20]: pd.PeriodIndex(values,freq = 'Q-DEC')
Out[20]: PeriodIndex(['2001Q3', '2002Q2', '2003Q1'], dtype='int64', freq='Q-DEC')

3. Period frequency conversion

In [21]: p = pd.Period('2007',freq = 'A-DEC')
In [22]: p
Out[22]: Period('2007', 'A-DEC')

In [23]: p.asfreq('M',how = 'start')
Out[23]: Period('2007-01', 'M')

In [24]: p.asfreq('M',how = 'end')
Out[24]: Period('2007-12', 'M')

In [25]: p = pd.Period('2007-08','M')
In [26]: p.asfreq('A-JUN')
Out[26]: Period('2008', 'A-JUN')

In [27]: rng = pd.period_range('2006','2009',freq = 'A-DEC')
In [28]: ts = Series(np.random.randn(len(rng)),index = rng)
In [29]: ts
Out[29]:
2006    0.415646
2007    0.206330
2008   -0.495015
2009   -0.665069
Freq: A-DEC, dtype: float64

In [30]: ts.asfreq('M',how = 'start')
Out[30]:
2006-01    0.415646
2007-01    0.206330
2008-01   -0.495015
2009-01   -0.665069
Freq: M, dtype: float64

In [31]: ts.asfreq('B',how = 'end')
Out[31]:
2006-12-29    0.415646
2007-12-31    0.206330
2008-12-31   -0.495015
2009-12-31   -0.665069
Freq: B, dtype: float64

# Quarterly frequency conversion
In [32]: p = pd.Period('2012Q4',freq = 'Q-JAN')
In [33]: p
Out[33]: Period('2012Q4', 'Q-JAN')

In [34]: p.asfreq('D','start')
Out[34]: Period('2011-11-01', 'D')

In [35]: p.asfreq('D','end')
Out[35]: Period('2012-01-31', 'D')

In [36]: rng = pd.period_range('2011Q3','2012Q4',freq = 'Q-JAN')

In [37]: rng.to_timestamp()
Out[37]:
DatetimeIndex(['2010-08-01', '2010-11-01', '2011-02-01', '2011-05-01',
               '2011-08-01', '2011-11-01'],
              dtype='datetime64[ns]', freq='QS-NOV')

In [38]: new_rng = (rng.asfreq('B','e') - 1).asfreq('T','s') + 16 * 60

In [39]: new_rng
Out[39]:
PeriodIndex(['2010-10-28 16:00', '2011-01-28 16:00', '2011-04-28 16:00',
             '2011-07-28 16:00', '2011-10-28 16:00', '2012-01-30 16:00'],
            dtype='int64', freq='T')

In [40]: new_rng.to_timestamp()
Out[40]:
DatetimeIndex(['2010-10-28 16:00:00', '2011-01-28 16:00:00',
               '2011-04-28 16:00:00', '2011-07-28 16:00:00',
               '2011-10-28 16:00:00', '2012-01-30 16:00:00'],
              dtype='datetime64[ns]', freq=None)
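
The Q-JAN results are easier to read once the fiscal calendar is spelled out: with a fiscal year ending in January, fiscal 2012 runs from February 2011 through January 2012, so 2012Q4 covers November 2011 to January 2012. A short sketch that checks this; the variable names are only illustrative.

```python
import pandas as pd

# Fourth quarter of the fiscal year that ends in January 2012.
p = pd.Period("2012Q4", freq="Q-JAN")

first_day = p.asfreq("D", "start")
last_day = p.asfreq("D", "end")
first_month = p.asfreq("M", "start")

# All four quarters of fiscal 2012 and the calendar date each one starts on.
quarters = pd.period_range("2012Q1", "2012Q4", freq="Q-JAN")
starts = quarters.to_timestamp(how="start")

print(p.qyear, p.quarter, first_day, last_day)
print(starts)
```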

4. Converting a timestamp index to a period index

In [41]: rng = pd.date_range('1/1/2015',periods = 3,freq = 'M')
In [42]: ts = Series(np.random.randn(3),index = rng)
In [43]: ts
Out[43]:
2015-01-31    0.529904
2015-02-28   -0.349043
2015-03-31    0.046308
Freq: M, dtype: float64

In [44]: ts.to_period()
Out[44]:
2015-01    0.529904
2015-02   -0.349043
2015-03    0.046308
Freq: M, dtype: float64

In [45]: rng = pd.date_range('1/29/2000',periods = 6,freq = 'D')
In [46]: ts2 = Series(np.random.randn(6),index = rng)
In [47]: ts2
Out[47]:
2000-01-29    1.462543
2000-01-30    0.486943
2000-01-31    0.477313
2000-02-01   -1.160804
2000-02-02    0.306688
2000-02-03    0.016622
Freq: D, dtype: float64

In [48]: ts2.to_period('M')
Out[48]:
2000-01    1.462543
2000-01    0.486943
2000-01    0.477313
2000-02   -1.160804
2000-02    0.306688
2000-02    0.016622
Freq: M, dtype: float64

In [49]: pts = ts2.to_period()
In [50]: pts
Out[50]:
2000-01-29    1.462543
2000-01-30    0.486943
2000-01-31    0.477313
2000-02-01   -1.160804
2000-02-02    0.306688
2000-02-03    0.016622
Freq: D, dtype: float64

In [51]: pts.to_timestamp(how = 'end')
Out[51]:
2000-01-29    1.462543
2000-01-30    0.486943
2000-01-31    0.477313
2000-02-01   -1.160804
2000-02-02    0.306688
2000-02-03    0.016622
Freq: D, dtype: float64
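
Converting daily timestamps to monthly periods leaves repeated period labels, as In [48] shows. One possible follow-up, sketched here with fixed values, is to aggregate per period and then convert back to timestamps at the start of each period.

```python
import numpy as np
import pandas as pd

rng = pd.date_range("2000-01-29", periods=6, freq="D")
ts2 = pd.Series(np.arange(6, dtype=float), index=rng)

# Six daily rows become three 2000-01 labels and three 2000-02 labels.
monthly = ts2.to_period("M")

# Collapse the repeated labels into one row per month.
per_month = monthly.groupby(level=0).sum()

# Back to timestamps, anchored at the first day of each month.
back = per_month.to_timestamp(how="start")

print(per_month)
print(back)
```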

Resampling

1. Resampling

#1. OHLC resampling
In [63]: rng = pd.date_range('1/1/2000',periods = 12,freq = 'T')
In [64]: ts = Series(np.random.randn(12),index = rng)
In [65]: ts
Out[65]:
2000-01-01 00:00:00   -0.975897
2000-01-01 00:01:00   -0.817074
2000-01-01 00:02:00   -0.438881
2000-01-01 00:03:00   -1.852057
2000-01-01 00:04:00    0.869463
2000-01-01 00:05:00    0.837448
2000-01-01 00:06:00    1.847643
2000-01-01 00:07:00    0.653615
2000-01-01 00:08:00    0.065392
2000-01-01 00:09:00    0.411093
2000-01-01 00:10:00   -1.184392
2000-01-01 00:11:00    0.523688
Freq: T, dtype: float64

In [66]: ts.resample('5min',how = 'ohlc')
C:/Anaconda2/Scripts/ipython-script.py:1: FutureWarning: how in .resample() is deprecated
the new syntax is .resample(...).ohlc()
  if __name__ == '__main__':
Out[66]:
                         open      high       low     close
2000-01-01 00:00:00 -0.975897  0.869463 -1.852057  0.869463
2000-01-01 00:05:00  0.837448  1.847643  0.065392  0.411093
2000-01-01 00:10:00 -1.184392  0.523688 -1.184392  0.523688
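
The FutureWarning above points at the method-chaining form. A minimal sketch of that form, using an increasing series instead of random data so each bar's open, high, low and close can be read off directly:

```python
import numpy as np
import pandas as pd

# Twelve one-minute observations with values 0..11.
rng = pd.date_range("2000-01-01", periods=12, freq="min")
ts = pd.Series(np.arange(12, dtype=float), index=rng)

# Five-minute bars: first, max, min and last value in each bin.
bars = ts.resample("5min").ohlc()

print(bars)
```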

#2. Resampling with groupby
In [67]: rng = pd.date_range('1/1/2000',periods = 100,freq = 'D')
In [68]: ts = Series(np.arange(100),index = rng)

In [69]: ts.groupby(lambda x:x.month).mean()
Out[69]:
1    15
2    45
3    75
4    95
dtype: int32

In [70]: ts.groupby(lambda x:x.weekday).mean()
Out[70]:
0    47.5
1    48.5
2    49.5
3    50.5
4    51.5
5    49.0
6    50.0
dtype: float64

# Resampling period-indexed data (frame is the monthly DataFrame built in In [77])
In [79]: annual_frame = frame.resample('A-DEC',how = 'mean')
C:/Anaconda2/Scripts/ipython-script.py:1: FutureWarning: how in .resample() is deprecated
the new syntax is .resample(...).mean()
  if __name__ == '__main__':

In [80]: annual_frame
Out[80]:
      Colorado     Texas  New York      Ohio
2000  0.031121  0.267223 -0.328301  0.592017
2001  0.441272 -0.115328  0.073894  0.094406

In [81]: annual_frame.resample('Q-DEC',fill_method = 'ffill')
C:/Anaconda2/Scripts/ipython-script.py:1: FutureWarning: fill_method is deprecated to .resample()
the new syntax is .resample(...).ffill()
  if __name__ == '__main__':
Out[81]:
        Colorado     Texas  New York      Ohio
2000Q1  0.031121  0.267223 -0.328301  0.592017
2000Q2  0.031121  0.267223 -0.328301  0.592017
2000Q3  0.031121  0.267223 -0.328301  0.592017
2000Q4  0.031121  0.267223 -0.328301  0.592017
2001Q1  0.441272 -0.115328  0.073894  0.094406
2001Q2  0.441272 -0.115328  0.073894  0.094406
2001Q3  0.441272 -0.115328  0.073894  0.094406
2001Q4  0.441272 -0.115328  0.073894  0.094406

In [82]: annual_frame.resample('Q-DEC',fill_method = 'ffill',convention = 'start')
C:/Anaconda2/Scripts/ipython-script.py:1: FutureWarning: fill_method is deprecated to .resample()
the new syntax is .resample(...).ffill()
  if __name__ == '__main__':
Out[82]:
        Colorado     Texas  New York      Ohio
2000Q1  0.031121  0.267223 -0.328301  0.592017
2000Q2  0.031121  0.267223 -0.328301  0.592017
2000Q3  0.031121  0.267223 -0.328301  0.592017
2000Q4  0.031121  0.267223 -0.328301  0.592017
2001Q1  0.441272 -0.115328  0.073894  0.094406
2001Q2  0.441272 -0.115328  0.073894  0.094406
2001Q3  0.441272 -0.115328  0.073894  0.094406
2001Q4  0.441272 -0.115328  0.073894  0.094406

In [83]: annual_frame.resample('Q-MAR',fill_method = 'ffill')
C:/Anaconda2/Scripts/ipython-script.py:1: FutureWarning: fill_method is deprecated to .resample()
the new syntax is .resample(...).ffill()
  if __name__ == '__main__':
Out[83]:
        Colorado     Texas  New York      Ohio
2000Q4  0.031121  0.267223 -0.328301  0.592017
2001Q1  0.031121  0.267223 -0.328301  0.592017
2001Q2  0.031121  0.267223 -0.328301  0.592017
2001Q3  0.031121  0.267223 -0.328301  0.592017
2001Q4  0.441272 -0.115328  0.073894  0.094406
2002Q1  0.441272 -0.115328  0.073894  0.094406
2002Q2  0.441272 -0.115328  0.073894  0.094406
2002Q3  0.441272 -0.115328  0.073894  0.094406

2. Upsampling and interpolation

In [71]: frame = DataFrame(np.random.randn(2,4),index = pd.date_range('1/1/2000',periods = 2,freq = 'W-WED'),
    ...:     columns = ['Colorado','Texas','New York','Ohio'])

In [72]: frame
Out[72]:
            Colorado     Texas  New York      Ohio
2000-01-05 -0.391780  0.623187  2.168219 -0.434276
2000-01-12  0.611064  0.618274 -0.206151 -0.926855

In [73]: df_daily = frame.resample('D')

In [74]: df_daily
Out[74]: C:\Anaconda2\lib\site-packages\IPython\utils\dir2.py:65: FutureWarning: .resample() is now a deferred operation
use .resample(...).mean() instead of .resample(...)
  canary = getattr(obj, '_ipython_canary_method_should_not_exist_', None)
DatetimeIndexResampler [freq=<Day>, axis=0, closed=left, label=left, convention=start, base=0]

In [75]: frame.resample('D',fill_method = 'ffill')
C:/Anaconda2/Scripts/ipython-script.py:1: FutureWarning: fill_method is deprecated to .resample()
the new syntax is .resample(...).ffill()
  if __name__ == '__main__':
Out[75]:
            Colorado     Texas  New York      Ohio
2000-01-05 -0.391780  0.623187  2.168219 -0.434276
2000-01-06 -0.391780  0.623187  2.168219 -0.434276
2000-01-07 -0.391780  0.623187  2.168219 -0.434276
2000-01-08 -0.391780  0.623187  2.168219 -0.434276
2000-01-09 -0.391780  0.623187  2.168219 -0.434276
2000-01-10 -0.391780  0.623187  2.168219 -0.434276
2000-01-11 -0.391780  0.623187  2.168219 -0.434276
2000-01-12  0.611064  0.618274 -0.206151 -0.926855

In [76]: frame.resample('W-THU',fill_method = 'ffill')
C:/Anaconda2/Scripts/ipython-script.py:1: FutureWarning: fill_method is deprecated to .resample()
the new syntax is .resample(...).ffill()
  if __name__ == '__main__':
Out[76]:
            Colorado     Texas  New York      Ohio
2000-01-06 -0.391780  0.623187  2.168219 -0.434276
2000-01-13  0.611064  0.618274 -0.206151 -0.926855
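
With the deferred resample API that the warnings recommend, the fill strategy becomes a chained method. The sketch below compares a few options on a two-row weekly frame with made-up values (two columns only, for brevity): leaving gaps as NaN, forward filling, forward filling with a limit, and linear interpolation.

```python
import numpy as np
import pandas as pd

# Two weekly observations, on Wednesday 2000-01-05 and 2000-01-12.
frame = pd.DataFrame({"Colorado": [0.0, 7.0], "Texas": [1.0, 8.0]},
                     index=pd.date_range("2000-01-05", periods=2, freq="W-WED"))

# Daily grid with the new rows left empty.
as_is = frame.resample("D").asfreq()

# Carry the last observation forward, without and with a limit.
filled = frame.resample("D").ffill()
limited = frame.resample("D").ffill(limit=2)

# Straight-line interpolation between the two observations.
linear = frame.resample("D").interpolate(method="linear")

print(limited)
print(linear)
```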

In [77]: frame = DataFrame(np.random.randn(24,4),index = pd.period_range('1-2000','12-2001',freq = 'M'),
    ...:     columns = ['Colorado','Texas','New York','Ohio'])

In [78]: frame
Out[78]:
         Colorado     Texas  New York      Ohio
2000-01 -0.410764  0.493883  0.372263  1.292698
2000-02 -1.062080  0.918195 -0.724518  0.164564
2000-03 -0.043802  2.993207 -1.522635  0.838996
2000-04 -0.363853 -0.212628  0.528066  1.275305
2000-05  1.497088 -1.067684  0.092587  2.278649
2000-06  1.093441  0.807193 -2.299057  0.806335
2000-07  1.233143 -1.279697  1.340937 -0.293675
2000-08  0.361909  0.069654  0.431176 -0.126774
2000-09 -1.529141 -0.124773 -0.807565  1.108400
2000-10  0.259290 -0.493926 -1.511169  0.853348
2000-11 -0.941060  1.205524 -0.754967 -1.066688
2000-12  0.279276 -0.102266  0.915269 -0.026958
2001-01  0.238687 -0.057031  1.632795 -0.859731
2001-02  1.878275  0.344334  1.375966 -0.276001
2001-03 -0.694704 -0.566174  0.066509 -0.189110
2001-04 -0.080118 -0.182078 -0.356520 -0.458191
2001-05 -0.088858 -1.934695 -0.153724  0.347450
2001-06  1.742578  1.659385  0.031750  0.462085
2001-07  0.972973  0.797676 -0.561107 -0.200623
2001-08  0.628312  0.916874 -1.138119  1.766849
2001-09 -0.129747 -1.861520  0.523099 -1.124577
2001-10 -0.138502  0.644714 -1.045726  0.336395
2001-11  1.265029 -0.461168 -0.239620  0.835510
2001-12 -0.298662 -0.684251  0.751426  0.492821

 

Original post: http://www.cnblogs.com/Ryana/p/7326978.html
