Time series data is an important form of structured data in many fields.
1. datetime in the Python standard library
1. datetime: the current date and time
from datetime import datetime
now = datetime.now()
print(now)
2019-11-09 09:58:45.827047
2. Subtracting two datetimes yields a timedelta
delta = datetime.now()-datetime(2015,6,23)
print(delta)
1600 days, 9:58:45.827047
print(delta.days)
1600
print(delta.seconds)
35925
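Note that delta.seconds holds only the seconds left over after whole days are removed, not the full duration. A minimal sketch of the difference follows; it uses fixed datetimes instead of datetime.now() (an assumption made here so the result is reproducible), and the variable names are illustrative.

```python
from datetime import datetime, timedelta

# Fixed endpoints (illustrative values) so the result does not depend on the clock.
start = datetime(2015, 6, 23)
end = datetime(2019, 11, 9, 9, 58, 45)

delta = end - start               # a timedelta object
print(delta.days)                 # 1600: whole days
print(delta.seconds)              # 35925: leftover seconds within the last day
print(delta.total_seconds())      # 138275925.0: the full duration in seconds

# A timedelta can also be added back to a datetime.
shifted = start + timedelta(days=12)
print(shifted)                    # 2015-07-05 00:00:00
```

Use total_seconds() when the whole duration is needed; days and seconds are only the components of the stored value.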
3. strftime(): format a datetime as a string according to a given format
stamp = datetime.now()
print(str(stamp))
2019-11-09 14:37:20.217164
print(stamp.strftime('%Y-%m-%d'))
2019-11-09
4. strptime(string, format): parse a string into a datetime using the given format
value = '2015-08-08'
print(datetime.strptime(value,'%Y-%m-%d'))
2015-08-08 00:00:00
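strftime and strptime are inverses when the same format string is used on both sides. The sketch below shows a round trip with a fixed timestamp (an illustrative value, not taken from the run above) and what happens when the format does not match the string.

```python
from datetime import datetime

fmt = '%Y-%m-%d %H:%M:%S'
stamp = datetime(2019, 11, 9, 14, 37, 20)    # illustrative fixed value

text = stamp.strftime(fmt)                   # datetime -> str
parsed = datetime.strptime(text, fmt)        # str -> datetime
print(text)                                  # 2019-11-09 14:37:20
print(parsed == stamp)                       # True

# A string that does not match the format raises ValueError.
try:
    datetime.strptime('2015/08/08', '%Y-%m-%d')
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)                       # True
```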
2. Time series basics
1. Using timestamps as the index
import numpy as np
import pandas as pd
from pandas import Series
dates = [datetime(2011, 1, 2), datetime(2011, 1, 2), datetime(2011, 1, 3)]
ts = Series(np.random.randn(3), index=dates)
print(ts)
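When a list of datetime objects is passed as the index, pandas stores it as a DatetimeIndex. The list above contains 2011-01-02 twice, so the index is not unique. A small sketch of what that means in practice, using np.arange instead of random values (an assumption made here so the output is predictable):

```python
from datetime import datetime

import numpy as np
import pandas as pd

dates = [datetime(2011, 1, 2), datetime(2011, 1, 2), datetime(2011, 1, 3)]
ts = pd.Series(np.arange(3.0), index=dates)   # values 0.0, 1.0, 2.0

print(type(ts.index).__name__)    # DatetimeIndex
print(ts.index.is_unique)         # False: 2011-01-02 appears twice

# Selecting a duplicated timestamp returns every matching row.
dup = ts.loc[pd.Timestamp('2011-01-02')]
print(len(dup))                   # 2

# Duplicates can be collapsed by grouping on the index level.
daily_sum = ts.groupby(level=0).sum()
print(len(daily_sum))             # 2
```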
2. pd.date_range(start date, periods=n): build an index of n consecutive dates
longer = Series(np.random.randn(1000),index=pd.date_range('1/1/2000',periods=1000))
print(longer)
2000-01-01 -0.449904
2000-01-02 -0.513678
2000-01-03 2.371833
2000-01-04 -1.846488
2000-01-05 1.234716
...
2002-09-22 1.560118
2002-09-23 -0.380624
2002-09-24 -0.593029
2002-09-25 -0.142115
2002-09-26 0.627853
Freq: D, Length: 1000, dtype: float64
print(longer['2000']): the string '2000' is interpreted as a year and selects every row in that year
2000-01-01 -0.063944
2000-01-02 -0.783158
2000-01-03 -0.513641
2000-01-04 -1.881692
2000-01-05 -0.773866
...
2000-12-27 -1.342624
2000-12-28 -0.321939
2000-12-29 -1.816298
2000-12-30 -0.312869
2000-12-31 -1.283336
Freq: D, Length: 366, dtype: float64
print(longer['2001-5'])
2001-05-01 0.385213
2001-05-02 0.673710
2001-05-03 0.574347
2001-05-04 0.888069
2001-05-05 -0.072358
2001-05-06 -0.094635
2001-05-07 -1.020520
2001-05-08 -0.624349
2001-05-09 -0.632038
2001-05-10 -0.534303
2001-05-11 -0.337141
2001-05-12 0.297579
2001-05-13 -0.091500
2001-05-14 0.223502
2001-05-15 -0.261659
2001-05-16 2.310869
2001-05-17 0.922134
2001-05-18 2.034180
2001-05-19 1.085959
2001-05-20 -0.198433
2001-05-21 -1.300043
2001-05-22 -0.837715
2001-05-23 -0.116093
2001-05-24 -0.800779
2001-05-25 0.274794
2001-05-26 -0.620030
2001-05-27 -0.050057
2001-05-28 -0.543447
2001-05-29 0.056641
2001-05-30 -0.913786
2001-05-31 -1.618820
Freq: D, dtype: float64
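The year and month selections above can be checked with deterministic data. The sketch below rebuilds the same 1000-day index but fills it with np.arange (an assumption, so that the lengths can be verified) and uses .loc, which accepts the same partial date strings; it also shows that a string slice includes both endpoints.

```python
import numpy as np
import pandas as pd

idx = pd.date_range('2000-01-01', periods=1000)
longer = pd.Series(np.arange(1000.0), index=idx)

year_2000 = longer.loc['2000']                        # all of 2000 (a leap year)
may_2001 = longer.loc['2001-05']                      # all of May 2001
first_days = longer.loc['2000-01-01':'2000-01-03']    # slice, both ends included

print(len(year_2000), len(may_2001), len(first_days))  # 366 31 3
```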
3. Generating date ranges: by default, date_range produces daily timestamps
index = pd.date_range(start=, end=, periods=): when only a start date or only an end date is passed, periods must also be given
index = pd.date_range('2019-01-01','2019-11-09')
print(index)
DatetimeIndex(['2019-01-01', '2019-01-02', '2019-01-03', '2019-01-04',
'2019-01-05', '2019-01-06', '2019-01-07', '2019-01-08',
'2019-01-09', '2019-01-10',
...
'2019-10-31', '2019-11-01', '2019-11-02', '2019-11-03',
'2019-11-04', '2019-11-05', '2019-11-06', '2019-11-07',
'2019-11-08', '2019-11-09'],
dtype='datetime64[ns]', length=313, freq='D')
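The three ways of calling date_range can be compared side by side. This is a minimal sketch; the variable names are illustrative.

```python
import pandas as pd

# Start date plus a count: the range runs forward from the start.
from_start = pd.date_range(start='2019-01-01', periods=5)
# End date plus a count: the range runs backward from the end.
to_end = pd.date_range(end='2019-11-09', periods=5)
# Both endpoints: no periods needed, both dates are included.
both = pd.date_range(start='2019-01-01', end='2019-11-09')

print(from_start[-1])   # 2019-01-05 00:00:00
print(to_end[0])        # 2019-11-05 00:00:00
print(len(both))        # 313
```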
4. The freq parameter of date_range: sets the interval between consecutive timestamps
index = pd.date_range('2019-01-01','2019-11-09',freq = '4h')
print(index)
DatetimeIndex(['2019-01-01 00:00:00', '2019-01-01 04:00:00',
'2019-01-01 08:00:00', '2019-01-01 12:00:00',
'2019-01-01 16:00:00', '2019-01-01 20:00:00',
'2019-01-02 00:00:00', '2019-01-02 04:00:00',
'2019-01-02 08:00:00', '2019-01-02 12:00:00',
...
'2019-11-07 12:00:00', '2019-11-07 16:00:00',
'2019-11-07 20:00:00', '2019-11-08 00:00:00',
'2019-11-08 04:00:00', '2019-11-08 08:00:00',
'2019-11-08 12:00:00', '2019-11-08 16:00:00',
'2019-11-08 20:00:00', '2019-11-09 00:00:00'],
dtype='datetime64[ns]', length=1873, freq='4H')
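freq accepts other aliases besides multiples of hours. The sketch below tries a few common ones ('B' for business days and 'W-MON' for weekly on Mondays); the start date and counts are illustrative choices, not values from the notes above.

```python
import pandas as pd

every_4h = pd.date_range('2019-01-01', periods=4, freq='4h')    # every 4 hours
business = pd.date_range('2019-01-01', periods=5, freq='B')     # weekdays only
mondays = pd.date_range('2019-01-01', periods=3, freq='W-MON')  # each Monday

print(every_4h[-1])   # 2019-01-01 12:00:00
print(business[-1])   # 2019-01-07 00:00:00 (the weekend is skipped)
print(mondays[0])     # 2019-01-07 00:00:00 (first Monday on or after the start)
```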