[cookbook-pandas] Study Notes: Time Series Analysis

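This post covers time series analysis with the Python pandas library: reading data, converting strings to datetimes, setting the index, and slicing by time. It also shows how to group and aggregate data at different granularities.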

A machine failure wiped out everything I had spent half an afternoon copying... Still, the notes have to be written.

Chapter 7: Time Series Analysis

Understanding the difference between Python and pandas date tools

  • About the error parameter:
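Assuming this item refers to the errors argument of pd.to_datetime (the note itself only says "error"), the sketch below shows how that setting changes what happens to a string that cannot be parsed. The sample strings are made up for illustration.

```python
import pandas as pd

# Made-up sample: one valid timestamp string and one unparseable string.
raw = pd.Series(["2020-12-20 00:01:00", "not a date"])

# errors="coerce" turns unparseable values into NaT instead of raising.
coerced = pd.to_datetime(raw, errors="coerce")
print(coerced.isna().tolist())

# errors="raise" (the default) raises as soon as a value cannot be parsed.
try:
    pd.to_datetime(raw, errors="raise")
    raised = False
except ValueError:
    raised = True
print(raised)
```

Coercing is convenient for dirty data, but it hides bad rows, so it is worth counting the NaT values afterwards.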
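import pandas as pd

# read only the two needed columns; iterator=True returns a reader for chunked access
plate_time = pd.read_csv('eight_attri.csv',
                         usecols=['plateNumber', 'passCarTime'],
                         encoding='utf-8_sig',
                         # dtype={"jncCode": "category", "deviceCode": "category"},
                         iterator=True
                         # , delimiter="\t"
                         )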
df=plate_time.get_chunk(2000)
df.dtypes
Out[7]: 
plateNumber    object
passCarTime    object
dtype: object
df.passCarTime=pd.to_datetime(df.passCarTime)
df.dtypes
Out[9]: 
plateNumber            object
passCarTime    datetime64[ns]
dtype: object
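The read-then-convert steps above can be reproduced without the original file. This is a minimal sketch: the in-memory CSV is a hypothetical stand-in for eight_attri.csv, with made-up plate numbers and an extra column named other to show the effect of usecols.

```python
import io
import pandas as pd

# Hypothetical stand-in for eight_attri.csv: made-up rows, one extra column.
csv_text = (
    "plateNumber,passCarTime,other\n"
    "A0001,2020-12-20 00:00:00,x\n"
    "A0002,2020-12-20 00:01:00,y\n"
    "A0003,2020-12-20 00:05:13,z\n"
)

reader = pd.read_csv(
    io.StringIO(csv_text),
    usecols=["plateNumber", "passCarTime"],  # skip columns that are not needed
    iterator=True,  # return a reader so rows can be pulled in chunks
)
chunk = reader.get_chunk(2)  # first two rows only

# Timestamps arrive as plain strings until they are converted explicitly.
chunk["passCarTime"] = pd.to_datetime(chunk["passCarTime"])
print(chunk.dtypes)
```

read_csv also accepts parse_dates=["passCarTime"], which performs the conversion while reading instead of as a separate step.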
# set the 'passCarTime' column as the index to make intelligent Timestamp slicing possible
df=df.set_index('passCarTime')
df
Out[12]: 
                    plateNumber
passCarTime                    
2020-12-20 00:00:00     鄂A2J8C0
2020-12-20 00:00:00     鄂KX0175
2020-12-20 00:00:00     鄂A3K89F
2020-12-20 00:00:00     鄂H1B196
2020-12-20 00:00:00     鄂H1B196
                         ...
2020-12-20 00:05:12     鄂AV2G25
2020-12-20 00:05:12     鄂AV2G25
2020-12-20 00:05:12     鄂AV2G25
2020-12-20 00:05:13     鄂KX0621
2020-12-20 00:05:13     鄂A39B0Y
[2000 rows x 1 columns]
# select all the rows whose index equals a single value by passing that value to .loc
# crime is not defined in this session (NameError); the DataFrame here is df
df.loc['2020-12-20 00:01:00']
Out[15]: 
                    plateNumber
passCarTime                    
2020-12-20 00:01:00     鄂A289BS
2020-12-20 00:01:00     鄂A289BS
2020-12-20 00:01:00     鄂KX0579
2020-12-20 00:01:00     鄂A754S2
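The same exact-match lookup can be checked on a small frame. A minimal sketch with made-up plate numbers, where two rows share one timestamp:

```python
import pandas as pd

# Made-up plates; the last two rows share the same timestamp.
df = pd.DataFrame(
    {
        "plateNumber": ["A0001", "A0002", "A0003"],
        "passCarTime": pd.to_datetime(
            ["2020-12-20 00:00:59", "2020-12-20 00:01:00", "2020-12-20 00:01:00"]
        ),
    }
).set_index("passCarTime")

# A full-resolution string selects every row whose index equals that timestamp.
exact = df.loc["2020-12-20 00:01:00"]
print(exact)
```

Because the timestamp is duplicated, the result is a DataFrame with both rows. For a timestamp that occurs only once, .loc would typically return that single row as a Series instead.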
# select all the rows that partially match an index value
# e.g. all the records from Dec 20, 2020
df.loc['2020-12-20']
Out[18]: 
                    plateNumber
passCarTime                    
2020-12-20 00:00:00     鄂A2J8C0
2020-12-20 00:00:00     鄂KX0175
2020-12-20 00:00:00     鄂A3K89F
2020-12-20 00:00:00     鄂H1B196
2020-12-20 00:00:00     鄂H1B196
                         ...
2020-12-20 00:05:12     鄂AV2G25
2020-12-20 00:05:12     鄂AV2G25
2020-12-20 00:05:12     鄂AV2G25
2020-12-20 00:05:13     鄂KX0621
2020-12-20 00:05:13     鄂A39B0Y
[2000 rows x 1 columns]
# you can do the same for an entire month
df.loc['2020-12'].shape
Out[20]: (2000, 1)
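In the 2000-row chunk every record falls on the same day, so the day and month selections return identical shapes. The difference is easier to see on a made-up frame that spans several days and two months, as in this sketch:

```python
import pandas as pd

# Made-up rows spanning two months and two days in December.
times = pd.to_datetime(
    [
        "2020-11-30 23:59:59",
        "2020-12-20 00:00:00",
        "2020-12-20 00:05:13",
        "2020-12-21 08:00:00",
    ]
)
df = pd.DataFrame({"plateNumber": ["A0001", "A0002", "A0003", "A0004"]}, index=times)

one_day = df.loc["2020-12-20"]  # every row on that calendar day
one_month = df.loc["2020-12"]   # every row in that month

print(one_day.shape, one_month.shape)
```

The less precise the string, the wider the slice: the day string matches two rows here and the month string matches three.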
# the selection strings may also contain the name of the month 
df.loc['Dec 2020'].sort_index()
Out[22]: 
                    plateNumber
passCarTime                    
2020-12-20 00:00:00     鄂A2J8C0
2020-12-20 00:00:00     鄂KX0175
2020-12-20 00:00:00     鄂A3K89F
2020-12-20 00:00:00     鄂H1B196
2020-12-
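The summary at the top mentions aggregating at different granularities, but the recorded session does not reach that step. The following is only a sketch of one common way to do it on a DatetimeIndex, using resample on made-up data. Whether the original notes used resample or another approach is not shown here.

```python
import pandas as pd

# Made-up passing records within a little over one hour.
times = pd.to_datetime(
    [
        "2020-12-20 00:00:10",
        "2020-12-20 00:00:50",
        "2020-12-20 00:01:30",
        "2020-12-20 01:15:00",
    ]
)
df = pd.DataFrame({"plateNumber": ["A0001", "A0002", "A0001", "A0003"]}, index=times)

# Records per minute; minutes with no records appear with a count of 0.
per_minute = df.resample("1min").size()

# The same data at a coarser granularity: records per 60-minute bin.
per_hour = df.resample("60min").size()

print(per_minute.head(3))
print(per_hour)
```

Changing only the frequency string switches the granularity, which keeps the aggregation code the same across minute, hour, and day views.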