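1. Subtracting two DataFrame columns directly
data['L'] = data['LOAD_TIME']-data['FFP_DATE']
When the two columns hold dates stored as strings, subtracting them directly in pandas raises the following error:
TypeError: unsupported operand type(s) for -: 'str' and 'str'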
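To see where the error comes from, it helps to check the column dtypes first. The snippet below is a minimal sketch: the two-row DataFrame is made-up sample data (only the column names come from this article), and `dtype=object` is an assumption used to force the dates to be stored as plain Python strings.

```python
import pandas as pd

# Hypothetical sample data; only the column names are taken from the article.
# dtype=object keeps the dates as plain Python strings.
data = pd.DataFrame(
    {
        "FFP_DATE": ["2006-11-02", "2007-02-19"],
        "LOAD_TIME": ["2014-03-31", "2014-03-31"],
    },
    dtype=object,
)

print(data.dtypes)  # both columns hold strings, not datetimes

try:
    data["LOAD_TIME"] - data["FFP_DATE"]
    subtraction_failed = False
except TypeError as exc:
    # str - str is not defined, so the element-wise subtraction fails
    subtraction_failed = True
    print("TypeError:", exc)
```

If `data.dtypes` reports a datetime dtype instead, the columns were already parsed and the subtraction works as is.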
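2. Using the to_datetime function
Date strings cannot be subtracted directly; pandas provides a dedicated function, to_datetime, to parse them first:
data['L'] = pd.to_datetime(data['LOAD_TIME'])-pd.to_datetime(data['FFP_DATE'])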
'''
Output:
               L    R    F       M         C
0      2706 days    1  210  580717  0.961639
1      2597 days    7  140  293678  1.252314
2      2615 days   11  135  283712  1.254676
3      2047 days   97   23  281336  1.090870
4      1816 days    5  152  309928  0.970658
...          ...  ...  ...     ...       ...
62974  3249 days   89    2     368  0.710000
62975  1961 days  121    2     368  0.670000
62976  1362 days   39    2    1062  0.225000
62977   466 days  464    2     904  0.250000
62978  1082 days  282    2     760  0.280000
[62044 rows x 5 columns]
--finish--
'''
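Real data often contains malformed dates, so the parsing step may need to be more defensive. The following is a sketch under assumptions: the sample rows and the `%Y/%m/%d` format are illustrative and not taken from the original dataset. `errors="coerce"` turns unparseable values into `NaT` instead of raising.

```python
import pandas as pd

# Hypothetical sample data, including one malformed date.
data = pd.DataFrame({
    "FFP_DATE": ["2006/11/02", "2007/02/19", "not a date"],
    "LOAD_TIME": ["2014/03/31", "2014/03/31", "2014/03/31"],
})

# An explicit format avoids ambiguous parsing; errors="coerce" yields NaT
# for values that do not match it.
ffp = pd.to_datetime(data["FFP_DATE"], format="%Y/%m/%d", errors="coerce")
load = pd.to_datetime(data["LOAD_TIME"], format="%Y/%m/%d", errors="coerce")

data["L"] = load - ffp
print(data["L"])
print(data["L"].dtype)  # a timedelta64 dtype
```

Rows where either date failed to parse end up with `NaT` in `L`, so they can be dropped or inspected afterwards with `data['L'].isna()`.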
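3. Removing "days" from the result: converting to months, days, hours, etc.
Written as above, the two columns can indeed be subtracted, but the result is a timedelta that prints with "days". To get a plain number, divide by a NumPy timedelta64 of the desired unit:
'''Precision in months'''
data['L'] = data['L'].map(lambda x: x/np.timedelta64(1,'M'))
'''
# Alternatively, divide by 30 days; the result differs slightly, because a month is not exactly 30 days
data['L'] = data['L'].map(lambda x: x/np.timedelta64(30,'D'))
'''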
'''Output:
                L    R    F       M         C
0       88.905316    1  210  580717  0.961639
1       85.324134    7  140  293678  1.252314
2       85.915522   11  135  283712  1.254676
3       67.253948   97   23  281336  1.090870
4       59.664469    5  152  309928  0.970658
...           ...  ...  ...     ...       ...
62974  106.745518   89    2     368  0.710000
62975   64.428428  121    2     368  0.670000
62976   44.748352   39    2    1062  0.225000
62977   15.310376  464    2     904  0.250000
62978   35.548985  282    2     760  0.280000
[62044 rows x 5 columns]
'''
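The month figures above are consistent with an average month of 365.2425 / 12 = 30.436875 days: 2706 days / 30.436875 ≈ 88.905316. Because a month has no fixed length, the 'M' unit is ambiguous, and newer pandas versions may refuse to divide by `np.timedelta64(1,'M')`. A minimal sketch that makes the assumption explicit is shown here; the constant name `AVG_DAYS_PER_MONTH` and the sample values are illustrative.

```python
import pandas as pd

# Assumption: one "month" means the mean Gregorian month, 365.2425 / 12 days.
AVG_DAYS_PER_MONTH = 365.2425 / 12  # 30.436875

# Sample durations (the day counts of the first two rows shown above).
delta = pd.Series(pd.to_timedelta([2706, 2597], unit="D"))

days = delta / pd.Timedelta(days=1)      # float number of days
months_avg = days / AVG_DAYS_PER_MONTH   # average-month convention
months_30 = days / 30                    # fixed 30-day convention

print(months_avg.round(6).tolist())
print(months_30.round(6).tolist())
```

The two conventions differ by more than one month over a span of roughly seven years, so it is worth stating which one an analysis uses.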
'''Precision in days'''
data['L'] = data['L'].map(lambda x: x/np.timedelta64(1,'D'))
'''
To output hours instead:
data['L'] = data['L'].map(lambda x: x/np.timedelta64(1,'h'))
Output in days:
            L    R    F       M         C
0      2706.0    1  210  580717  0.961639
1      2597.0    7  140  293678  1.252314
2      2615.0   11  135  283712  1.254676
3      2047.0   97   23  281336  1.090870
4      1816.0    5  152  309928  0.970658
...       ...  ...  ...     ...       ...
62974  3249.0   89    2     368  0.710000
62975  1961.0  121    2     368  0.670000
62976  1362.0   39    2    1062  0.225000
62977   466.0  464    2     904  0.250000
62978  1082.0  282    2     760  0.280000
[62044 rows x 5 columns]
'''
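The `map` with a lambda works, but the same conversions can also be done on the whole column at once. The sketch below uses the pandas `.dt` accessor and division by a `pd.Timedelta`; the sample Series is made up for illustration.

```python
import pandas as pd

# Sample timedelta column (stand-in for data['L'] after the subtraction).
delta = pd.Series(pd.to_timedelta([2706, 466], unit="D"))

days_int = delta.dt.days                      # whole days as integers
days_float = delta / pd.Timedelta(days=1)     # days as floats
hours = delta / pd.Timedelta(hours=1)         # hours as floats
hours_alt = delta.dt.total_seconds() / 3600   # same result via seconds

print(days_int.tolist())
print(hours.tolist())
```

`.dt.days` truncates to whole days, while dividing by a `pd.Timedelta` keeps any fractional part, which matters when the timestamps include a time of day.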
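4. Understanding timedelta64
From the official documentation: Datetimes and Timedeltas work together to provide ways for simple datetime calculations.
A timedelta64 represents a length of time measured in a base unit such as years, months, days, or hours. It can be added to or subtracted from an existing datetime, which is convenient because you do not have to convert the two operands to the same unit yourself. A few examples: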
np.datetime64('2009-01-01') - np.datetime64('2008-01-01')
np.datetime64('2009') + np.timedelta64(20, 'D')
np.datetime64('2011-06-15T00:00') + np.timedelta64(12, 'h')
np.timedelta64(1,'W') / np.timedelta64(1,'D')
'''
Output:
366 days
2009-01-21
2011-06-15T12:00
7.0
'''
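The same arithmetic applies element-wise to NumPy arrays, which is essentially what happens underneath a pandas timedelta column. A small sketch, with sample dates chosen for illustration:

```python
import numpy as np

# Sample start dates and a single end date, both with day precision.
start = np.array(["2006-11-02", "2007-02-19"], dtype="datetime64[D]")
end = np.datetime64("2014-03-31")

delta = end - start                      # timedelta64[D] array
days = delta / np.timedelta64(1, "D")    # plain float array of days
weeks = delta / np.timedelta64(1, "W")   # plain float array of weeks

print(delta)
print(days)
print(weeks)
```

Dividing a timedelta64 by another timedelta64 always yields a unitless float, which is why this is a convenient way to strip the "days" label.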
timedelta64 units and their meanings:
'''
Here are the date units:
Code  Meaning  Time span (relative)  Time span (absolute)
Y     year     +/- 9.2e18 years      [9.2e18 BC, 9.2e18 AD]
M     month    +/- 7.6e17 years      [7.6e17 BC, 7.6e17 AD]
W     week     +/- 1.7e17 years      [1.7e17 BC, 1.7e17 AD]
D     day      +/- 2.5e16 years      [2.5e16 BC, 2.5e16 AD]
And here are the time units:
Code  Meaning      Time span (relative)  Time span (absolute)
h     hour         +/- 1.0e15 years      [1.0e15 BC, 1.0e15 AD]
m     minute       +/- 1.7e13 years      [1.7e13 BC, 1.7e13 AD]
s     second       +/- 2.9e11 years      [2.9e11 BC, 2.9e11 AD]
ms    millisecond  +/- 2.9e8 years       [ 2.9e8 BC, 2.9e8 AD]
us    microsecond  +/- 2.9e5 years       [290301 BC, 294241 AD]
ns    nanosecond   +/- 292 years         [ 1678 AD, 2262 AD]
ps    picosecond   +/- 106 days          [ 1969 AD, 1970 AD]
fs    femtosecond  +/- 2.6 hours         [ 1969 AD, 1970 AD]
as    attosecond   +/- 9.2 seconds       [ 1969 AD, 1970 AD]
'''
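Units from week downward have fixed lengths and convert into one another freely, while years and months do not have a fixed number of days. The sketch below illustrates the difference using only NumPy scalars; the variable names are illustrative.

```python
import numpy as np

# Fixed-length units convert exactly.
week = np.timedelta64(1, "W")
week_in_hours = week.astype("timedelta64[h]")   # 168 hours

# Years convert to months, since a year is always 12 months.
year = np.timedelta64(1, "Y")
year_in_months = np.timedelta64(year, "M")      # 12 months

# Years do not convert to days, because the number of days varies.
try:
    np.timedelta64(year, "D")
    cast_failed = False
except TypeError as exc:
    cast_failed = True
    print("TypeError:", exc)

print(week_in_hours, year_in_months)
```

This is the reason a month-based result needs an explicit convention, such as an average month length or a fixed 30 days.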
For details, see the official documentation: https://docs.scipy.org/doc/numpy/reference/arrays.datetime.html?highlight=timedelta64
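CSDN does not support uploading attachments, and downloading uploaded resources costs points (the amount is fixed by the system and cannot be changed). I will therefore post a link to the source code and data files once they are uploaded to Baidu Cloud; alternatively, they can be downloaded with points here: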
https://download.youkuaiyun.com/download/xiaoleng_o/11983268