Pytorch框架：DTW一维数组实现

最新推荐文章于 2024-09-23 17:13:34 发布

转载最新推荐文章于 2024-09-23 17:13:34 发布 · 1.8k 阅读

7 ·

CC 4.0 BY-SA版权

原文链接：https://blog.youkuaiyun.com/yongjiankuang/article/details/102466615

文章标签：

#算法 #python #深度学习

调试记录专栏收录该内容

2 篇文章

订阅专栏

本文详细介绍了如何使用Pytorch库在Python中实现动态时间规整(DTW)算法，针对一维数组，通过一步步构建原始距离矩阵、动态计算最短距离并找到匹配路径，最终得到序列间的距离。

部署运行你感兴趣的模型镜像

Pytorch框架：DTW一维数组实现

1 背景

Dynamic Time Warping,动态时间规整算法
针对欧氏距离的计算问题，通过动态时间规整（Dynamic Time Warping，DTW）来评价时间序列的相似性。动态时间规整是一种把距离计算与时间规整相联系的非线性规整方法，时间规整是一个把时间序列拉长或压缩至一定长度的过程，其主要思路就是把一个复杂的全局最优问题变成若干个局部最优问题，逐步确定最优路径。在动态时间规整中，两个时间序列相同时间点上的值并不一定都是对应的，动态规整使两个时间序列中相似的点对应起来，例如最大值对应最大值，最小值对应最小值，位于下降过程中的点不能对应上升过程中的点。动态时间规整避免了数据不一致带来的误差，使得到的距离值更加合理。

2 步骤分析

2.1 导入库

需要用到array, zeros, argmin, inf。argmin()函数返回最小值的索引位置，链接: link；inf为无穷，链接: link。

from numpy import array, zeros, argmin, inf

2.2 创建一维数组

定义一维数组s1,s2，长度len_s1, len_s2；定义矩阵D0，D1，其中D1为D0矩阵去除第一行和第一列后剩下元素组成的矩阵。

s1 = [1, 2, 3, 4, 5, 5, 5, 4]
s2 = [3, 4, 5, 5, 5, 4]
len_s1, len_s2 = len(s1), len(s2)
D0 = zeros((len_s1+1,len_s2+1))
D0[0,1:] = inf
D0[1:,0] = inf
D1 = D0[1:,1:]
print('D0_before:\n',D0)
print('D1_before:\n',D1)

调试结果

D0_before:
 [[ 0. inf inf inf inf inf inf]
 [inf  0.  0.  0.  0.  0.  0.]
 [inf  0.  0.  0.  0.  0.  0.]
 [inf  0.  0.  0.  0.  0.  0.]
 [inf  0.  0.  0.  0.  0.  0.]
 [inf  0.  0.  0.  0.  0.  0.]
 [inf  0.  0.  0.  0.  0.  0.]
 [inf  0.  0.  0.  0.  0.  0.]
 [inf  0.  0.  0.  0.  0.  0.]]
D1_before:
 [[0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0.]
 [0. 0. 0. 0. 0. 0.]]

2.3 生成原始距离矩阵

原始距离矩阵可以手算，比较好理解。

for i in range(len_s1):
   for j in range(len_s2):
       D1[i,j] = abs(s1[i]-s2[j])
print('D0_after1:\n',D0)
print('D1_after1:\n',D1)

调试结果

D0_after1:
[[ 0. inf inf inf inf inf inf]
[inf  2.  3.  4.  4.  4.  3.]
[inf  1.  2.  3.  3.  3.  2.]
[inf  0.  1.  2.  2.  2.  1.]
[inf  1.  0.  1.  1.  1.  0.]
[inf  2.  1.  0.  0.  0.  1.]
[inf  2.  1.  0.  0.  0.  1.]
[inf  2.  1.  0.  0.  0.  1.]
[inf  1.  0.  1.  1.  1.  0.]]
D1_after1:
[[2. 3. 4. 4. 4. 3.]
[1. 2. 3. 3. 3. 2.]
[0. 1. 2. 2. 2. 1.]
[1. 0. 1. 1. 1. 0.]
[2. 1. 0. 0. 0. 1.]
[2. 1. 0. 0. 0. 1.]
[2. 1. 0. 0. 0. 1.]
[1. 0. 1. 1. 1. 0.]]

2.4 动态计算最短距离

矩阵M为D1的复制，注意在计算的时候，D1和D0是同时动态计算，可以调试自己查看这个计算过程。

M = D1.copy()
for i in range(len_s1):          #动态计算最短距离
   for j in range(len_s2):
       D1[i,j] += min(D0[i,j],D0[i,j+1],D0[i+1,j])
print('D0_after2:\n',D0)
print('D1_after2:\n',D1)

调试结果

D0_after2:
 [[ 0. inf inf inf inf inf inf]
 [inf  2.  5.  9. 13. 17. 20.]
 [inf  3.  4.  7. 10. 13. 15.]
 [inf  3.  4.  6.  8. 10. 11.]
 [inf  4.  3.  4.  5.  6.  6.]
 [inf  6.  4.  3.  3.  3.  4.]
 [inf  8.  5.  3.  3.  3.  4.]
 [inf 10.  6.  3.  3.  3.  4.]
 [inf 11.  6.  4.  4.  4.  3.]]
D1_after2:
 [[ 2.  5.  9. 13. 17. 20.]
 [ 3.  4.  7. 10. 13. 15.]
 [ 3.  4.  6.  8. 10. 11.]
 [ 4.  3.  4.  5.  6.  6.]
 [ 6.  4.  3.  3.  3.  4.]
 [ 8.  5.  3.  3.  3.  4.]
 [10.  6.  3.  3.  3.  4.]
 [11.  6.  4.  4.  4.  3.]]

2.5 定义路径

i和j取D0的行和列数-2，所以i为7行，j为5列。

i,j = array(D0.shape) - 2
print('i:',i,'\t','j:',j)
p,q = [i],[j]

调试结果

i: 7     j: 5

2.6 回溯寻找最短路径

python版本为python3，所以zip()函数的输出方式做了调整，若直接print((zip(p,q))，则会直接输出zip object at 0x000001F5D1731840，不方便对代码进行理解。

while(i>0 or j>0):          #回溯寻找最短路径
    tb = argmin((D0[i,j],D0[i,j+1],D0[i+1,j]))  #最小值的索引，argmin返回0，1，2
    if tb==0 :
        i-=1
        j-=1
    elif tb==1 :
        i-=1
    else:
        j-=1
    p.insert(0,i)
    q.insert(0,j)

print ('原始距离矩阵 M:\n',M)

print ('匹配路径过程 (p,q):')
for each in zip(p,q):
    print(each)

print ('累积距离矩阵 D1:\n',D1)
print ('序列距离 D1[-1,-1]:\n',D1[-1,-1])   #D1矩阵最后行最后列元素

调试结果

原始距离矩阵 M:
 [[2. 3. 4. 4. 4. 3.]
 [1. 2. 3. 3. 3. 2.]
 [0. 1. 2. 2. 2. 1.]
 [1. 0. 1. 1. 1. 0.]
 [2. 1. 0. 0. 0. 1.]
 [2. 1. 0. 0. 0. 1.]
 [2. 1. 0. 0. 0. 1.]
 [1. 0. 1. 1. 1. 0.]]
匹配路径过程 (p,q):
(0, 0)
(1, 0)
(2, 0)
(3, 1)
(4, 2)
(5, 3)
(6, 4)
(7, 5)
累积距离矩阵 D1:
 [[ 2.  5.  9. 13. 17. 20.]
 [ 3.  4.  7. 10. 13. 15.]
 [ 3.  4.  6.  8. 10. 11.]
 [ 4.  3.  4.  5.  6.  6.]
 [ 6.  4.  3.  3.  3.  4.]
 [ 8.  5.  3.  3.  3.  4.]
 [10.  6.  3.  3.  3.  4.]
 [11.  6.  4.  4.  4.  3.]]
序列距离 D1[-1,-1]:
 3.0

3 完整代码

from numpy import array, zeros, argmin, inf

s1 = [1, 2, 3, 4, 5, 5, 5, 4]
s2 = [3, 4, 5, 5, 5, 4]
 
len_s1, len_s2 = len(s1), len(s2)
D0 = zeros((len_s1+1,len_s2+1))
D0[0,1:] = inf              #inf表示无穷
D0[1:,0] = inf
D1 = D0[1:,1:]              #去除第一行和第一列剩下的矩阵
# print('D0_before:\n',D0)
# print('D1_before:\n',D1)

for i in range(len_s1):     #生成原始距离矩阵
    for j in range(len_s2):
        D1[i,j] = abs(s1[i]-s2[j])
# print('D0_after1:\n',D0)
# print('D1_after1:\n',D1)

M = D1.copy()
for i in range(len_s1):     #动态计算最短距离
    for j in range(len_s2):
        D1[i,j] += min(D0[i,j],D0[i,j+1],D0[i+1,j])
        #D1和D0同时动态计算
# print('D0_after2:\n',D0)
# print('D1_after2:\n',D1)

i,j = array(D0.shape) - 2
# print('i:',i,'\t','j:',j)

p,q = [i],[j]
while(i>0 or j>0):          #回溯寻找最短路径
    tb = argmin((D0[i,j],D0[i,j+1],D0[i+1,j]))
    #最小值的索引，argmin返回0，1，2
    if tb==0 :
        i-=1
        j-=1
    elif tb==1 :
        i-=1
    else:
        j-=1
    p.insert(0,i)
    q.insert(0,j)

print ('原始距离矩阵 M:\n',M)
print ('匹配路径过程 (p,q):')
for each in zip(p,q):
    print(each)
print ('累积距离矩阵 D1:\n',D1)
print ('序列距离 D1[-1,-1]:\n',D1[-1,-1])
#D1矩阵最后行最后列元素

运行结果

始距离矩阵 M:
 [[2. 3. 4. 4. 4. 3.]
 [1. 2. 3. 3. 3. 2.]
 [0. 1. 2. 2. 2. 1.]
 [1. 0. 1. 1. 1. 0.]
 [2. 1. 0. 0. 0. 1.]
 [2. 1. 0. 0. 0. 1.]
 [2. 1. 0. 0. 0. 1.]
 [1. 0. 1. 1. 1. 0.]]
匹配路径过程 (p,q):
(0, 0)
(1, 0)
(2, 0)
(3, 1)
(4, 2)
(5, 3)
(6, 4)
(7, 5)
累积距离矩阵 D1:
 [[ 2.  5.  9. 13. 17. 20.]
 [ 3.  4.  7. 10. 13. 15.]
 [ 3.  4.  6.  8. 10. 11.]
 [ 4.  3.  4.  5.  6.  6.]
 [ 6.  4.  3.  3.  3.  4.]
 [ 8.  5.  3.  3.  3.  4.]
 [10.  6.  3.  3.  3.  4.]
 [11.  6.  4.  4.  4.  3.]]
序列距离 D1[-1,-1]:
 3.0

4 运行环境

语言环境：python 3.6.10 64-bit，VScode，pytorch
计算机：Windows10

参考文献

代码部分摘自优快云博主奔跑熊，原博客链接：https://blog.youkuaiyun.com/yongjiankuang/article/details/102466615
博客园博主tornadomeet，原博客链接：https://www.cnblogs.com/tornadomeet/archive/2012/03/23/2413363.html
李正欣,张凤鸣,李克武,等.一种支持DTW距离的多元时间序列索引结构[J].软件学报,2014,25(03):560-575.
Petitjean F O , Ganarski P . Summarizing a set of time series by averaging[J]. Theoretical Computer ence, 2012.
Petitjean F O , Ganarski P . Dynamic Time Warping Averaging of Time Series Allows Faster and More Accurate Classification[J].(10.1109/ICDM.2014.27)
动态时间规划算法简介：https://zhuanlan.zhihu.com/p/43247215
动态时间规划算法简介2：https://blog.youkuaiyun.com/mystery_guest/article/details/82588286
一篇非常好的基于深度学习时间序列分类的综述：https://blog.youkuaiyun.com/qq_32796253/article/details/88538231
DTW论文解读（Searching and Mining Trillions of Time Series Subsequences under Dynamic Time Warping），链接：https://zhuanlan.zhihu.com/p/87630320

您可能感兴趣的与本文相关的镜像

PyTorch 2.5

PyTorch

Cuda

PyTorch 是一个开源的 Python 机器学习库，基于 Torch 库，底层由 C++ 实现，应用于人工智能领域，如计算机视觉和自然语言处理