1. State equation
Let $f(\mathrm{word1}[:i], \mathrm{word2}[:j])$ be the minimum number of insert, delete, and replace operations needed to turn $\mathrm{word1}[:i]$ into $\mathrm{word2}[:j]$. With $n$ and $m$ the lengths of word1 and word2, the original problem becomes:
$$\text{compute } f(\mathrm{word1}[:n], \mathrm{word2}[:m])$$
Then, writing $S[i][j]$ for $f(\mathrm{word1}[:i], \mathrm{word2}[:j])$:
$$S[i][j] = \min\Big(S[i][j-1]+1,\ S[i-1][j]+1,\ S[i-1][j-1] + (\mathrm{word1}[i] == \mathrm{word2}[j]\ ?\ 0 : 1)\Big)$$
Here $\mathrm{word1}[i]$ is the $i$-th character counted from 1, that is, the last character of $\mathrm{word1}[:i]$. The base cases are $S[i][0] = i$ (delete all $i$ characters) and $S[0][j] = j$ (insert all $j$ characters).
- $S[i][j]$ is the minimum number of operations needed to convert $\mathrm{word1}[:i]$ into $\mathrm{word2}[:j]$;
- Insert: $\mathrm{word1}[:i]$ takes $S[i][j-1]$ operations to become $\mathrm{word2}[:j-1]$, so one extra insertion (of $\mathrm{word2}[j]$) turns it into $\mathrm{word2}[:j]$, for a total of $S[i][j-1]+1$;
- Delete: $\mathrm{word1}[:i-1]$ can be converted into $\mathrm{word2}[:j]$ in $S[i-1][j]$ operations, so $\mathrm{word1}[i]$ is simply deleted, for a total of $S[i-1][j]+1$;
- Replace: when $\mathrm{word1}[i] \ne \mathrm{word2}[j]$, convert $\mathrm{word1}[:i-1]$ into $\mathrm{word2}[:j-1]$ and replace $\mathrm{word1}[i]$ with $\mathrm{word2}[j]$, for a total of $S[i-1][j-1]+1$;
- When $\mathrm{word1}[i] == \mathrm{word2}[j]$, the last character needs no operation, so the total is $S[i-1][j-1]$.
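The recurrence above can also be read directly as a recursive function. The following is a minimal sketch, not the author's code: the names `editDistanceMemo` and `editDistance` are hypothetical, and it assumes `i` and `j` are prefix lengths, so the i-th character of the text is `word1[i - 1]` in 0-indexed C++.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): evaluates S[i][j] top-down.
// i and j are prefix lengths; memo[i][j] == -1 means "not computed yet".
int editDistanceMemo(const std::string& word1, const std::string& word2,
                     std::size_t i, std::size_t j,
                     std::vector<std::vector<int>>& memo) {
    if (i == 0) return static_cast<int>(j);  // base case: insert all j characters
    if (j == 0) return static_cast<int>(i);  // base case: delete all i characters
    int& cached = memo[i][j];
    if (cached != -1) return cached;

    int ins = editDistanceMemo(word1, word2, i, j - 1, memo) + 1;   // S[i][j-1] + 1
    int del = editDistanceMemo(word1, word2, i - 1, j, memo) + 1;   // S[i-1][j] + 1
    int rep = editDistanceMemo(word1, word2, i - 1, j - 1, memo)    // S[i-1][j-1] + cost
              + (word1[i - 1] == word2[j - 1] ? 0 : 1);
    cached = std::min({ins, del, rep});
    return cached;
}

// Hypothetical entry point: computes S[n][m] for the full strings.
int editDistance(const std::string& word1, const std::string& word2) {
    std::vector<std::vector<int>> memo(
        word1.size() + 1, std::vector<int>(word2.size() + 1, -1));
    return editDistanceMemo(word1, word2, word1.size(), word2.size(), memo);
}
```

For example, `editDistance("horse", "ros")` evaluates to 3 (replace `h` with `r`, delete `r`, delete `e`). The recursion depth grows with the string lengths, so the bottom-up table is usually the safer choice for long inputs.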
2. Code
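#include <algorithm>
#include <string>
#include <vector>
using namespace std;

class Solution {
public:
    int minDistance(string word1, string word2) {
        const int n = static_cast<int>(word1.size());
        const int m = static_cast<int>(word2.size());
        // s[a][b]: edit distance from word1[:b] to word2[:a].
        // Note: rows index word2 and columns index word1, i.e. the transpose of S in section 1.
        vector<vector<int> > s(m + 1, vector<int>(n + 1));
        int i, j, ins, del, rep, a, b;
        for (i = 0; i < m + 1; ++i) {
            s[i][0] = i; // i characters of word2, none of word1: i insertions
        }
        for (j = 0; j < n + 1; ++j) {
            s[0][j] = j; // no characters of word2, j of word1: j deletions
        }
        for (a = 1; a < m + 1; ++a) {
            for (b = 1; b < n + 1; ++b) {
                i = a - 1; // position of the current character in word2
                j = b - 1; // position of the current character in word1
                ins = s[a-1][b] + 1;   // reach word2[:a-1], then insert word2[i]
                del = s[a][b-1] + 1;   // delete word1[j], then reach word2[:a]
                rep = s[a-1][b-1] + (word2[i] == word1[j] ? 0 : 1); // replace only if the characters differ
                s[a][b] = min(min(ins, del), rep);
            }
        }
        return s[m][n];
    }
};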
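Each cell of the table depends only on the current row and the one before it, so the full matrix is not strictly needed when only the final distance is wanted. The following is a sketch of that variation, which is not part of the original solution: the function name `editDistanceTwoRows` is hypothetical, and here the rows follow word1 and the columns follow word2, as in section 1.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical variant (not from the original post): keeps two rows instead of
// the full (n+1) x (m+1) table. prev holds S[i-1][*], curr holds S[i][*].
int editDistanceTwoRows(const std::string& word1, const std::string& word2) {
    const std::size_t n = word1.size();
    const std::size_t m = word2.size();
    std::vector<int> prev(m + 1), curr(m + 1);

    // Row 0: converting the empty prefix of word1 takes j insertions.
    for (std::size_t j = 0; j <= m; ++j) prev[j] = static_cast<int>(j);

    for (std::size_t i = 1; i <= n; ++i) {
        curr[0] = static_cast<int>(i);  // i deletions to reach the empty prefix of word2
        for (std::size_t j = 1; j <= m; ++j) {
            int ins = curr[j - 1] + 1;  // S[i][j-1] + 1
            int del = prev[j] + 1;      // S[i-1][j] + 1
            int rep = prev[j - 1] + (word1[i - 1] == word2[j - 1] ? 0 : 1);
            curr[j] = std::min({ins, del, rep});
        }
        std::swap(prev, curr);  // the row just filled becomes the previous row
    }
    return prev[m];  // after the final swap, prev holds row n
}
```

This keeps the O(n·m) running time but reduces memory to O(m). The trade-off is that the sequence of edits can no longer be traced back from the table.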
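This article takes a close look at the edit distance algorithm, a way of measuring how similar two strings are. Using dynamic programming, it derives the state equation and explains how to convert one string into another with the fewest insert, delete, and replace operations. The C++ code example shows how to build the state matrix and compute the minimum edit distance.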