Distinct Subsequences
Given a string S and a string T, count the number of distinct subsequences of S that are equal to T.
A subsequence of a string is a new string formed from the original string by deleting some (possibly none) of the characters without disturbing the relative positions of the remaining characters (e.g., "ACE" is a subsequence of "ABCDE" while "AEC" is not).
Here is an example:
S = "rabbbit", T = "rabbit"
Return 3.
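public class Solution {
    public int numDistinct(String S, String T) {
        int sl = S.length();
        int tl = T.length();
        // dp[t] = number of ways to form the first t characters of T
        // from the characters of S processed so far
        int[] dp = new int[tl + 1];
        dp[0] = 1; // the empty prefix of T can be formed in exactly one way
        for (int s = 1; s <= sl; s++) {
            // t runs downwards so that dp[t-1] still holds the value
            // from before S.charAt(s-1) was processed
            for (int t = tl; t >= 1; t--) {
                if (S.charAt(s - 1) == T.charAt(t - 1))
                    dp[t] += dp[t - 1];
            }
        }
        return dp[tl];
    }
}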
DP algorithm analysis:
First of all, a clarification of the problem. The problem statement can be rephrased as:
Given two sequences S and T, in how many distinct ways can a subsequence identical to T be formed from S?
e.g.
S = "rabbbit", T = "rabbit"
The number is 3, and the formations are as follows:
S1 = "ra_bbit", S2 = "rab_bit", S3 = "rabb_it"
"_" marks the removed character.
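As is typical for a dynamic programming algorithm, we construct a matrix dp, where each cell dp[i][j] represents the number of ways of aligning the substring T[0..i] with S[0..j]. Here T[0..i] denotes the first i characters of T (so T[0..0] is the empty string) and T[i] is the last of them; the same convention applies to S.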
Rule 1). dp[0][j] = 1, since aligning T = "" with any substring of S has only ONE solution, which is to delete all characters in S. Conversely, dp[i][0] = 0 for i > 0, since a non-empty T cannot be formed from an empty S.
Rule 2). When i > 0, dp[i][j] is derived from two cases:
case 1). If T[i] != S[j], the only option is to ignore the character S[j] and align substring T[0..i] with S[0..(j-1)]. Therefore, dp[i][j] = dp[i][j-1].
case 2). If T[i] == S[j], we can still adopt the solution in case 1), but we can also match the characters T[i] and S[j] and align the rest of them (i.e. T[0..(i-1)] and S[0..(j-1)]). As a result, dp[i][j] = dp[i][j-1] + dp[i-1][j-1].
e.g. T = "B", S = "ABC"
dp[1][2] = 1: aligning T' = "B" with S' = "AB" has only one solution, which is to remove the character A from S'.
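The two rules can be read almost literally as a recursive function. The sketch below uses memoized recursion rather than a table filled in order, purely to mirror the rules one to one; the class name RuleRecursion and the helper ways are hypothetical names, and the recursion depth grows with the lengths of S and T, so this is an illustration for short inputs rather than a tuned solution.

```java
// Hypothetical illustration of Rule 1 and Rule 2 as memoized recursion.
class RuleRecursion {
    static int count(String s, String t) {
        // memo[i][j] caches ways(i, j); null means "not computed yet".
        Integer[][] memo = new Integer[t.length() + 1][s.length() + 1];
        return ways(s, t, t.length(), s.length(), memo);
    }

    // Number of ways to align the first i characters of t
    // with the first j characters of s.
    private static int ways(String s, String t, int i, int j, Integer[][] memo) {
        if (i == 0) {
            return 1; // Rule 1: the empty t aligns in exactly one way
        }
        if (j == 0) {
            return 0; // a non-empty t cannot be formed from an empty s
        }
        if (memo[i][j] != null) {
            return memo[i][j];
        }
        // Case 1: ignore the j-th character of s.
        int result = ways(s, t, i, j - 1, memo);
        // Case 2: additionally match the i-th character of t with the j-th of s.
        if (t.charAt(i - 1) == s.charAt(j - 1)) {
            result += ways(s, t, i - 1, j - 1, memo);
        }
        memo[i][j] = result;
        return result;
    }
}
```

For the small example above, RuleRecursion.count("AB", "B") corresponds to dp[1][2] and evaluates to 1.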
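Filling the table row by row gives the following implementation:
public int numDistinct(String S, String T) {
    int sl = S.length();
    int tl = T.length();
    // dp[t][s] = number of ways to align the first t characters of T
    // with the first s characters of S
    int[][] dp = new int[tl + 1][sl + 1];
    // Rule 1: the empty T aligns with every prefix of S in exactly one way
    for (int i = 0; i <= sl; ++i) {
        dp[0][i] = 1;
    }
    // dp[t][0] stays 0 for t >= 1 (Java zero-initializes int arrays)
    for (int t = 1; t <= tl; ++t) {
        for (int s = 1; s <= sl; ++s) {
            if (T.charAt(t - 1) != S.charAt(s - 1)) {
                dp[t][s] = dp[t][s - 1];                    // case 1
            } else {
                dp[t][s] = dp[t][s - 1] + dp[t - 1][s - 1]; // case 2
            }
        }
    }
    return dp[tl][sl];
}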
As one can observe from the algorithm, the inner loop refers only to values in the current row and the previous row of the dp matrix. Therefore, we can reduce the space from O(tl * sl) to O(sl) by keeping only two arrays instead of the entire matrix. The optimized algorithm is as follows. The running time is then reduced from 448 ms to 424 ms.
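public int numDistinct_sdp(String S, String T) {
    int sl = S.length();
    int tl = T.length();
    int[] preComb = new int[sl + 1]; // previous row of the dp matrix
    int[] comb = new int[sl + 1];    // row currently being filled
    // Row 0: the empty T aligns with every prefix of S in one way
    for (int i = 0; i <= sl; i++)
        preComb[i] = 1;
    for (int t = 1; t <= tl; ++t) {
        // comb[0] is always 0: a non-empty T cannot be formed from an empty S
        for (int s = 1; s <= sl; ++s) {
            if (T.charAt(t - 1) != S.charAt(s - 1)) {
                comb[s] = comb[s - 1];
            } else {
                comb[s] = comb[s - 1] + preComb[s - 1];
            }
        }
        // The finished row becomes the previous row for the next t
        for (int i = 0; i <= sl; ++i) {
            preComb[i] = comb[i];
        }
    }
    return preComb[sl];
}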
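Going one step further, the two arrays can be collapsed into a single array indexed by T, which is what the solution at the top of this post does. The sketch below (the class SingleRowDp and both method names are hypothetical) shows why the inner loop then has to run from high to low: dp[k-1] must still hold the count from before the current character of S was processed. Running the loop upward lets one character of S be matched to more than one position of T, which the second, deliberately wrong method demonstrates.

```java
// Hypothetical illustration of the single-array variant and its loop direction.
class SingleRowDp {
    // Correct: k runs from high to low, so dp[k - 1] is still the value
    // from before s.charAt(i) was processed.
    static int countBackward(String s, String t) {
        int[] dp = new int[t.length() + 1];
        dp[0] = 1; // the empty prefix of t can be formed in one way
        for (int i = 0; i < s.length(); i++) {
            for (int k = t.length(); k >= 1; k--) {
                if (s.charAt(i) == t.charAt(k - 1)) {
                    dp[k] += dp[k - 1];
                }
            }
        }
        return dp[t.length()];
    }

    // Deliberately wrong: k runs from low to high, so dp[k - 1] may already
    // include the current character of s, and that character gets reused.
    static int countForwardBuggy(String s, String t) {
        int[] dp = new int[t.length() + 1];
        dp[0] = 1;
        for (int i = 0; i < s.length(); i++) {
            for (int k = 1; k <= t.length(); k++) {
                if (s.charAt(i) == t.charAt(k - 1)) {
                    dp[k] += dp[k - 1];
                }
            }
        }
        return dp[t.length()];
    }
}
```

With S = "b" and T = "bb", countBackward returns 0, as it should, while countForwardBuggy returns 1 because the single "b" in S is counted for both positions of T.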