Distinct Subsequences

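This article describes a method for counting the distinct subsequences of a string S that match a target string T, and provides a detailed dynamic programming implementation, including a solution with optimized space complexity.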

Given a string S and a string T, count the number of distinct subsequences of S that are equal to T.

A subsequence of a string is a new string formed from the original string by deleting some (possibly none) of the characters without disturbing the relative positions of the remaining characters (i.e., "ACE" is a subsequence of "ABCDE" while "AEC" is not).
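
The definition above can be checked mechanically with a two-pointer scan. The following is a minimal sketch, not part of the original solution; the class name `SubsequenceCheck` and the method `isSubsequence` are hypothetical names chosen for illustration.

```java
final class SubsequenceCheck {
    // Returns true if t can be obtained from s by deleting zero or more
    // characters of s while keeping the remaining characters in order.
    static boolean isSubsequence(String t, String s) {
        int i = 0; // index of the next character of t that still needs a match
        for (int j = 0; j < s.length() && i < t.length(); j++) {
            if (s.charAt(j) == t.charAt(i)) {
                i++;
            }
        }
        return i == t.length();
    }

    public static void main(String[] args) {
        System.out.println(isSubsequence("ACE", "ABCDE")); // true
        System.out.println(isSubsequence("AEC", "ABCDE")); // false
    }
}
```

This check only answers whether T occurs as a subsequence at all; the problem here asks in how many ways it occurs, which is what the dynamic programming solution counts.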

Here is an example:
S = "rabbbit", T = "rabbit"

Return 3

Java code:

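public class Solution {
    public int numDistinct(String S, String T) {
        int sl = S.length();
        int tl = T.length();

        // dp[t] = number of ways to form the first t characters of T
        // from the characters of S processed so far.
        int[] dp = new int[tl + 1];
        dp[0] = 1; // the empty prefix of T can be formed in exactly one way

        for (int s = 1; s <= sl; s++) {
            // Iterate t downward so that dp[t-1] still holds the value
            // from before the s-th character of S was processed.
            for (int t = tl; t >= 1; t--) {
                if (S.charAt(s - 1) == T.charAt(t - 1)) {
                    dp[t] += dp[t - 1];
                }
            }
        }

        return dp[tl];
    }
}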

DP algorithm analysis:

First of all, a bit of clarification about the problem. The problem statement can be rephrased as:

Given two sequences S and T, in how many distinct ways can characters be deleted from S so that the remaining subsequence is identical to T?

e.g.
    S = "rabbbit", T = "rabbit"

    The number is 3, and the formations are as follows:

    S1 = "ra_bbit", S2 = "rab_bit", S3 = "rabb_it"

    "_" marks the removed character.

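The formations listed above can be reproduced by brute force. The sketch below is an illustration only and is not the dynamic programming solution: it tries, for every character of S, both keeping it (when it matches the next needed character of T) and deleting it, so its running time is exponential in the worst case. The class name `FormationLister` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class FormationLister {
    // Collects every way to pick T out of S; deleted characters of S are shown as '_'.
    static List<String> formations(String s, String t) {
        List<String> out = new ArrayList<>();
        collect(s, t, 0, 0, new StringBuilder(), out);
        return out;
    }

    private static void collect(String s, String t, int si, int ti,
                                StringBuilder picked, List<String> out) {
        if (si == s.length()) {
            // All of S consumed: this is a valid formation only if all of T was matched.
            if (ti == t.length()) {
                out.add(picked.toString());
            }
            return;
        }
        // Option 1: keep s[si] if it is the next character T needs.
        if (ti < t.length() && s.charAt(si) == t.charAt(ti)) {
            picked.append(s.charAt(si));
            collect(s, t, si + 1, ti + 1, picked, out);
            picked.setLength(picked.length() - 1); // undo before trying the other option
        }
        // Option 2: delete s[si].
        picked.append('_');
        collect(s, t, si + 1, ti, picked, out);
        picked.setLength(picked.length() - 1);
    }

    public static void main(String[] args) {
        for (String f : formations("rabbbit", "rabbit")) {
            System.out.println(f);
        }
    }
}
```

For the example input this yields exactly the three formations shown above, which is a convenient way to sanity-check the counts produced by the faster algorithms on small inputs.
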
As is typical for dynamic programming, we construct a matrix dp in which each cell dp[i][j] holds the number of ways to align the first i characters of T with the first j characters of S. Below, T[i] and S[j] denote the i-th and j-th characters, counting from 1.

Rule 1). dp[0][j] = 1, since aligning T = "" with any prefix of S has only ONE solution, which is to delete all characters of that prefix. Conversely, dp[i][0] = 0 for i > 0, because a non-empty T cannot be formed from an empty prefix of S.

Rule 2). When i > 0 and j > 0, dp[i][j] is derived from two cases:

case 1). If T[i] != S[j], the only option is to ignore the character S[j] and align T[1..i] with S[1..(j-1)]. Therefore, dp[i][j] = dp[i][j-1].

case 2). If T[i] == S[j], we can still adopt the solution from case 1), but we can also match the characters T[i] and S[j] and align the rest (i.e. T[1..(i-1)] with S[1..(j-1)]). As a result, dp[i][j] = dp[i][j-1] + dp[i-1][j-1].

e.g. T = "B", S = "ABC"

dp[1][2] = 1: aligning T' = "B" with S' = "AB" has only one solution, which is to remove the character A from S'.
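
The two rules translate almost word for word into a recursive function with memoization. The sketch below is an alternative top-down formulation added for illustration, not the author's implementation; the class name `TopDownCount` and the helper `count` are hypothetical.

```java
import java.util.Arrays;

final class TopDownCount {
    // Number of ways to form the first i characters of t from the first j characters of s.
    static int count(String s, String t, int i, int j, int[][] memo) {
        if (i == 0) {
            return 1; // Rule 1: the empty target is formed in exactly one way
        }
        if (j == 0) {
            return 0; // a non-empty target cannot be formed from an empty source
        }
        if (memo[i][j] != -1) {
            return memo[i][j];
        }
        // Case 1: ignore the j-th character of s.
        int ways = count(s, t, i, j - 1, memo);
        // Case 2: if the characters agree, matching them is an additional option.
        if (t.charAt(i - 1) == s.charAt(j - 1)) {
            ways += count(s, t, i - 1, j - 1, memo);
        }
        memo[i][j] = ways;
        return ways;
    }

    static int numDistinct(String s, String t) {
        int[][] memo = new int[t.length() + 1][s.length() + 1];
        for (int[] row : memo) {
            Arrays.fill(row, -1); // -1 marks "not computed yet"
        }
        return count(s, t, t.length(), s.length(), memo);
    }

    public static void main(String[] args) {
        System.out.println(numDistinct("AB", "B"));           // the dp[1][2] example: 1
        System.out.println(numDistinct("rabbbit", "rabbit")); // 3
    }
}
```

The recursion depth grows with the combined length of S and T, so for long inputs the iterative table is the safer choice.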

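public int numDistinct(String S, String T) {
    int sl = S.length();
    int tl = T.length();

    // dp[t][s] = number of ways to form the first t characters of T
    // from the first s characters of S.
    int[][] dp = new int[tl + 1][sl + 1];

    // Rule 1: the empty T matches any prefix of S in exactly one way.
    for (int i = 0; i <= sl; ++i) {
        dp[0][i] = 1;
    }

    for (int t = 1; t <= tl; ++t) {
        for (int s = 1; s <= sl; ++s) {
            if (T.charAt(t - 1) != S.charAt(s - 1)) {
                // Case 1: skip the current character of S.
                dp[t][s] = dp[t][s - 1];
            } else {
                // Case 2: either skip it or match it.
                dp[t][s] = dp[t][s - 1] + dp[t - 1][s - 1];
            }
        }
    }

    return dp[tl][sl];
}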

As one can observe from the algorithm, the inner loop only refers to values in the current row and in the previous row of the dp matrix. Therefore, we can reduce the space by keeping only two arrays instead of the entire matrix, which cuts memory use from O(tl × sl) to O(sl). The algorithm can be optimized as follows. In the author's measurement, the running time dropped from 448 ms to 424 ms.

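public int numDistinct_sdp(String S, String T) {
    int sl = S.length();
    int tl = T.length();

    int[] preComb = new int[sl + 1]; // previous row of the dp matrix
    int[] comb = new int[sl + 1];    // current row; comb[0] stays 0 for t >= 1

    // Row 0: the empty T matches any prefix of S in exactly one way.
    for (int i = 0; i <= sl; i++) {
        preComb[i] = 1;
    }

    for (int t = 1; t <= tl; ++t) {
        for (int s = 1; s <= sl; ++s) {
            if (T.charAt(t - 1) != S.charAt(s - 1)) {
                comb[s] = comb[s - 1];
            } else {
                comb[s] = comb[s - 1] + preComb[s - 1];
            }
        }

        // The current row becomes the previous row for the next iteration.
        for (int i = 0; i <= sl; ++i) {
            preComb[i] = comb[i];
        }
    }

    return preComb[sl];
}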
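
The copy loop at the end of each outer iteration can be avoided by swapping the two array references instead. The following is a sketch of that variation under the same recurrence, not the author's code; the class name `RollingRows` is hypothetical. One detail to watch: after the first swap the "current" array is the old row 0, whose first cell is 1, so that cell has to be reset to 0 before each row is filled.

```java
import java.util.Arrays;

final class RollingRows {
    static int numDistinct(String s, String t) {
        int sl = s.length();
        int tl = t.length();

        int[] prev = new int[sl + 1]; // row t-1 of the dp matrix
        int[] curr = new int[sl + 1]; // row t of the dp matrix
        Arrays.fill(prev, 1);         // row 0: the empty T matches every prefix once

        for (int i = 1; i <= tl; i++) {
            // A non-empty prefix of T cannot be formed from an empty prefix of S.
            // Needed because curr may hold stale data from an earlier row.
            curr[0] = 0;
            for (int j = 1; j <= sl; j++) {
                curr[j] = curr[j - 1];
                if (t.charAt(i - 1) == s.charAt(j - 1)) {
                    curr[j] += prev[j - 1];
                }
            }
            // Swap references instead of copying sl + 1 elements.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }

        return prev[sl];
    }

    public static void main(String[] args) {
        System.out.println(numDistinct("rabbbit", "rabbit")); // 3
    }
}
```

Every cell curr[j] with j >= 1 is overwritten before it is read, so only curr[0] needs the explicit reset.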
 
