[LeetCode] 10. Regular Expression Matching

License: CC 4.0 BY-SA

Original link: http://www.cnblogs.com/jchen104/p/10199823.html

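This article looks at regular expression matching with support for '.' and '*', implemented in two ways: dynamic programming (DP) and recursion. It explains how the DP recurrence handles the tricky cases and provides code for both approaches.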

Given an input string (s) and a pattern (p), implement regular expression matching with support for '.' and '*'.

'.' Matches any single character.
'*' Matches zero or more of the preceding element.

The matching should cover the entire input string (not partial).

Note:

s could be empty and contains only lowercase letters a-z.
p could be empty and contains only lowercase letters a-z, and characters like . or *.

Example 1:

Input:
s = "aa"
p = "a"
Output: false
Explanation: "a" does not match the entire string "aa".

Example 2:

Input:
s = "aa"
p = "a*"
Output: true
Explanation: '*' means zero or more of the preceding element, 'a'. Therefore, by repeating 'a' once, it becomes "aa".

Example 3:

Input:
s = "ab"
p = ".*"
Output: true
Explanation: ".*" means "zero or more (*) of any character (.)".

Example 4:

Input:
s = "aab"
p = "c*a*b"
Output: true
Explanation: c can be repeated 0 times, a can be repeated 1 time. Therefore it matches "aab".

Example 5:

Input:
s = "mississippi"
p = "mis*is*p*."
Output: false

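The task is to decide whether the pattern p matches the whole string s. "." matches any single character, and "*" matches the preceding character any number of times; for example, "a*" matches the empty string or any number of a's. Because "*" may repeat zero times, a mismatch early on does not imply that the rest fails to match. Take s = "b", p = "a*b": the first comparison gives a != b, but we cannot conclude from this that the string does not match, so the whole string has to be processed.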
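As a quick sanity check of cases like s = "b", p = "a*b", one option is to compare against Java's built-in regex engine. This is only a sketch: it assumes the pattern is limited to lowercase letters, '.', and '*' as in this problem (where `java.util.regex` should agree with the rules above), and the class name `OracleCheck` and method `oracle` are made up for illustration.

```java
import java.util.regex.Pattern;

class OracleCheck {
    // Full-string match using java.util.regex as a reference oracle.
    // Assumption: p uses only a-z, '.', and '*', and '*' never comes first.
    static boolean oracle(String s, String p) {
        return Pattern.matches(p, s);
    }

    public static void main(String[] args) {
        // "a*" can repeat zero times, so the leading mismatch a != b is harmless.
        System.out.println(oracle("b", "a*b"));
        // The five examples from the problem statement.
        System.out.println(oracle("aa", "a"));
        System.out.println(oracle("aa", "a*"));
        System.out.println(oracle("ab", ".*"));
        System.out.println(oracle("aab", "c*a*b"));
        System.out.println(oracle("mississippi", "mis*is*p*."));
    }
}
```

An oracle like this is handy for testing a hand-written matcher on random inputs, but it is not a solution to the exercise itself.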

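My approach is DP. Let dp[i][j] record whether S up to index i (S[:i]) matches P up to index j (P[:j]).
(1) P[j] != '*': P[j] is either an ordinary character, in which case we need S[i] == P[j], or ".", which always matches.
dp[i][j] = dp[i-1][j-1] && ( S[i]==P[j] || P[j]=='.' )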

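(2) P[j] == '*': here we split into cases by how many times "*" repeats.

Zero repetitions: P[j] simply cancels the preceding character, so dp[i][j] = dp[i][j-2].
One or more repetitions: we need to check whether the character P[j-1] can match the character at S[i],

that is, S[i]==P[j-1] || P[j-1]=='.'. Next, which state does dp[i][j] come from? Suppose
S = "abbbb" and P = "ab*" (0-indexed, so P[2] is "*"). When we compute dp[1][2], the row dp[0][...] is already known, and dp[1][2] is derived from dp[0][2]:
going from dp[0][2] to dp[1][2] is "*" going from zero repetitions to one. Likewise dp[2][2] is derived from dp[1][2]. In general,
dp[i][j] is the state reached by letting "*" repeat one more time, and the state before that is dp[i-1][j]. So dp[i][j] = dp[i-1][j] && ( S[i]==P[j-1] || P[j-1]=='.' )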

To summarize:
dp[i][j] = dp[i-1][j-1] && ( S[i]==P[j] || P[j]=='.' ), when P[j] != '*'
dp[i][j] = dp[i][j-2], when P[j]=='*' and it repeats zero times
dp[i][j] = dp[i-1][j] && ( S[i]==P[j-1] || P[j-1]=='.' ), when P[j]=='*' and it repeats one or more times

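Because states such as i-1 and j-2 appear, we shift both S and P back by one position to avoid going out of bounds, as if a space " " had been prepended to each. Comparisons on S and P then start from index 1, and every character index in the conditions has to be reduced by one more.

We also merge the two "*" cases. Zero repetitions needs no extra condition, so dp[i][j] = dp[i][j - 2] || i > 0 && dp[i - 1][j] && (s.charAt(i - 1) == p.charAt(j - 2) || p.charAt(j - 2) == '.')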

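class Solution {
    public boolean isMatch(String s, String p) {
        int m = s.length(), n = p.length();
        // dp[i][j]: do the first i chars of s match the first j chars of p?
        boolean[][] dp = new boolean[m + 1][n + 1];
        dp[0][0] = true; // empty pattern matches empty string
        // i starts at 0 so that patterns like "a*" can match the empty string.
        for (int i = 0; i <= m; ++i) {
            for (int j = 1; j <= n; ++j) {
                if (p.charAt(j - 1) == '*') {
                    // Assumes a valid pattern: '*' is never the first character (so j >= 2 here).
                    // Either '*' repeats zero times, or it consumes s[i - 1].
                    dp[i][j] = dp[i][j - 2]
                            || (i > 0 && dp[i - 1][j] && (s.charAt(i - 1) == p.charAt(j - 2) || p.charAt(j - 2) == '.'));
                } else {
                    // Ordinary character or '.': consume one char from each side.
                    dp[i][j] = i > 0 && dp[i - 1][j - 1] && (s.charAt(i - 1) == p.charAt(j - 1) || p.charAt(j - 1) == '.');
                }
            }
        }
        return dp[m][n];
    }
}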
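To see how the table fills in, here is a small sketch that builds the same dp array with the recurrence above and prints it for s = "aab", p = "c*a*b". The class name `DpTrace` and the method `buildTable` are hypothetical names chosen for this illustration, and the pattern is assumed to be valid ('*' never first).

```java
class DpTrace {
    // Builds the (m+1) x (n+1) table using the shifted recurrence from the text.
    static boolean[][] buildTable(String s, String p) {
        int m = s.length(), n = p.length();
        boolean[][] dp = new boolean[m + 1][n + 1];
        dp[0][0] = true;
        for (int i = 0; i <= m; ++i) {
            for (int j = 1; j <= n; ++j) {
                if (p.charAt(j - 1) == '*') {
                    dp[i][j] = dp[i][j - 2]
                            || (i > 0 && dp[i - 1][j]
                                && (s.charAt(i - 1) == p.charAt(j - 2) || p.charAt(j - 2) == '.'));
                } else {
                    dp[i][j] = i > 0 && dp[i - 1][j - 1]
                            && (s.charAt(i - 1) == p.charAt(j - 1) || p.charAt(j - 1) == '.');
                }
            }
        }
        return dp;
    }

    public static void main(String[] args) {
        boolean[][] dp = buildTable("aab", "c*a*b");
        // One row per prefix of s, one column per prefix of p; T = match, F = no match.
        for (boolean[] row : dp) {
            StringBuilder line = new StringBuilder();
            for (boolean cell : row) line.append(cell ? "T " : "F ");
            System.out.println(line.toString().trim());
        }
    }
}
```

In row 0 the cells for the pattern prefixes "c*" and "c*a*" are true, since both can match the empty string; the bottom-right cell is the final answer.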

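There are other approaches as well, such as solving it recursively:

If p is empty, return true if s is also empty, and false otherwise.

If p has length 1, return true if s also has length 1 and the two characters are equal or p is '.'; otherwise return false.

If the second character of p is not '*', return false if s is empty; otherwise check that the first characters match and recurse on both strings starting from their second characters.

If the second character of p is '*': while s is non-empty and its first character matches the first character of p, recursively match s against p with its first two characters removed. If that succeeds, return true; otherwise drop the first character of s and try again.

Once the loop ends, return the result of recursively matching s against p with its first two characters removed.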

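class Solution {
    public boolean isMatch(String s, String p) {
        // An empty pattern matches only an empty string.
        if (p.isEmpty()) return s.isEmpty();
        // A single pattern character must match exactly one character of s.
        if (p.length() == 1) {
            return (s.length() == 1 && (s.charAt(0) == p.charAt(0) || p.charAt(0) == '.'));
        }
        // No '*' after the first pattern character: match one character and recurse.
        if (p.charAt(1) != '*') {
            if (s.isEmpty()) return false;
            return (s.charAt(0) == p.charAt(0) || p.charAt(0) == '.') && isMatch(s.substring(1), p.substring(1));
        }
        // "x*" case: try skipping "x*", then let it consume one more character of s each round.
        while (!s.isEmpty() && (s.charAt(0) == p.charAt(0) || p.charAt(0) == '.')) {
            if (isMatch(s, p.substring(2))) return true;
            s = s.substring(1);
        }
        // "x*" can consume no more of s; match the rest of s against the rest of p.
        return isMatch(s, p.substring(2));
    }
}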
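The recursion above creates new substrings on every call and may revisit the same pair of suffixes many times. A common variation, which is not part of the original post, recurses on indices and caches each (i, j) result. The sketch below assumes a valid pattern; the class name `MemoMatcher` and its members are hypothetical.

```java
class MemoMatcher {
    private final String s;
    private final String p;
    // memo[i][j] caches whether s[i..] matches p[j..]; null means "not computed yet".
    private final Boolean[][] memo;

    private MemoMatcher(String s, String p) {
        this.s = s;
        this.p = p;
        this.memo = new Boolean[s.length() + 1][p.length() + 1];
    }

    static boolean isMatch(String s, String p) {
        return new MemoMatcher(s, p).match(0, 0);
    }

    private boolean match(int i, int j) {
        // Pattern exhausted: succeed only if the string is exhausted too.
        if (j == p.length()) return i == s.length();
        if (memo[i][j] != null) return memo[i][j];

        boolean first = i < s.length()
                && (s.charAt(i) == p.charAt(j) || p.charAt(j) == '.');
        boolean result;
        if (j + 1 < p.length() && p.charAt(j + 1) == '*') {
            // Skip "x*" entirely, or consume one matching char and stay on "x*".
            result = match(i, j + 2) || (first && match(i + 1, j));
        } else {
            // Plain character or '.': consume one char from each side.
            result = first && match(i + 1, j + 1);
        }
        memo[i][j] = result;
        return result;
    }

    public static void main(String[] args) {
        System.out.println(isMatch("aab", "c*a*b"));
        System.out.println(isMatch("mississippi", "mis*is*p*."));
    }
}
```

With the cache, each (i, j) pair is computed at most once, so the work is bounded by the size of the table, the same as the bottom-up DP.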