Leetcode: Wildcard Matching

Article link: https://blog.youkuaiyun.com/phycept/article/details/45768087

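This post explores several ways to solve the wildcard matching problem, including recursion, a greedy algorithm, and dynamic programming, and uses examples to show the strengths and limits of each.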

Problem

Implement wildcard pattern matching with support for '?' and '*'.
'?' matches any single character.
'*' matches any sequence of characters (including the empty sequence).

The matching should cover the entire input string (not partial).

The function prototype should be:
bool isMatch(const char *s, const char *p)

Some examples:
isMatch("aa", "a") → false
isMatch("aa", "aa") → true
isMatch("aaa", "aa") → false
isMatch("aa", "*") → true
isMatch("aa", "a*") → true
isMatch("ab", "?*") → true
isMatch("aab", "c*a*b") → false

Bad Approach

Having just learned about regular expressions, I first thought this could be solved with a depth-first search, but then I realized that it is not a regular expression problem.

So I changed my mind and wrote a recursive approach:

__author__ = 'phycept'

class Solution:
    def isMatch(self, s, p):
        return search(s, p, 0, 0)

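def search(s, p, i, j):
    # Pattern exhausted: it is a match only if the string is exhausted too.
    if j == len(p):
        return i == len(s)
    if p[j] == '*':
        # '*' either matches the empty sequence (advance j)
        # or absorbs s[i] and stays available (advance i).
        return search(s, p, i, j + 1) or (i < len(s) and search(s, p, i + 1, j))
    if i < len(s) and (s[i] == p[j] or p[j] == '?'):
        return search(s, p, i + 1, j + 1)
    return False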

if __name__ == '__main__':
    s = Solution()
    print(s.isMatch("aaa","aa?"))

This one worked but got a TLE at "aaabbbaabaaaaababaabaaabbabbbbbbbbaabababbabbbaaaaba", "a*******b", because the more '*'s there are, the wider the recursion branches. A second version did much better but still got a TLE at "abbaabbbbababaababababbabbbaaaabbbbaaabbbabaabbbbbabbbbabbabbaaabaaaabbbbbbaaabbabbbbababbbaaabbabbabb", "***b**a*a*b***b*a*b*bbb**baa*bba**b**bb***b*a*aab*a**":

__author__ = 'phycept'

class Solution:
    def isMatch(self, s, p):
        #return search(s, p, 0, 0)
        return search(s, 0, p, 0)

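def find(s, i, pj):
    # Collect every position in s[i:] that the pattern character pj can match.
    # '?' matches any character, so every position qualifies in that case.
    lst = []
    for ii in range(i, len(s)):
        if pj == '?' or s[ii] == pj:
            lst.append(ii)
    return lst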

def search(s, i, p, j):
    if i == len(s):
        if j == len(p):
            return True
        while p[j] == '*':
            j += 1
            if j == len(p):
                return True
        return False
    if j == len(p):
        return False
    if p[j] == s[i] or p[j] == '?':
        return search(s, i+1, p, j + 1)
    if p[j] == '*':
        while p[j] == '*':
            j += 1
            if j == len(p):
                return True
        lst = find(s, i,p[j] )
        while lst:
            if(search(s,lst.pop() + 1,p,j+1)):
                return True
    return False
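
To see why the number of '*'s matters so much for a plain recursion like the first one, the sketch below counts how many times a naive search is entered as stars are added. `count_calls` is a hypothetical helper written only for this illustration, not part of the submitted solutions, and the test string is made up so that the pattern can never match.

```python
def count_calls(s, p):
    """Count how many times a naive recursive matcher is entered for (s, p)."""
    calls = 0

    def search(i, j):
        nonlocal calls
        calls += 1
        if j == len(p):
            return i == len(s)
        if p[j] == '*':
            # try "match nothing" first, then "absorb one character"
            return search(i, j + 1) or (i < len(s) and search(i + 1, j))
        if i < len(s) and (p[j] == '?' or p[j] == s[i]):
            return search(i + 1, j + 1)
        return False

    search(0, 0)
    return calls


# The text ends in 'a', so a pattern ending in 'b' can never match
# and the recursion has to explore every branch.
text = "a" * 12
for stars in range(1, 6):
    pattern = "a" + "*" * stars + "b"
    print(stars, count_calls(text, pattern))
```

The count grows with every extra '*', since each star tries every split of the remaining text before the final 'b' fails.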

Greedy

(I assume it's a greedy approach, since I'm just about to study greedy algorithms. :))
After searching the web, I found an elegant solution without recursion. Here is my code, based on that author's idea, which passes the OJ:

__author__ = 'phycept'

class Solution:
    def isMatch(self, s, p):
        return yucoding(s,p)
#reference: http://yucoding.blogspot.com/2013/02/leetcode-question-123-wildcard-matching.html
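def yucoding(s, p):
    sposi = 0    # current position in s
    pposi = 0    # current position in p
    slst = 0     # position in s where the last '*' stops absorbing characters
    star = -1    # index in p of the last '*' seen, -1 if none
    while sposi < len(s):
        if pposi < len(p) and (p[pposi] == '?' or p[pposi] == s[sposi]):
            # single-character match: advance both
            pposi += 1
            sposi += 1
        elif pposi < len(p) and p[pposi] == '*':
            # remember this '*' and first try matching it against nothing
            star = pposi
            pposi += 1
            slst = sposi
        elif star != -1:
            # mismatch: let the last '*' absorb one more character and retry
            pposi = star + 1
            slst += 1
            sposi = slst
        else:
            return False
    # only trailing '*'s may remain in the pattern
    while pposi < len(p) and p[pposi] == '*':
        pposi += 1
    return pposi == len(p)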

Basically, it works because it only remembers the position of the last asterisk, which cuts off many branches when the pattern contains several asterisks. See Zephyr's comment for more details:

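This is not DP; the solution has a binary-tree structure.
if *p == '*'
isMatch(s, p) = isMatch(s, p + 1) || isMatch(s + 1, p + 1) || … || isMatch(s + n, p + 1)
= isMatch(s, p + 1) || isMatch(s + 1, p)
else
only one branch = isMatch(s + 1, p + 1)

The key to this algorithm is that when the left subtree runs into another '*', the right subtree split off at the previous '*' no longer needs to be searched.
For example: s = aab… p = *a*b…
aab…, *a*b…
aab…, a*b… ab…, *a*b…
ab…, *b…

When '*' is met for the second time, s = ab… p = *b…
If s and p do not match here, then the right subtree of the previous '*', (ab…, *a*b…), cannot match either (this can be proved by contradiction).
If they do match, searching the left subtree is enough to find the result.

Suppose ab… matches *a*b…; then ab… must also match *b…, which contradicts the assumption.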
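
The pruning described in the comment can be made visible with a small instrumented variant of the greedy loop. This is only a sketch: `greedy_trace` and its `backtracks` list are hypothetical names introduced here, and the function returns the index of the '*' used at each backtrack alongside the match result.

```python
def greedy_trace(s, p):
    """Greedy wildcard match that also records which '*' each backtrack used."""
    i = j = 0
    star = -1        # index in p of the most recent '*', or -1
    resume = 0       # index in s where that '*' currently stops absorbing
    backtracks = []  # pattern index of the '*' used at every backtrack
    while i < len(s):
        if j < len(p) and (p[j] == '?' or p[j] == s[i]):
            i += 1
            j += 1
        elif j < len(p) and p[j] == '*':
            star, resume = j, i   # any earlier '*' is forgotten here
            j += 1
        elif star != -1:
            backtracks.append(star)
            resume += 1           # the last '*' absorbs one more character
            i, j = resume, star + 1
        else:
            return False, backtracks
    while j < len(p) and p[j] == '*':
        j += 1
    return j == len(p), backtracks


print(greedy_trace("aab", "*a*b"))  # (True, [2])
```

For s = "aab" and p = "*a*b" the only backtrack goes to the '*' at index 2; the first '*' at index 0 is never revisited, which is the right subtree that the comment says can be skipped.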

Dynamic Programming Approach

Another way to solve this is with dynamic programming.

We want to know whether the whole of p matches the whole of s, so we work out whether p[:i] matches s[:j] for every pair of prefixes until we reach the end. Let dp[i][j] be True when p[:i] matches s[:j]. There are two cases:

p[i] != '*': then dp[i+1][j+1] is True if and only if dp[i][j] and (p[i] == s[j] or p[i] == '?').
p[i] == '*': then dp[i+1][j+1] equals dp[i][j+1] or dp[i+1][j]: the '*' either matches nothing or absorbs s[j]. (You need to think about it.)
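
As a minimal sketch of this recurrence, here is a version that keeps the full table, with dp[i][j] meaning that p[:i] matches s[:j]. For a '*' it combines the cell above (the star matches nothing) with the cell to the left (the star absorbs s[j]). The name `is_match_table` is an assumption made for this example; it is not the submitted solution.

```python
def is_match_table(s, p):
    """Full-table DP: dp[i][j] is True when p[:i] matches s[:j]."""
    rows, cols = len(p) + 1, len(s) + 1
    dp = [[False] * cols for _ in range(rows)]
    dp[0][0] = True  # the empty pattern matches the empty string
    for i in range(len(p)):
        if p[i] == '*':
            # a run of leading '*'s can still match the empty string
            dp[i + 1][0] = dp[i][0]
            for j in range(len(s)):
                # '*' matches nothing (cell above) or absorbs s[j] (cell to the left)
                dp[i + 1][j + 1] = dp[i][j + 1] or dp[i + 1][j]
        else:
            for j in range(len(s)):
                dp[i + 1][j + 1] = dp[i][j] and (p[i] == '?' or p[i] == s[j])
    return dp[len(p)][len(s)]


print(is_match_table("aa", "a*"))      # True
print(is_match_table("aab", "c*a*b"))  # False
```

The table costs O(len(s) * len(p)) time and memory, and it makes the start state explicit: only dp[0][0] is True in the first row.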

Because each row of dp depends only on the previous row, we only need to store two rows (actually I think a single row is enough, but I haven't implemented that yet).
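
For what it's worth, a single row does seem to be enough if it is updated in place with the right loop direction. The sketch below is one possible way to do it, not something from the referenced solution, and `is_match_one_row` is a name made up for this example.

```python
def is_match_one_row(s, p):
    """Single-row DP: row[j] is True when the pattern so far matches s[:j]."""
    n = len(s)
    row = [True] + [False] * n
    for ch in p:
        if ch == '*':
            # Left to right: row[j] already holds the new value (star absorbs
            # s[j]), row[j + 1] still holds the old one (star matches nothing).
            for j in range(n):
                row[j + 1] = row[j + 1] or row[j]
        else:
            # Right to left, so row[j] is still the previous row's value.
            for j in range(n - 1, -1, -1):
                row[j + 1] = row[j] and (ch == '?' or ch == s[j])
            # A pattern containing a non-'*' character cannot match "".
            row[0] = False
    return row[n]


print(is_match_one_row("ab", "?*"))  # True
```

The loop direction is the whole trick: the '*' case needs the freshly updated left neighbour, while the other case needs the old value of the left neighbour.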

Another point that needs careful thought is the start state. The code is shown below:

__author__ = 'phycept'

class Solution:
    def isMatch(self, s, p):
        #return search(s, p, 0, 0)
        return dp(s,p)

#reference: https://leetcode.com/discuss/22743/python-dp-solution
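def dp(s, p):
    l = len(s)
    # every non-'*' pattern character must consume one character of s
    if len(p) - p.count('*') > l:
        return False
    # row[j]: does the pattern prefix processed so far match s[:j]?
    row = [True] + [False] * l
    for char in p:
        # the empty string is matched only while the pattern is all '*'
        nrow = [row[0] and char == '*']
        if char == '*':
            for i in range(l):
                # '*' absorbs s[i] (nrow[-1]) or matches nothing (row[i + 1])
                nrow += [nrow[-1] or row[i + 1]]
        elif char == '?':
            nrow += row[:l]
        else:
            nrow += [s[j] == char and row[j] for j in range(l)]
        row = nrow
    return row[-1]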