Boyer-Moore String Matching Algorithm and a C# Implementation

CoreFMEA Software

Published 2025-03-25 21:35:04

License: CC 4.0 BY-SA

Article link: https://blog.youkuaiyun.com/CoreFMEA/article/details/146513393

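Introduction

The Boyer-Moore algorithm is an efficient string search algorithm published by Robert S. Boyer and J Strother Moore in 1977. It compares the pattern against the text from right to left and uses two heuristics, the bad character rule and the good suffix rule, to skip as many characters as possible, which reduces the number of comparisons and speeds up the search.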

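Bad Character Rule

When a comparison between the text and the pattern hits a text character that does not match (the bad character), the pattern can be shifted to the right so that the last occurrence of that character in the pattern lines up with it. If the character does not occur in the pattern at all, the pattern can be moved entirely past it.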
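As a small illustration of this rule, the sketch below builds a last-occurrence table and computes the suggested shift for a mismatch at pattern index `j`. The names `BadCharDemo`, `BuildLastOccurrence`, and `Shift` are invented for this example, and a `Dictionary` is assumed as the table type rather than a fixed-size array.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class, for illustration only.
public static class BadCharDemo
{
    // Maps each character of the pattern to the index of its last occurrence.
    public static Dictionary<char, int> BuildLastOccurrence(string pat)
    {
        var last = new Dictionary<char, int>();
        for (int i = 0; i < pat.Length; i++)
            last[pat[i]] = i; // later occurrences overwrite earlier ones
        return last;
    }

    // Shift suggested by the bad character rule when pat[j] mismatches text character c.
    // A character that is absent from the pattern counts as index -1.
    // The result is clamped to at least 1 so the pattern never moves backwards.
    public static int Shift(Dictionary<char, int> last, int j, char c)
    {
        int k = last.TryGetValue(c, out int idx) ? idx : -1;
        return Math.Max(1, j - k);
    }
}
```

For the pattern "ABABCABAB", a mismatch at index 8 against the text character 'C' (last seen at index 4) suggests a shift of 4, while a mismatch at index 4 against 'D', which does not occur in the pattern, suggests a shift of 5.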

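Good Suffix Rule

If a suffix of the pattern has already matched the text (the good suffix) before a mismatch occurs, the pattern can be shifted to the right so that another occurrence of that suffix within the pattern lines up with the matched text. If no such occurrence exists, the pattern is shifted so that its longest prefix that is also a suffix of the good suffix lines up with the text.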
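One common way to precompute these shifts is the border-based preprocessing used for the "strong" good suffix rule, which additionally requires the character before the reoccurring suffix to differ from the one that just mismatched. The sketch below assumes that variant; the names `GoodSuffixDemo` and `BuildShift` are invented for this example. The convention assumed here is that `shift[j + 1]` is used after a mismatch at pattern index `j`, and `shift[0]` after a full match.

```csharp
using System;

// Hypothetical helper class, for illustration only.
public static class GoodSuffixDemo
{
    // Returns a table of m + 1 shifts for the strong good suffix rule.
    public static int[] BuildShift(string pat)
    {
        int m = pat.Length;
        int[] shift = new int[m + 1];
        int[] bpos = new int[m + 1]; // bpos[i]: start of the widest border of pat[i..]

        // Case 1: the good suffix occurs elsewhere in the pattern,
        // preceded by a different character.
        int i = m, j = m + 1;
        bpos[i] = j;
        while (i > 0)
        {
            while (j <= m && pat[i - 1] != pat[j - 1])
            {
                if (shift[j] == 0)
                    shift[j] = j - i;
                j = bpos[j];
            }
            i--;
            j--;
            bpos[i] = j;
        }

        // Case 2: only a prefix of the pattern matches a suffix of the good suffix.
        j = bpos[0];
        for (i = 0; i <= m; i++)
        {
            if (shift[i] == 0)
                shift[i] = j;
            if (i == j)
                j = bpos[j];
        }
        return shift;
    }
}
```

For the pattern "ABABCABAB" this produces the table 5, 5, 5, 5, 5, 5, 7, 2, 9, 1. For instance, after matching the suffix "AB" and mismatching at index 6, the entry `shift[7]` is 2, which lines the "AB" at index 5 (preceded by 'C') up with the matched text.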

In practice, the Boyer-Moore algorithm evaluates both rules at every mismatch and shifts the pattern by the larger of the two distances.
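A minimal sketch of a search loop that combines the two rules this way might look like the following. It is one possible arrangement rather than a definitive implementation: the names `BoyerMooreFull`, `FindAll`, and `BuildGoodSuffixShift` are invented for this example, the good suffix table uses the border-based "strong" variant, and the method returns the match positions as a list instead of printing them.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical class, for illustration only.
public static class BoyerMooreFull
{
    // Returns the start index of every occurrence of pat in txt.
    public static List<int> FindAll(string txt, string pat)
    {
        var hits = new List<int>();
        int n = txt.Length, m = pat.Length;
        if (m == 0 || m > n) return hits; // assumption: an empty pattern yields no matches

        // Bad character table: last index of each pattern character.
        var last = new Dictionary<char, int>();
        for (int i = 0; i < m; i++) last[pat[i]] = i;

        int[] good = BuildGoodSuffixShift(pat);

        int s = 0; // current alignment of the pattern against the text
        while (s <= n - m)
        {
            int j = m - 1;
            while (j >= 0 && pat[j] == txt[s + j]) j--; // compare right to left

            if (j < 0)
            {
                hits.Add(s);
                s += good[0]; // shift after a full match
            }
            else
            {
                int k = last.TryGetValue(txt[s + j], out int idx) ? idx : -1;
                // Take the larger of the two suggested shifts; good[j + 1] is always >= 1.
                s += Math.Max(j - k, good[j + 1]);
            }
        }
        return hits;
    }

    // Border-based preprocessing for the strong good suffix rule.
    private static int[] BuildGoodSuffixShift(string pat)
    {
        int m = pat.Length;
        int[] shift = new int[m + 1];
        int[] bpos = new int[m + 1];
        int i = m, j = m + 1;
        bpos[i] = j;
        while (i > 0)
        {
            while (j <= m && pat[i - 1] != pat[j - 1])
            {
                if (shift[j] == 0) shift[j] = j - i;
                j = bpos[j];
            }
            i--; j--;
            bpos[i] = j;
        }
        j = bpos[0];
        for (i = 0; i <= m; i++)
        {
            if (shift[i] == 0) shift[i] = j;
            if (i == j) j = bpos[j];
        }
        return shift;
    }
}
```

Calling `BoyerMooreFull.FindAll("ABABDABACDABABCABAB", "ABABCABAB")` returns a list containing the single index 10.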

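C# Implementation

Below is a C# implementation of Boyer-Moore string matching. Note that this version applies only the bad character rule; the good suffix rule is not implemented here.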

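using System;

public class BoyerMoore
{
    // A C# char is a 16-bit UTF-16 code unit, so the table must cover all 65536
    // values. A 256-entry table would throw IndexOutOfRangeException as soon as
    // the pattern or the text contains a character above U+00FF.
    private const int NO_OF_CHARS = char.MaxValue + 1;

    // Fills badchar with the index of the last occurrence of each character
    // in str, or -1 if the character does not occur.
    private static void BadCharHeuristic(string str, int size, int[] badchar)
    {
        for (int i = 0; i < NO_OF_CHARS; i++)
            badchar[i] = -1;

        for (int i = 0; i < size; i++)
            badchar[str[i]] = i;
    }

    // Prints the start index of every occurrence of pat in txt.
    public static void Search(string txt, string pat)
    {
        if (txt == null) throw new ArgumentNullException(nameof(txt));
        if (pat == null) throw new ArgumentNullException(nameof(pat));

        int m = pat.Length;
        int n = txt.Length;

        int[] badchar = new int[NO_OF_CHARS];

        BadCharHeuristic(pat, m, badchar);

        int s = 0; // current shift of the pattern relative to the text
        while (s <= (n - m))
        {
            int j = m - 1;

            // Compare from right to left while the characters match.
            while (j >= 0 && pat[j] == txt[s + j])
                j--;

            if (j < 0)
            {
                Console.WriteLine("Pattern found at index: " + s);

                // Align the text character just past the window with its last
                // occurrence in the pattern; at the end of the text, move by 1
                // so that the loop terminates.
                s += (s + m < n) ? m - badchar[txt[s + m]] : 1;
            }
            else
            {
                // Align the bad character with its last occurrence in the pattern.
                // Math.Max guarantees a positive shift when that occurrence
                // lies to the right of the mismatch position.
                s += Math.Max(1, j - badchar[txt[s + j]]);
            }
        }
    }
}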

You can call the algorithm as follows:

class Program
{
    static void Main()
    {
        string txt = "ABABDABACDABABCABAB";
        string pat = "ABABCABAB";
        // Prints: Pattern found at index: 10
        BoyerMoore.Search(txt, pat);
    }
}
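Because `Search` only prints its results, one way to sanity-check any matcher is to compare its output against a reference built on the framework's own `string.IndexOf`. The sketch below is such a reference; the names `ReferenceFinder` and `FindAll` are invented for this example, and ordinal (code unit by code unit) comparison is assumed so that the results line up with the character-by-character comparison used above.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class, for illustration only.
public static class ReferenceFinder
{
    // Returns the start index of every (possibly overlapping) occurrence of pat in txt.
    public static List<int> FindAll(string txt, string pat)
    {
        var hits = new List<int>();
        if (pat.Length == 0) return hits; // assumption: an empty pattern yields no matches

        int i = txt.IndexOf(pat, 0, StringComparison.Ordinal);
        while (i >= 0)
        {
            hits.Add(i);
            // Resume one position later so that overlapping matches are reported too.
            i = txt.IndexOf(pat, i + 1, StringComparison.Ordinal);
        }
        return hits;
    }
}
```

For the text and pattern used in `Main`, `ReferenceFinder.FindAll(txt, pat)` returns a list containing only 10, which agrees with the line printed by `BoyerMoore.Search`.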