Hutool - DFA: Multi-Keyword Search Based on the DFA Model

五行星辰

Published 2025-02-21 21:00:17

Article link: https://blog.youkuaiyun.com/wang543203/article/details/145785615

I. Introduction

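Text processing often requires checking whether any of several keywords occur in a piece of text, for example in sensitive-word filtering or keyword matching. The Hutool - DFA module is based on the deterministic finite automaton (DFA) model and provides efficient multi-keyword search. A DFA is a state machine: by building its state-transition table in advance, it can detect multiple keywords in a single pass over the text. The time complexity is $O(n)$, where $n$ is the length of the text, and it does not grow with the number of keywords, which makes the approach efficient for large texts and large keyword sets.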
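To make the idea of a prebuilt transition structure concrete, here is a minimal sketch of keyword-tree matching in plain Java. It is only an illustration under stated assumptions: the class `KeywordTreeSketch` and its methods `addWord` and `findAll` are hypothetical names chosen for this example, and the code is not Hutool's actual implementation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of keyword-tree matching; not Hutool's WordTree. */
class KeywordTreeSketch {

    /** One state: outgoing transitions per character, plus an end-of-keyword flag. */
    private static final class Node {
        final Map<Character, Node> next = new HashMap<>();
        boolean terminal;
    }

    private final Node root = new Node();

    /** Adds a keyword by walking (or creating) one state per character. */
    void addWord(String word) {
        if (word == null || word.isEmpty()) {
            return;
        }
        Node node = root;
        for (char c : word.toCharArray()) {
            node = node.next.computeIfAbsent(c, k -> new Node());
        }
        node.terminal = true;
    }

    /** Returns every keyword occurrence, trying each text position as a start. */
    List<String> findAll(String text) {
        List<String> found = new ArrayList<>();
        for (int start = 0; start < text.length(); start++) {
            Node node = root;
            for (int i = start; i < text.length(); i++) {
                node = node.next.get(text.charAt(i));
                if (node == null) {
                    break; // no transition for this character: stop this attempt
                }
                if (node.terminal) {
                    found.add(text.substring(start, i + 1));
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        KeywordTreeSketch tree = new KeywordTreeSketch();
        tree.addWord("he");
        tree.addWord("hers");
        tree.addWord("she");
        System.out.println(tree.findAll("ushers")); // [she, he, hers]
    }
}
```

Because this sketch restarts from every text position, its worst-case cost is proportional to the text length times the longest keyword length. A full DFA construction such as Aho-Corasick adds failure transitions so that the text is scanned exactly once, which is what gives the $O(n)$ bound.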

II. Adding the Dependency

For a Maven project, add the following dependency to pom.xml:

<dependency>
    <groupId>cn.hutool</groupId>
    <artifactId>hutool-all</artifactId>
    <version>5.8.16</version>
</dependency>

For a Gradle project, add the following to build.gradle:

implementation 'cn.hutool:hutool-all:5.8.16'
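After adding the dependency, it can be useful to confirm that the DFA classes are actually on the classpath. The following is a small sketch of such a check; the class `DependencyCheckSketch` and the helper `isOnClasspath` are hypothetical names for illustration, and only the class name `cn.hutool.dfa.WordTree` comes from the imports used later in this article.

```java
/** Hypothetical helper: reports whether a class can be loaded from the classpath. */
class DependencyCheckSketch {

    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // The class is missing or could not be linked
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("Hutool DFA available: "
                + isOnClasspath("cn.hutool.dfa.WordTree"));
    }
}
```

If this reports `false`, the dependency was most likely not resolved, and the build configuration is the first place to look.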

III. Basic Usage Steps

1. Create the DFA matcher

import cn.hutool.dfa.FoundWord;
import cn.hutool.dfa.WordTree;
import java.util.ArrayList;
import java.util.List;

public class DFAExample {
    public static void main(String[] args) {
        // Create a WordTree, which holds the keyword tree used for DFA matching
        WordTree wordTree = new WordTree();
        // Add keywords