Introduction
-
Data mining [1], [2] focuses on extraction of information from a large set of data and transforms it into an easily interpretable structure for further use.
extraction: something extracted; the act of drawing out.
interpretable: explainable; able to be judged.
-
It is an interdisciplinary field focused on scientific methods, processes, and systems to extract knowledge or insights from data in various forms, either structured or unstructured. Mining interesting patterns from different types of data is quite important in many real-life applications [1], [3], [4], [5], [6].
interdisciplinary: spanning several disciplines; cross-disciplinary.
-
In recent decades, the task of interesting pattern mining [e.g., frequent pattern mining (FPM) [7], [8], association rule mining (ARM) [9], [10], frequent episode mining (FEM) [11], [12], [13], [14], and sequential pattern mining (SPM) [5], [15], [16], [17]] has been extensively studied.
episode: storyline; event.
-
These are important and fundamental data mining techniques [1] that satisfy the requirements of real-world applications in numerous domains. Most of them aim at extracting the desired patterns using frequency or co-occurrence [7], [8], [9], [10], as well as other properties and interestingness measures [18], [19], [20], [21].
co-occurrence: joint occurrence (items appearing together).
property: characteristic; attribute.
-
Despite the wide use of pattern mining techniques, most of these algorithms do not allow for the discovery of utility-oriented patterns, i.e., those that contribute the most to a predefined utility threshold, an objective function, or a performance metric.
utility-oriented: geared toward utility.
oriented: directed toward; guided by.
performance metric: a measure of performance.
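To make the contrast between frequency and utility concrete, here is a minimal sketch in Python. The toy transactions, the unit profits, and the helper names `support` and `utility` are all hypothetical and not taken from the paper. Utility is assumed here to be purchased quantity times unit profit, summed over the transactions that contain the whole itemset, which is one common convention and not necessarily the definition the paper adopts.

```python
# Hypothetical toy database: each transaction maps item -> purchased quantity.
transactions = [
    {"bread": 2, "milk": 1},
    {"bread": 1, "milk": 2},
    {"bread": 3, "milk": 1, "caviar": 1},
    {"caviar": 2},
]

# Hypothetical unit profit of each item (assumed values for illustration).
unit_profit = {"bread": 1, "milk": 2, "caviar": 50}


def support(itemset, db):
    """Count the transactions that contain every item of the itemset."""
    return sum(1 for t in db if all(item in t for item in itemset))


def utility(itemset, db, profit):
    """Sum quantity * unit profit over transactions containing the itemset."""
    return sum(
        sum(t[item] * profit[item] for item in itemset)
        for t in db
        if all(item in t for item in itemset)
    )


for itemset in [("bread",), ("milk",), ("caviar",), ("bread", "milk")]:
    print(
        itemset,
        "support =", support(itemset, transactions),
        "utility =", utility(itemset, transactions, unit_profit),
    )
```

In this toy data, `("bread",)` appears in more transactions than `("caviar",)`, yet `("caviar",)` yields far more profit, so a purely frequency-based miner would rank the two patterns in the opposite order from a utility-based one.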
-
In general, some implicit factors, such as the utility, interestingness, or risk of objects/patterns, are commonly seen in real-world situations. The knowledge that is actually important to the user may not be found by traditional data mining algorithms. Therefore, a n