1. Introduction
During the last decade, workflow management technology [2,4,21,35,41] has become readily available. Workflow management systems such as Staffware, IBM MQSeries, COSA, etc. offer generic modeling and enactment capabilities for structured business processes. By making process definitions, i.e., models describing the life-cycle of a typical case (workflow instance) in isolation, one can configure these systems to support business processes. These process definitions need to be executable and are typically graphical. Besides pure workflow management systems, many other software systems have adopted workflow technology. Consider, for example, Enterprise Resource Planning (ERP) systems such as SAP, PeopleSoft, Baan and Oracle, Customer Relationship Management (CRM) software, Supply Chain Management (SCM) systems, Business to Business (B2B) applications, etc., which embed workflow technology. Despite its promise, many problems are encountered when applying workflow technology. One of the problems is that these systems require a workflow design, i.e., a designer has to construct a detailed model accurately describing the routing of work. Modeling a workflow is far from trivial: it requires deep knowledge of the business process at hand (i.e., lengthy discussions with the workers and management are needed) and of the workflow language being used.
To compare workflow mining with the traditional approach towards workflow design and enactment, consider the workflow life cycle shown in Fig. 1. The workflow life cycle consists of four phases: (A) workflow design, (B) workflow configuration, (C) workflow enactment, and (D) workflow diagnosis. In the traditional approach the design phase is used for constructing a workflow model. This is typically done by a business consultant and is driven by ideas of management on improving the business processes at hand. Once the design is finished, the workflow system (or any other system that is “process aware”) is configured as specified in the design phase. In the configuration phase one has to deal with the limitations and particularities of the workflow management system being used (cf. [5,65]). In the enactment phase, cases (i.e., workflow instances) are handled by the workflow system as specified in the design phase and realized in the configuration phase. Based on a running workflow, it is possible to collect diagnostic information, which is analyzed in the diagnosis phase. The diagnosis phase can again provide input for the design phase, thus completing the workflow life cycle. In the traditional approach the focus is on the design and configuration phases. Less attention is paid to the enactment phase, and few organizations systematically collect runtime data to be analyzed as input for redesign (i.e., the diagnosis phase is typically missing).
Fig. 1. The workflow life-cycle is used to illustrate workflow mining and Delta analysis in relation to traditional workflow design.
The goal of workflow mining is to reverse the process and collect data at runtime to support workflow design and analysis. Note that in most cases, prior to the deployment of a workflow system, the workflow was already there. Also note that in most information systems transactional data is registered (consider for example the transaction logs of ERP systems like SAP). The information collected at run-time can be used to derive a model explaining the events recorded. Such a model can be used in both the diagnosis phase and the (re)design phase.
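As a minimal sketch of what deriving a model from recorded events can look like, the following Python fragment counts how often one task is directly followed by another within the same case. It is an illustration only, not the technique of any particular approach discussed in this paper. The log layout (a time-ordered list of case identifier and task name pairs), the task names, and the function name `directly_follows` are all assumptions made for the example.

```python
from collections import defaultdict


def directly_follows(log):
    """Count how often task a is directly followed by task b within a case.

    `log` is assumed to be a time-ordered iterable of (case_id, task) pairs.
    """
    # Group the events per case, preserving their order in the log.
    traces = defaultdict(list)
    for case_id, task in log:
        traces[case_id].append(task)

    # Count each pair of consecutive tasks in every trace.
    counts = defaultdict(int)
    for tasks in traces.values():
        for a, b in zip(tasks, tasks[1:]):
            counts[(a, b)] += 1
    return dict(counts)


# Hypothetical log with two interleaved cases.
log = [
    ("case1", "A"), ("case2", "A"), ("case1", "B"), ("case1", "C"),
    ("case2", "C"), ("case2", "B"), ("case1", "D"), ("case2", "D"),
]
relation = directly_follows(log)
```

In this hypothetical log, B follows C in one case and C follows B in the other, which hints that the two tasks may be executed in either order. Observations of this kind are raw material from which a process model can be built.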
Modeling an existing process is influenced by perceptions, e.g., models are often normative in the sense that they state what “should” be done rather than describing the actual process. As a result models tend to be rather subjective. A more objective way of modeling is to use data related to the actual events that took place. Note that workflow mining is not biased by perceptions or normative behavior. However, if people bypass the system doing things differently, the log can still deviate from the actual work being done. Nevertheless, it is useful to confront man-made models with models discovered through workflow mining.
Closely monitoring the events taking place at runtime also enables Delta analysis, i.e., detecting discrepancies between the design constructed in the design phase and the actual execution registered in the enactment phase. Workflow mining results in an “a posteriori” process model which can be compared with the “a priori” model. Workflow technology is moving in the direction of more operational flexibility to deal with workflow evolution and workflow exception handling [2,7,10,13,20,30,39,40,64]. As a result, workers can deviate from the pre-specified workflow design. Clearly one wants to monitor these deviations. For example, a deviation may become common practice rather than being a rare exception. In such a case, the added value of a workflow system becomes questionable and an adaptation is required. Clearly, workflow mining techniques can be used to create a feedback loop to adapt the workflow model to changing circumstances and detect imperfections of the design.
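The idea behind Delta analysis can be sketched in a few lines of Python, under the simplifying assumption that both the “a priori” and the “a posteriori” model are reduced to sets of (task, next task) pairs. Real process models are richer than this. The function name `delta_analysis` and the task names are hypothetical and serve only to illustrate the comparison.

```python
def delta_analysis(designed, observed):
    """Compare designed task transitions with those observed at runtime.

    Both arguments are assumed to be collections of (task, next_task) pairs.
    """
    designed, observed = set(designed), set(observed)
    return {
        # Transitions that occurred in practice but are absent from the design.
        "unplanned": observed - designed,
        # Transitions foreseen by the design that never occurred.
        "unused": designed - observed,
    }


# Hypothetical "a priori" design and "a posteriori" observations.
designed = {("register", "check"), ("check", "approve"), ("check", "reject")}
observed = {("register", "check"), ("check", "approve"), ("register", "approve")}
delta = delta_analysis(designed, observed)
```

Here the unplanned transition from "register" to "approve" would be a deviation worth monitoring: if it occurs frequently, it may have become common practice rather than a rare exception.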
The topic of workflow mining is related to management trends such as Business Process Reengineering (BPR), Business Intelligence (BI), Business Process Analysis (BPA), Continuous Process Improvement (CPI), and Knowledge Management (KM). Workflow mining can be seen as part of the BI, BPA, and KM trends. Moreover, workflow mining can be used as input for BPR and CPI activities. Note that workflow mining seems to be more appropriate for BPR than for CPI. Recall that one of the basic elements of BPR is that it is radical and should not be restricted by the existing situation [23]. Also note that workflow mining is not a tool to (re)design processes. The goal is to understand what is really going on as indicated in Fig. 1. Despite the fact that workflow mining is not a tool for designing processes, it is evident that a good understanding of the existing processes is vital for any redesign effort.
This paper is a joint effort of a number of researchers using different approaches to workflow mining and is a spin-off of the “Workflow Mining Workshop”.[1] The goal of this paper is to introduce the concept of workflow mining, to identify scientific and practical problems, to present a common format to store workflow logs, to provide an overview of existing approaches, and to present a number of mining techniques in more detail.
The remainder of this paper is organized as follows. First, we summarize related work. In Section 3 we define workflow mining and present some of the challenging problems. In Section 4 we propose a common XML-based format for storing and exchanging workflow logs. This format is used by the mining tools developed by the authors and interfaces with some of the leading workflow management systems (Staffware, MQSeries Workflow, and InConcert). Sections 5–9 introduce five approaches to workflow mining focusing on different aspects. These sections give an overview of some of the ongoing work on workflow mining. Section 10 compares the various approaches and lists a number of open problems. Section 11 concludes the paper.
[1] This workshop took place on May 22nd and 23rd, 2002 in Eindhoven, The Netherlands.