Generating Automata through Active Learning (A Quick Survey of Active Automata Learning) - wcventure

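This article examines approaches to active automata learning. It covers automaton types ranging from deterministic finite automata to hybrid automata, introduces core algorithms such as Angluin's L* algorithm, and discusses their applications in areas such as protocols, smart cards, and legacy software testing.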

A Quick Survey of Active Automata Learning

Remark 1: For basic theoretical background on Angluin's L* algorithm, the interested reader can refer to this article.
Remark 2: You can also go through this article on GitHub.


CONTENTS

  1. Introduction
  2. Target Automata Types
  3. Approach
  4. Tools
  5. Application
  6. Challenge And Discussion
  7. Reference

ARTICLE

1. Introduction

This is a survey on active automata learning.

Automata learning, or model learning, aims to construct black-box state diagram models of software and hardware systems by providing inputs and observing outputs. In this article, we focus on one specific type of model, namely automata, which are crucial for understanding the behavior of many software systems. Model inference techniques can be either white-box or black-box, depending on whether they need access to the code. In this article, we discuss black-box techniques. Their advantages are that they are relatively easy to use and can also be applied in situations where we do not have access to the code or to adequate white-box tools. There is a large body of research on learning automata and state machines, which can be divided into two broad categories: learning with queries and answers (active learning), and learning only from examples (passive learning). As a final restriction, we only consider techniques for active learning, that is, techniques that accomplish their task by actively running experiments (tests) on the software. This survey mainly focuses on active automata learning; related passive learning techniques are touched on only briefly.

2. Target Automata Types

Active automata learning was originally presented for Deterministic Finite Automata (DFA), but has since been adapted to Mealy Machines, which are a better fit for learning actual reactive systems as they can encode system output in a natural way. A major and more recent increase in expressiveness is achieved with Register Automata (RA) and Büchi Automata (BA).

2.1 Deterministic Finite Automata (DFA)

A deterministic finite automaton M is a 5-tuple (Q, Σ, δ, q0, F), consisting of

  • a finite set of states (Q)
  • a finite set of input symbols called the alphabet (Σ)
  • a transition function (δ : Q × Σ → Q)
  • an initial or start state (q0 ∈ Q)
  • a set of accept states (F ⊆ Q)

Let w = a_1 a_2 … a_n be a string over the alphabet Σ. The automaton M accepts the string w if a sequence of states, r_0, r_1, …, r_n, exists in Q with the following conditions:

  1. r_0 = q0
  2. r_{i+1} = δ(r_i, a_{i+1}), for i = 0, …, n−1
  3. r_n ∈ F.

In words, the first condition says that the machine starts in the start state q0. The second condition says that given each character of string w, the machine will transition from state to state according to the transition function δ. The last condition says that the machine accepts w if the last input of w causes the machine to halt in one of the accepting states. Otherwise, it is said that the automaton rejects the string. The set of strings that M accepts is the language recognized by M and this language is denoted by L(M).
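
The three acceptance conditions translate almost directly into code. The following is a minimal sketch, not taken from any of the surveyed tools: the function name `dfa_accepts` and the example automaton `EVEN_A_DELTA` (words over {a, b} with an even number of a's) are assumptions made for illustration.

```python
def dfa_accepts(delta, q0, accepting, word):
    """Return True if the DFA (delta, q0, accepting) accepts `word`."""
    state = q0                          # condition 1: the run starts in q0
    for symbol in word:
        state = delta[(state, symbol)]  # condition 2: follow the transition function
    return state in accepting           # condition 3: the last state must be in F


# Hypothetical example automaton: even number of a's over the alphabet {a, b}.
EVEN_A_DELTA = {
    ("even", "a"): "odd",  ("even", "b"): "even",
    ("odd", "a"): "even",  ("odd", "b"): "odd",
}

print(dfa_accepts(EVEN_A_DELTA, "even", {"even"}, "abab"))
print(dfa_accepts(EVEN_A_DELTA, "even", {"even"}, "ab"))
```

Because δ is a total function, the run is uniquely determined by the word, which is why a single loop suffices.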

A deterministic finite automaton without accept states and without a starting state is known as a transition system or semiautomaton.

Related Approach

Related Approach - Title
Angluin 1987 Learning regular sets from queries and counterexamples
Rivest and Schapire 1993 Inference of Finite Automata Using Homing Sequences
Kearns and Vazirani 1994 An introduction to computational learning theory
Parekh et al. 1997 A polynomial time incremental algorithm for regular grammar inference
Denis et al. 2001 Learning regular languages using RFSAs
Bongard et al. 2005 Active Coevolutionary Learning of Deterministic Finite Automata
Isberner et al. 2014 The TTT Algorithm: A Redundancy-Free Approach to Active Automata Learning
Volpato et al. 2015 Approximate Active Learning of Nondeterministic Input Output Transition Systems

2.2 Nondeterministic Finite Automata (NFA)

In automata theory, a finite state machine is called a deterministic finite automaton (DFA), if

  • each of its transitions is uniquely determined by its source state and input symbol, and
  • reading an input symbol is required for each state transition.

A nondeterministic finite automaton (NFA), or nondeterministic finite state machine, does not need to obey these restrictions. In particular, every DFA is also an NFA. Sometimes the term NFA is used in a narrower sense, referring to an NFA that is not a DFA, but not in this article.
Using the subset construction algorithm, each NFA can be translated to an equivalent DFA, i.e. a DFA recognizing the same formal language. Like DFAs, NFAs only recognize regular languages.
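
To make the translation concrete, here is a minimal sketch of the subset construction. The representation is an assumption of this sketch: NFA transitions are a dictionary from (state, symbol) to a set of states, and `None` is used as the label for moves that read no input symbol. The example NFA `ENDS_IN_AB` (words over {a, b} ending in "ab") is hypothetical.

```python
from collections import deque

EPS = None  # label for moves that read no input symbol (an assumption of this sketch)


def eps_closure(states, delta):
    """All states reachable from `states` without reading input."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for r in delta.get((q, EPS), set()):
            if r not in closure:
                closure.add(r)
                stack.append(r)
    return frozenset(closure)


def subset_construction(alphabet, delta, start, accepting):
    """Build a DFA whose states are sets of NFA states."""
    d_start = eps_closure({start}, delta)
    d_delta, d_accepting = {}, set()
    seen, queue = {d_start}, deque([d_start])
    while queue:
        current = queue.popleft()
        if current & accepting:          # contains at least one accepting NFA state
            d_accepting.add(current)
        for a in alphabet:
            moved = set()
            for q in current:
                moved |= delta.get((q, a), set())
            target = eps_closure(moved, delta)
            d_delta[(current, a)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return d_start, d_delta, d_accepting


def dfa_run(d_start, d_delta, d_accepting, word):
    state = d_start
    for a in word:
        state = d_delta[(state, a)]
    return state in d_accepting


# Hypothetical NFA for words over {a, b} that end in "ab".
ENDS_IN_AB = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}}
dfa = subset_construction({"a", "b"}, ENDS_IN_AB, 0, {2})
print(dfa_run(*dfa, "aab"), dfa_run(*dfa, "aba"))
```

Only the reachable subsets are constructed, so the resulting DFA is often much smaller than the worst case of 2^|Q| states.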

Related Approach

Related Approach - Title
Oncina et al. 1992 Inferring Regular Languages in Polynomial Updated Time
Dupont et al. 1996 Incremental regular inference

2.3 Moore Machine

In the theory of computation, a Moore machine is a finite-state machine whose output values are determined only by its current state. This is in contrast to a Mealy machine, whose output values are determined both by its current state and by the values of its inputs.
A Moore machine can be defined as a 6-tuple (S, S_0, Σ, Λ, T, G) consisting of the following:

  • a finite set of states S
  • a start state (also called initial state) S_0 which is an element of S
  • a finite set called the input alphabet Σ
  • a finite set called the output alphabet Λ
  • a transition function T:S × Σ → S mapping a state and the input alphabet to the next state
  • an output function G:S → Λ mapping each state to the output alphabet

A Moore machine can be regarded as a restricted type of finite-state transducer.
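
As an illustration of the 6-tuple, the sketch below runs a Moore machine on an input word. The dictionaries `T` and `G` stand for the transition and output functions; the example machine `PARITY_T` / `PARITY_G`, which outputs the parity of the number of 1s read so far, is an assumption made for illustration.

```python
def moore_run(T, G, s0, inputs):
    """Return the output sequence of a Moore machine, one output per visited state."""
    state = s0
    outputs = [G[state]]            # the initial state already produces an output
    for x in inputs:
        state = T[(state, x)]
        outputs.append(G[state])    # output depends on the state only
    return outputs


# Hypothetical example: output the parity of the number of 1s seen so far.
PARITY_T = {("even", "0"): "even", ("even", "1"): "odd",
            ("odd", "0"): "odd",   ("odd", "1"): "even"}
PARITY_G = {"even": 0, "odd": 1}

print(moore_run(PARITY_T, PARITY_G, "even", "1101"))
```

Note that a word of length n yields n + 1 outputs, because the output is attached to states rather than to transitions.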

Preliminaries

  • Moore E F. Gedanken-Experiments on Sequential Machines[M]// Automata Studies. 1956:129-153.

Related Approach

Related Approach - Title
Giantamidis and Tripakis 2016 Learning Moore Machines from Input-Output Traces
Moerman et al. 2017 Learning Product Automata

2.4 Mealy Machine

In the theory of computation, a Mealy machine is a finite-state machine whose output values are determined both by its current state and the current inputs. (This is in contrast to a Moore machine, whose output values are determined solely by its current state.) A Mealy machine is a deterministic finite-state transducer: for each state and input, at most one transition is possible.
A Mealy machine is a 6-tuple (S, S_0, Σ, Λ, T, G) consisting of the following:

  • a finite set of states S
  • a start state (also called initial state) S_0 which is an element of S
  • a finite set called the input alphabet Σ
  • a finite set called the output alphabet Λ
  • a transition function T:S × Σ → S mapping pairs of a state and an input symbol to the corresponding next state.
  • an output function G:S × Σ → Λ mapping pairs of a state and an input symbol to the corresponding output symbol.

In some formulations, the transition and output functions are coalesced into a single function T:S × Σ → S × Λ.
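
For comparison with the Moore case, the following sketch runs a Mealy machine. Again `T` and `G` are plain dictionaries standing in for the transition and output functions; the example machine `EDGE_T` / `EDGE_G`, which outputs 1 whenever the current input bit differs from the previous one, is hypothetical.

```python
def mealy_run(T, G, s0, inputs):
    """Return the output sequence of a Mealy machine, one output per input symbol."""
    state = s0
    outputs = []
    for x in inputs:
        outputs.append(G[(state, x)])   # output depends on state and current input
        state = T[(state, x)]
    return outputs


# Hypothetical example: emit 1 when the input bit differs from the previous bit.
EDGE_T = {("last0", "0"): "last0", ("last0", "1"): "last1",
          ("last1", "0"): "last0", ("last1", "1"): "last1"}
EDGE_G = {("last0", "0"): 0, ("last0", "1"): 1,
          ("last1", "0"): 1, ("last1", "1"): 0}

print(mealy_run(EDGE_T, EDGE_G, "last0", "0110"))
```

Here a word of length n yields exactly n outputs, which matches the request/response behavior of many reactive systems.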

Preliminaries

  • Mealy G H. A method for synthesizing sequential circuits[J]. Bell System Technical Journal, 1955, 34(5):1045-1079.

Related Approach

Related Approach - Title
Shahbaz et al. 2009 Inferring Mealy Machines
Aarts et al. 2010 Learning I/O Automata
Steffen et al. 2011 Introduction to Active Automata Learning from a Practical Perspective

2.5 Register Automata (RA)

Register Automata are an extension of finite automata with data from infinite domains and are, e.g., well-suited for describing communication protocols. Register Automata are defined as follows:

Definition 1. Let a symbolic input be a pair (a, p̄) of a parameterized input a of arity k and a sequence of symbolic parameters p̄ = <p1, …, pk>. Let further X = {x1, …, xm} be a finite set of registers. A guard is a conjunction of equalities and negated equalities, e.g., pi != xj, over formal parameters and registers. An assignment is a partial mapping ρ : X → X ∪ P for a set P of formal parameters.

Definition 2. A Register Automaton (RA) is a tuple A* = (A, L, l0, X, Γ, λ), where

  • A is a finite set of actions.
  • L is a finite set of locations.
  • l0 ∈ L is the initial location.
  • X is a finite set of registers.
  • Γ is a finite set of transitions, each of which is of the form <l, (a, p̄), g, ρ, l’>, where l is the source location, l’ is the target location, (a, p̄) is a parameterized action, g is a guard, and ρ is an assignment.
  • λ : L → {+, -} maps each location to either + (accept) or - (reject).

Let us define the semantics of an RA A* = (A, L, l0, X, Γ, λ). An X-valuation, denoted by v, is a (partial) mapping from X to the data domain D. A state of A* is a pair <l, v> where l ∈ L and v is an X-valuation. The initial state is <l0, v0>, i.e., the pair of initial location and empty valuation.

A step of A*, denoted by <l, v> -(a,d̄)→ <l’, v’>, transfers A* from <l, v> to <l’, v’> on input (a, d̄) if there is a transition <l, (a, p̄), g, ρ, l’> ∈ Γ such that (1) g is satisfied by d̄ and v, i.e., it becomes true when replacing all pi by di and all xi by v(xi), and such that (2) v’ is the updated X-valuation, where v’(xi) = v(xj) wherever ρ(xi) = xj, and v’(xi) = dj wherever ρ(xi) = pj.
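
The step semantics can be sketched in a few lines of code. Everything in this example is an assumption made for illustration: guards are encoded as Python predicates over the data values and the valuation, an assignment maps a register to ("p", index) or ("x", register), registers outside the domain of the assignment are left undefined in the new valuation (a literal reading of the definition), and a word with no enabled transition is rejected. The automaton `LOGIN_RA` is a hypothetical toy protocol in which a login succeeds only with the previously registered value.

```python
def ra_step(transitions, location, valuation, action, data):
    """Take one step <l, v> -> <l', v'> on input (action, data), or return None."""
    for src, act, guard, rho, dst in transitions:
        if src == location and act == action and guard(data, valuation):
            new_valuation = {}
            for reg, (kind, ref) in rho.items():
                # rho(x_i) = p_j copies a data value, rho(x_i) = x_j copies a register
                new_valuation[reg] = data[ref] if kind == "p" else valuation[ref]
            return dst, new_valuation
    return None


def ra_accepts(transitions, labels, l0, word):
    location, valuation = l0, {}        # initial state: l0 with the empty valuation
    for action, data in word:
        nxt = ra_step(transitions, location, valuation, action, data)
        if nxt is None:                 # assumption: missing transitions reject
            return False
        location, valuation = nxt
    return labels[location] == "+"


# Hypothetical toy protocol: register(p) stores p, login(p) succeeds iff p == x1.
LOGIN_RA = [
    ("l0", "register", lambda d, v: True,            {"x1": ("p", 0)},    "l1"),
    ("l1", "login",    lambda d, v: d[0] == v["x1"], {"x1": ("x", "x1")}, "l2"),
    ("l1", "login",    lambda d, v: d[0] != v["x1"], {"x1": ("x", "x1")}, "l1"),
]
LABELS = {"l0": "-", "l1": "-", "l2": "+"}

print(ra_accepts(LOGIN_RA, LABELS, "l0", [("register", (7,)), ("login", (7,))]))
```

The guards only compare values for equality, so the behavior is the same for any data domain, which is what makes a finite representation of such protocols possible.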

Related Approach

Related Approach - Title
Howar et al. 2012 Inferring Canonical Register Automata
Cassel et al. 2014 Active learning for extended finite state machines
Aarts et al. 2015 Learning Register Automata with Fresh Value Generation

2.6 Büchi Automata

In computer science and automata theory, a Büchi automaton is a type of ω-automaton, which extends a finite automaton to infinite inputs. It accepts an infinite input sequence if there exists a run of the automaton that visits (at least) one of the final states infinitely often. Büchi automata recognize the omega-regular languages, the infinite word version of regular languages. It is named after the Swiss mathematician Julius Richard Büchi who invented this kind of automaton in 1962.
Büchi automata are often used in model checking as an automata-theoretic version of a formula in linear temporal logic.

Formally, a deterministic Büchi automaton is a tuple A = (Q, Σ, δ, q0, F) that consists of the following components:

  • Q is a finite set. The elements of Q are called the states of A.
  • Σ is a finite set called the alphabet of A.
  • δ: Q × Σ → Q is a function, called the transition function of A.
  • q0 is an element of Q, called the initial state of A.
  • F⊆Q is the set of accepting states. A accepts exactly those runs in which at least one of the infinitely often occurring states is in F.

In a non-deterministic Büchi automaton, the transition function δ is replaced with a transition relation Δ that returns a set of states, and the single initial state q0 is replaced by a set I of initial states. Generally, the term Büchi automaton without qualifier refers to non-deterministic Büchi automata.
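
Infinite words cannot be fed to a program directly, but ultimately periodic words of the form u·v^ω (a finite prefix u followed by a loop v repeated forever) can. The sketch below checks acceptance of such a word by a deterministic Büchi automaton; the function name, the dictionary encoding of δ, and the example automaton `INF_A` (infinitely many a's over {a, b}) are assumptions made for illustration.

```python
def dba_accepts_lasso(delta, q0, accepting, prefix, loop):
    """Does the deterministic Büchi automaton accept prefix · loop^ω ?"""
    if not loop:
        raise ValueError("the loop part must be non-empty")
    state = q0
    for a in prefix:
        state = delta[(state, a)]
    seen_at = {}   # state at the start of a loop iteration -> iteration index
    hits = []      # hits[i]: did iteration i visit an accepting state?
    while state not in seen_at:
        seen_at[state] = len(hits)
        hit = False
        for a in loop:
            state = delta[(state, a)]
            hit = hit or state in accepting
        hits.append(hit)
    # Iterations from seen_at[state] onwards repeat forever, so exactly the
    # states visited in them occur infinitely often in the run.
    return any(hits[seen_at[state]:])


# Hypothetical example: words over {a, b} containing infinitely many a's.
INF_A = {("q0", "a"): "q1", ("q0", "b"): "q0",
         ("q1", "a"): "q1", ("q1", "b"): "q0"}

print(dba_accepts_lasso(INF_A, "q0", {"q1"}, "bbb", "ab"))
print(dba_accepts_lasso(INF_A, "q0", {"q1"}, "aaa", "b"))
```

Since the automaton is deterministic and has finitely many states, the state at the start of a loop iteration must eventually repeat, so the `while` loop terminates after at most |Q| iterations.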

Preliminaries

  • Büchi J R. On a Decision Method in Restricted Second Order Arithmetic[M]// The Collected Works of J. Richard Büchi. Springer New York, 1990:511-8.
  • Calbrix H, Nivat M, Podelski A. Ultimately periodic words of rational ω-languages[J]. Comptes Rendus de l'Académie des Sciences - Series I - Mathematics, 1993, 802(5):554-566.
  • Farwer B. ω-Automata[M]// Automata Logics, and Infinite Games. Springer Berlin Heidelberg, 2002:3-21.

Related Approach

Related Approach - Title
Maler and Pnueli 1995 On the learnability of infinitary regular sets
Farzan et al. 2008 Extending Automated Compositional Verification to the Full Class of Omega-Regular Languages
Angluin et al. 2014 Learning Regular Omega Languages
Li et al. 2017 A Novel Learning Algorithm for Büchi Automata Based on Family of DFAs and Classification Trees

2.7 Nominal Automata

Nominal automata are automata for infinite alphabets that use the notion of nominal sets.
Consider an infinite alphabet A = {a, b, c, d, … } and the language L1 = {aa, bb, cc, dd, …} of words consisting of the same letter twice. The classical theory of finite automata does not apply to this kind of language, but one may draw an infinite deterministic automaton that recognizes L1. This automaton ostensibly has infinitely many states, but the set of states can be finitely presented in a way open to effective manipulation. More specifically, in a nominal automaton the set of states is subject to an action of permutations of a set of atoms, and it is finite up to that action.
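
One informal way to picture the finite presentation is to write the infinite automaton for L1 as a program. In the sketch below (an illustration under our own encoding, not the formal nominal-set construction), the states are "init", "accept", "sink", and one state ("seen", a) for every atom a. The states of the form ("seen", a) are infinitely many, but they are all images of one another under renaming of atoms, so up to permutation there are only four kinds of state.

```python
def nominal_step(state, letter):
    """Transition function of the (infinite) deterministic automaton for L1."""
    if state == "init":
        return ("seen", letter)            # remember the first letter
    if isinstance(state, tuple):           # state ("seen", a)
        return "accept" if letter == state[1] else "sink"
    return "sink"                          # "accept" and "sink" both move to "sink"


def accepts_L1(word):
    state = "init"
    for letter in word:
        state = nominal_step(state, letter)
    return state == "accept"


def rename(word, permutation):
    """Apply a (partial) permutation of atoms to a word."""
    return [permutation.get(letter, letter) for letter in word]


print(accepts_L1(["a", "a"]), accepts_L1(["a", "b"]))
# Renaming atoms does not change acceptance (the automaton is equivariant):
print(accepts_L1(rename(["a", "a"], {"a": "z", "z": "a"})))
```

The transition function only ever compares letters for equality, which is why renaming the atoms consistently never changes the outcome.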

Preliminaries

  • Bojańczyk M, Klin B, Lasota S. Automata theory in nominal sets[J]. Logical Methods in Computer Science, 2014, 10(3).

Related Approach

Related Approach - Title
Moerman et al. 2017 Learning nominal automata

2.8 Timed Automata

In automata theory, a timed automaton is a finite automaton extended with a finite set of real-valued clocks. During a run of a timed automaton, all clock values increase at the same speed. Along the transitions of the automaton, clock values can be compared to integers. These comparisons form guards that may enable or disable transitions and by doing so constrain the possible behaviors of the automaton. Further, clocks can be reset. Timed automata are a subclass of hybrid automata.

Formally, a timed automaton is a tuple A = (Q,Σ,C,E,q0) that consists of the following components:

  • Q is a finite set. The elements of Q are called the states of A.
  • Σ is a finite set called the alphabet or actions of A.
  • C is a finite set called the clocks of A.
  • E ⊆ Q × Σ × B(C) × P(C) × Q is a set of edges, called transitions of A, where
    • B(C) is the set of boolean clock constraints involving clocks from C, and
    • P(C) is the powerset of C.
  • q0 is an element of Q, called the initial state.
    An edge (q,a,g,r,q’) from E is a transition from state q to q’ with action a, guard g and clock resets r.
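
The interplay of delays, guards, and resets can be sketched as follows. This is a minimal illustration under several assumptions of our own: a timed word is a list of (delay, action) pairs, guards are Python predicates over the clock valuation, the first enabled edge is taken, and the run returns `None` when no edge is enabled. The example `DOUBLE_CLICK` is hypothetical.

```python
def run_timed(edges, clocks, q0, timed_word):
    """Run a timed automaton on a list of (delay, action) pairs."""
    state = q0
    valuation = {c: 0.0 for c in clocks}
    for delay, action in timed_word:
        # Time passes: all clocks advance at the same speed.
        valuation = {c: t + delay for c, t in valuation.items()}
        for src, act, guard, resets, dst in edges:
            if src == state and act == action and guard(valuation):
                for c in resets:        # clocks in the reset set go back to 0
                    valuation[c] = 0.0
                state = dst
                break
        else:
            return None                 # no enabled edge: the run is stuck
    return state


# Hypothetical example: two presses within 1 time unit form a double click.
DOUBLE_CLICK = [
    ("idle",  "press", lambda v: True,        {"x"}, "armed"),
    ("armed", "press", lambda v: v["x"] <= 1, set(), "double"),
    ("armed", "press", lambda v: v["x"] > 1,  {"x"}, "armed"),
]

print(run_timed(DOUBLE_CLICK, {"x"}, "idle", [(0.5, "press"), (0.4, "press")]))
print(run_timed(DOUBLE_CLICK, {"x"}, "idle", [(0.5, "press"), (2.0, "press")]))
```

The same sequence of actions leads to different states depending only on the delays, which is exactly the behavior an untimed automaton cannot express.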

Preliminaries

  • Alur R, Dill D L. A theory of timed automata[M]. Elsevier Science Publishers Ltd. 1994.
  • Bengtsson J, Yi W. Timed Automata: Semantics, Algorithms and Tools[J]. Lectures on Concurrency & Petri Nets, 2004, 3098:87-124.

Related Approach

Related Approach - Title
Maier et al. 2014 Online passive learning of timed automata for cyber-physical production systems

2.9 Weighted Automata

Weighted finite automata (WFA) are finite automata whose transitions and states are augmented with some weights, elements of a semiring. A WFA induces a function over strings. The value it assigns to an input string is the semiring sum of the weights of all paths labeled with that string, where the weight of a path is obtained by taking the semiring product of the weights of its constituent transitions, as well as those of its origin and destination states.
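
For the familiar semiring of numbers with ordinary addition and multiplication, the sum over all paths can be computed with vector-matrix products instead of enumerating paths. The sketch below assumes that representation (initial weights, final weights, and one transition matrix per symbol); the example automaton `COUNT_A`, whose value on a word is the number of a's it contains, is hypothetical.

```python
def wfa_value(initial, final, transitions, word):
    """Value assigned to `word` by a WFA over the (+, ×) semiring of numbers."""
    n = len(initial)
    vec = list(initial)                 # weight of being in each state so far
    for a in word:
        m = transitions[a]
        # Semiring product along transitions, semiring sum over predecessor states.
        vec = [sum(vec[i] * m[i][j] for i in range(n)) for j in range(n)]
    return sum(vec[j] * final[j] for j in range(n))


# Hypothetical example: the value of a word is its number of a's.
# State 0 = "no a chosen yet", state 1 = "one a chosen"; every a offers one
# new path from state 0 to state 1, so the paths are counted by the sum.
COUNT_A = {
    "a": [[1, 1], [0, 1]],
    "b": [[1, 0], [0, 1]],
}

print(wfa_value([1, 0], [0, 1], COUNT_A, "abaa"))
```

Replacing `sum` and `*` by the operations of another semiring (for example min and + for shortest paths) gives the corresponding weighted automaton without changing the structure of the computation.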

Preliminaries

  • Mohri M. Weighted Finite-State Transducer Algorithms. An Overview[M]// Formal Languages and Applications. Springer Berlin Heidelberg, 2004:551-563.
  • Mohri M. Weighted Automata Algorithms[M]// Handbook of Weighted Automata. 2009:213-254.

Related Approach

Related Approach - Title
Balle et al. 2015 Learning Weighted Automata

2.10 Hybrid Automata

In automata theory, a hybrid automaton (plural: hybrid automata or hybrid automatons) is a mathematical model for precisely describing systems in which digital computational processes interact with analog physical processes. A hybrid automaton is a finite state machine with a finite set of continuous variables whose values are described by a set of ordinary differential equations. This combined specification of discrete and continuous behaviors enables dynamic systems that comprise both digital and analog components to be modeled and analyzed.

An Alur-Henzinger hybrid automaton H comprises the following components:

  • A finite set X = {x_1, …, x_n} of real-numbered variables. The number n is called the dimension of H. Let dot(X) be the set {dot(x_1), . . . , dot(x_n)} of dotted variables that represent first derivatives during continuous change, and let X’ be the set {x’_1, …, x’_n} of primed variables that represent values at the conclusion of discrete change.
  • A finite multidigraph (V, E). The vertices in V are called control modes. The edges in E are called control switches.
  • Three vertex labeling functions init, inv, and flow that assign to each control mode v ∈ V three predicates. Each initial condition init(v) is a predicate whose free variables are from X. Each invariant condition inv(v) is a predicate whose free variables are from X. Each flow condition flow(v) is a predicate whose free variables are from X∪dot(X).

So this is a labeled multidigraph.

  • An edge labeling function jump that assigns to each control switch e ∈ E a predicate. Each jump condition jump(e) is a predicate whose free variables are from X∪X’.
  • A finite set Σ of events, and an edge labeling function event: E → Σ that assigns to each control switch an event.
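
A classic illustration of these components is a thermostat with one variable x (the temperature) and two control modes. The sketch below simulates it with a simple Euler scheme. All concrete numbers, the flow equations, and the switching thresholds are assumptions chosen for illustration, and the simulation takes a control switch as soon as its jump condition holds, which is a simplification of the general semantics with invariants.

```python
def simulate_thermostat(x0=20.0, mode="off", dt=0.01, steps=1000):
    """Euler simulation of a two-mode thermostat; returns a list of (mode, x)."""
    # flow(v): first derivative of x in each control mode (assumed dynamics)
    flow = {"on": lambda x: 5.0 - 0.1 * x, "off": lambda x: -0.1 * x}
    # jump(e): condition of the control switch leaving each mode, and its target
    jump = {"on": (lambda x: x >= 22.0, "off"),
            "off": (lambda x: x <= 18.0, "on")}
    x, trace = x0, [(mode, x0)]
    for _ in range(steps):
        x += dt * flow[mode](x)          # continuous change within the mode
        guard, target = jump[mode]
        if guard(x):                     # discrete change; x keeps its value (x' = x)
            mode = target
        trace.append((mode, x))
    return trace


trace = simulate_thermostat()
print(min(x for _, x in trace), max(x for _, x in trace))
```

The discrete part (which mode is active) and the continuous part (how x evolves) influence each other in both directions, which is the defining feature of a hybrid automaton.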

Preliminaries

  • Henzinger T A. The Theory of Hybrid Automata[M]// Verification of Digital and Hybrid Systems. Springer Berlin Heidelberg, 2000:278-292.

Related Approach

Related Approach - Title
Medhat et al. 2015 A framework for mining hybrid automata from input/output traces

2.11 Symbolic Automata (Sigma3 TBD)

The following content comes from http://pages.cs.wisc.edu/~loris/symbolicautomata.html

Classic automata theory builds on the assumption that the alphabet is finite. Unfortunately, practical applications such as XML processing and program trace analysis use values for individual symbols that are typically drawn from an infinite domain. Even when the alphabet is finite, classic automata may sometimes be a bad choice: for example, a deterministic finite automaton modelling a language over the UTF16 alphabet requires 2^16 transitions out of each state!

What are Symbolic Automata and Transducers?

Symbolic Finite Automata (SFAs) are finite state automata in which the alphabet is given by a Boolean algebra that may have an infinite domain, and transitions are labeled with first-order predicates over such an algebra. For example, a symbolic automaton can define the following property:

OddG1 = {l | l is a list of odd numbers with length greater than 1}

In order for SFAs to be closed under Boolean operations and preserve decidability of equivalence, it should be decidable to check whether predicates in the algebra are satisfiable. In the example above, predicates are expressed in Presburger arithmetic, which is indeed a decidable theory closed under Boolean operations. Symbolic Finite Transducers (SFTs) extend SFAs to output lists. In an SFT, a transition, upon reading an input symbol, can compute an output that is expressed as a function of the input being read. Such a function has to belong to the underlying alphabet theory. Many variants of SFAs and SFTs have been proposed, and this page tries to keep up with such extensions.
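
The OddG1 property can be sketched as a three-state symbolic automaton whose transitions all carry the predicate "is odd". In the code below, predicates are ordinary Python functions rather than formulas of a decidable theory, so the sketch only covers running the automaton, not the Boolean operations or equivalence checks discussed above. The names `sfa_accepts` and `ODD_G1`, and the choice to reject when no transition is enabled, are assumptions made for illustration.

```python
def sfa_accepts(transitions, q0, accepting, items):
    """Run a deterministic symbolic automaton on a list of values."""
    state = q0
    for item in items:
        for src, predicate, dst in transitions:
            if src == state and predicate(item):   # the label is a predicate, not a symbol
                state = dst
                break
        else:
            return False                           # assumption: no enabled transition rejects
    return state in accepting


def is_odd(n):
    return n % 2 == 1


# States count how many odd numbers have been read: 0, 1, or "at least 2".
ODD_G1 = [(0, is_odd, 1), (1, is_odd, 2), (2, is_odd, 2)]

print(sfa_accepts(ODD_G1, 0, {2}, [1, 3]))
print(sfa_accepts(ODD_G1, 0, {2}, [1, 2, 3]))
```

Three transitions are enough to cover the infinite domain of integers, whereas a classic automaton would need one transition per symbol.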

How do they relate to classic Automata?

Symbolic Finite Automata are strictly more expressive than deterministic finite automata. Despite this fact, Symbolic Finite Automata are closed under Boolean operations and admit decidable equivalence. In general, for large alphabets, Symbolic Automata outperform their classic counterparts. In fact, even complex regular expressions over UTF16 can be analyzed using symbolic automata.

References

We recommend reading this paper to get started. You can also watch this talk. The purpose of this page is to keep track of the latest results related to this topic. Email me (loris at cs.wisc.edu) with comments and/or suggested additions.

Decision Problems and Closure Properties
  • Symbolic Finite State Transducers: Algorithms and Applications, N. Bjorner, P. Hooimeijer, B. Livshits, D. Molnar, M. Veanes, POPL12