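Theory of Computation Notes (1): DFA, NFA, Regex

This article introduces the basic concepts of formal languages and automata theory, including symbols, strings, and languages. It details the definitions of deterministic and non-deterministic finite automata and the conversion between them, and discusses how regular expressions relate to these automata.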

Problems and languages

An alphabet: a finite set of symbols
  • $\Sigma=\{0,1\}$
  • $\Sigma=\{a,b,c,\dots,z\}$
A string: a finite sequence of symbols from some alphabet $\Sigma$
Length of a string $w$: the number of symbols in $w$, denoted $|w|$
  • If $|w|=0$, $w$ is the empty string $e$
$\Sigma^i$: the set of all strings of length $i$ over $\Sigma$
  • $\Sigma=\{0,1\}$
  • $\Sigma^0=\{e\}$
  • $\Sigma^1=\{0,1\}$
  • $\Sigma^2=\{00,01,10,11\}$
  • $\cdots$
$\Sigma^+$: the set of all non-empty strings over $\Sigma$
$\Sigma^*$: the set of all strings over $\Sigma$
  • $\Sigma^*=\Sigma^+\cup\{e\}$
Concatenation: $vw$
String exponentiation: $w^{i+1}=ww^i$, $w^0=e$, defined by induction
Reversal: $w=a_1a_2\cdots a_n$, $w^R=a_na_{n-1}\cdots a_1$
  • $|w|=0$: $w=w^R=e$
  • $|w|\ge1$: $w=ua$ $(u\in\Sigma^*,a\in\Sigma)$, $w^R=au^R$
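The inductive definitions above translate almost literally into recursive functions. The following is a minimal sketch, assuming strings are Python `str` values and the empty string $e$ is `""`; the names `power`, `reverse`, and `sigma_i` are illustrative and not part of the notes.

```python
from itertools import product

def power(w: str, i: int) -> str:
    """w^i by induction: w^0 = e and w^(i+1) = w w^i."""
    if i == 0:
        return ""  # the empty string e
    return w + power(w, i - 1)

def reverse(w: str) -> str:
    """w^R by induction: e^R = e and (ua)^R = a u^R."""
    if len(w) == 0:
        return ""
    u, a = w[:-1], w[-1]  # split w = ua, where a is the last symbol
    return a + reverse(u)

def sigma_i(sigma: set, i: int) -> set:
    """Sigma^i: all strings of length i over the alphabet sigma."""
    return {"".join(t) for t in product(sorted(sigma), repeat=i)}
```

For instance, `sigma_i({"0", "1"}, 2)` enumerates the four strings of $\Sigma^2$ listed above.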
A language: a set of strings over some alphabet $\Sigma$
Any decision problem $\iff$ a problem about a certain language (deciding whether a given string belongs to it)
A Deterministic Finite Automaton (DFA): a 5-tuple $(K,\Sigma,\delta,s,F)$
  • $K$: a finite set of states
  • $\Sigma$: a finite set of input symbols
  • $\delta$: transition function, $K\times\Sigma\to K$
  • $s\in K$: initial state
  • $F\subseteq K$: a set of final states
A configuration of a DFA $M=(K,\Sigma,\delta,s,F)$: an element of $K\times\Sigma^*$
Yields in one step: $(q,w)\vdash_M(q',w')$ if $w=aw'$ $(a\in\Sigma)$ and $\delta(q,a)=q'$
  • $(q,w)\vdash_M^*(q',w')$: yields in zero or more steps (the reflexive, transitive closure of $\vdash_M$)
$M$ accepts $w\in\Sigma^*$ if $(s,w)\vdash_M^*(q,e)$ for some $q\in F$
$L(M)$: the set of all $w\in\Sigma^*$ accepted by $M$; it is uniquely determined by $M$
$M$ accepts a language $L$ iff
  • $L$ contains every string accepted by $M$
  • every string in $L$ is accepted by $M$
A language is regular if it is accepted by some DFA
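Acceptance by a DFA amounts to following $\delta$ one symbol at a time and checking whether the last state is final. Below is a minimal sketch, assuming $\delta$ is stored as a dictionary keyed by `(state, symbol)`; the example automaton `EVEN_ONES` (binary strings with an even number of 1s) is a hypothetical illustration, not taken from the notes.

```python
def dfa_accepts(delta, s, F, w):
    """Return True iff (s, w) yields (q, e) for some q in F."""
    q = s
    for a in w:
        q = delta[(q, a)]  # one "yields in one step"
    return q in F

# Hypothetical example: strings over {0, 1} with an even number of 1s.
EVEN_ONES = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "odd",   ("odd", "1"): "even",
}
```

With initial state `"even"` and final states `{"even"}`, the call `dfa_accepts(EVEN_ONES, "even", {"even"}, "1010")` follows the configurations of the run exactly as in the definition of $\vdash_M$.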
Regular operations
  • Union: $A\cup B=\{w:w\in A\lor w\in B\}$
  • Intersection: $A\cap B=\{w:w\in A\land w\in B\}$
  • Complement: $\overline{A}=\{w:w\in\Sigma^*-A\}$
  • Concatenation: $A\circ B=\{ab:a\in A\land b\in B\}$
  • Star: $A^*=\{w_1w_2\dots w_n:w_i\in A,n\in N\}$, $e\in A^*$
Theorem: If $A$ and $B$ are regular, so is $A\cup B$ (by DFA)
  • Idea: there exist DFAs $M_1=(K_1,\Sigma,\delta_1,s_1,F_1)$ and $M_2=(K_2,\Sigma,\delta_2,s_2,F_2)$ accepting $A$ and $B$, respectively; construct $M_3=(K_3,\Sigma,\delta_3,s_3,F_3)$ accepting $A\cup B$
  • $K_3=K_1\times K_2$
  • $\delta_3:K_3\times\Sigma\to K_3$, with $\delta_3((q_1,q_2),a)=(\delta_1(q_1,a),\delta_2(q_2,a))=(q_1',q_2')$
  • $s_3=(s_1,s_2)$
  • $F_3=\{(q_1,q_2)\in K_1\times K_2:q_1\in F_1\lor q_2\in F_2\}$
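The product construction can be sketched directly from the four bullets above. This is an illustrative sketch, assuming a DFA is a Python tuple `(K, sigma, delta, s, F)` with `delta` a dictionary; the two example machines `ENDS_IN_0` and `EVEN_ONES` are hypothetical.

```python
from itertools import product

def union_dfa(M1, M2):
    """Build M3 on K1 x K2 that runs M1 and M2 in lockstep."""
    K1, sigma, d1, s1, F1 = M1
    K2, _, d2, s2, F2 = M2
    K3 = set(product(K1, K2))
    d3 = {((q1, q2), a): (d1[q1, a], d2[q2, a])
          for (q1, q2) in K3 for a in sigma}
    # Final if either component is final (use "and" for intersection).
    F3 = {(q1, q2) for (q1, q2) in K3 if q1 in F1 or q2 in F2}
    return K3, sigma, d3, (s1, s2), F3

def accepts(M, w):
    _, _, delta, q, F = M
    for a in w:
        q = delta[q, a]
    return q in F

# Hypothetical examples over {0, 1}.
ENDS_IN_0 = ({"n", "z"}, {"0", "1"},
             {("n", "0"): "z", ("n", "1"): "n",
              ("z", "0"): "z", ("z", "1"): "n"}, "n", {"z"})
EVEN_ONES = ({"even", "odd"}, {"0", "1"},
             {("even", "0"): "even", ("even", "1"): "odd",
              ("odd", "0"): "odd", ("odd", "1"): "even"}, "even", {"even"})
UNION = union_dfa(ENDS_IN_0, EVEN_ONES)
```

Replacing `or` with `and` in the definition of `F3` gives a DFA for $A\cap B$ on the same state set, which is one way to see closure under intersection.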
Non-deterministic Finite Automaton (NFA)
  • several choices for the next state
  • may switch states without reading any input symbol
Definition of NFA: a 5-tuple $(K,\Sigma,\Delta,s,F)$
  • $K$: a finite set of states
  • $\Sigma$: a finite set of input symbols
  • $\Delta$: transition relation, a subset of $K\times(\Sigma\cup\{e\})\times K$
  • $s\in K$: initial state
  • $F\subseteq K$: a set of final states
A configuration of an NFA $N=(K,\Sigma,\Delta,s,F)$: an element of $K\times\Sigma^*$
Yields in one step: $(q,w)\vdash_N(q',w')$ if $w=aw'$ $(a\in\Sigma\cup\{e\})$ and $(q,a,q')\in\Delta$
  • $(q,w)\vdash_N^*(q',w')$: yields in zero or more steps
$N$ accepts $w\in\Sigma^*$ if $(s,w)\vdash_N^*(q,e)$ for some $q\in F$
$L(N)$: the set of all $w\in\Sigma^*$ accepted by $N$; it is uniquely determined by $N$
A DFA is a special case of an NFA: the function $(q,a)\mapsto\delta(q,a)$ can be viewed as the relation $\{(q,a,\delta(q,a)):q\in K,a\in\Sigma\}$
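One way to decide acceptance by an NFA is to track the set of all states reachable on the input read so far. The sketch below assumes $\Delta$ is a Python set of triples `(q, a, p)` and that the label `""` stands for $e$; the example relation `DELTA` (strings over $\{a,b\}$ ending in $ab$, with one $e$-move from the start) is hypothetical.

```python
def e_closure(Delta, states):
    """All states reachable from `states` using only e-transitions."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for (p, a, r) in Delta:
            if p == q and a == "" and r not in closure:
                closure.add(r)
                stack.append(r)
    return closure

def nfa_accepts(Delta, s, F, w):
    """True iff some computation of the NFA on w ends in a final state."""
    current = e_closure(Delta, {s})
    for a in w:
        moved = {r for (p, b, r) in Delta if p in current and b == a}
        current = e_closure(Delta, moved)
    return bool(current & F)

# Hypothetical NFA: strings over {a, b} that end in "ab".
DELTA = {("s", "", "q0"),
         ("q0", "a", "q0"), ("q0", "b", "q0"),
         ("q0", "a", "q1"), ("q1", "b", "q2")}
```

The non-determinism shows up in `DELTA` as two arcs labeled `a` leaving `q0`; the simulation simply keeps both targets.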
Theorem: For any NFA $N=(K,\Sigma,\Delta,s,F)$, there exists a DFA $M=(K',\Sigma,\delta,s',F')$ s.t. $L(N)=L(M)$
  • $K'=\mathcal{P}(K)$
  • $\forall q\in K$, $E(q)=\{p\in K:(q,e)\vdash_N^*(p,e)\}$
  • $\forall Q\subseteq K$, $E(Q)=\displaystyle\bigcup_{q\in Q}E(q)$
  • $\delta:K'\times\Sigma\to K'$, with $\delta(Q,a)=\displaystyle\bigcup_{q\in Q}E(\{p\in K:(q,a,p)\in\Delta\})$
  • $s'=E(s)$
  • $F'=\{Q\subseteq K:Q\cap F\neq\varnothing\}$
  • Claim: $\forall p,q\in K$, $\forall w\in\Sigma^*$, $(p,w)\vdash_N^*(q,e)$ iff $(E(p),w)\vdash_M^*(Q,e)$ for some $Q$ containing $q$
  • $M$ accepts $w\iff N$ accepts $w$:
    • $N$ accepts $w$ iff $(s,w)\vdash_N^*(q,e)$ for some $q\in F$
    • by the claim, iff $(E(s),w)\vdash_M^*(Q,e)$ for some $Q$ with $q\in Q$
    • note: some $q\in F$ lies in $Q$ $\iff Q\cap F\neq\varnothing\iff Q\in F'$
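The subset construction can be sketched as follows. One deliberate deviation, stated as an assumption: instead of materializing all of $\mathcal{P}(K)$, this sketch only builds the subsets reachable from $s'=E(s)$, which accepts the same language. $\Delta$ is a set of triples with `""` standing for $e$, and the example `DELTA` is hypothetical.

```python
def e_closure(Delta, states):
    """E(Q): states reachable from Q by e-transitions only."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for (p, a, r) in Delta:
            if p == q and a == "" and r not in closure:
                closure.add(r)
                stack.append(r)
    return closure

def subset_construction(sigma, Delta, s, F):
    """Return (K', delta, s', F') restricted to reachable subsets."""
    start = frozenset(e_closure(Delta, {s}))
    delta, seen, todo = {}, {start}, [start]
    while todo:
        Q = todo.pop()
        for a in sigma:
            moved = {r for (p, b, r) in Delta if p in Q and b == a}
            R = frozenset(e_closure(Delta, moved))
            delta[(Q, a)] = R
            if R not in seen:
                seen.add(R)
                todo.append(R)
    F_prime = {Q for Q in seen if Q & F}  # Q intersects F
    return seen, delta, start, F_prime

def dfa_accepts(delta, s, F, w):
    q = s
    for a in w:
        q = delta[(q, a)]
    return q in F

# Hypothetical NFA: strings over {a, b} that end in "ab".
DELTA = {("s", "", "q0"),
         ("q0", "a", "q0"), ("q0", "b", "q0"),
         ("q0", "a", "q1"), ("q1", "b", "q2")}
```

Each DFA state is a `frozenset` of NFA states, so it can be used as a dictionary key. The empty set acts as a dead state when no move is possible.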
Theorem: If $A$ and $B$ are regular, so is $A\circ B$
  • Idea: $\exists N_1(K_1,\Sigma,\Delta_1,s_1,F_1),N_2(K_2,\Sigma,\Delta_2,s_2,F_2)$ accepting $A,B$, respectively; construct $N_3(K_3,\Sigma,\Delta_3,s_3,F_3)$ accepting $A\circ B$
  • $K_3=K_1\cup K_2$
  • $\Delta_3=\Delta_1\cup\Delta_2\cup\{(q,e,s_2):q\in F_1\}$
  • $s_3=s_1$
  • $F_3=F_2$
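A sketch of the concatenation construction, assuming an NFA is a tuple `(K, sigma, Delta, s, F)`, the label `""` stands for $e$, and the two state sets are disjoint (rename states first if they are not). The helper `accepts` and the example machines `N_A` (accepting $\{a\}$) and `N_B` (accepting $b^*$) are hypothetical.

```python
def concat_nfa(N1, N2):
    """Link every final state of N1 to the start of N2 with an e-arc."""
    K1, sigma, D1, s1, F1 = N1
    K2, _, D2, s2, F2 = N2
    assert not (K1 & K2), "state sets are assumed to be disjoint"
    D3 = D1 | D2 | {(q, "", s2) for q in F1}
    return K1 | K2, sigma, D3, s1, F2

def accepts(N, w):
    """Set-based NFA simulation used only to try the construction."""
    _, _, Delta, s, F = N
    def closure(S):
        S = set(S)
        while True:
            T = S | {r for (p, a, r) in Delta if p in S and a == ""}
            if T == S:
                return S
            S = T
    cur = closure({s})
    for c in w:
        cur = closure({r for (p, a, r) in Delta if p in cur and a == c})
    return bool(cur & F)

# Hypothetical examples: L(N_A) = {a}, L(N_B) = b*.
N_A = ({"p0", "p1"}, {"a", "b"}, {("p0", "a", "p1")}, "p0", {"p1"})
N_B = ({"r0"}, {"a", "b"}, {("r0", "b", "r0")}, "r0", {"r0"})
N_AB = concat_nfa(N_A, N_B)  # should accept a b*
```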
Theorem: If $A$ is regular, so is $A^*$
  • Idea: $\exists N(K,\Sigma,\Delta,s,F)$ accepting $A$; construct $N'(K',\Sigma,\Delta',s',F')$ accepting $A^*$, where $s'\notin K$ is a new initial state
  • $K'=K\cup\{s'\}$
  • $\Delta'=\Delta\cup\{(s',e,s)\}\cup\{(q,e,s):q\in F\}$
  • $F'=F\cup\{s'\}$
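A sketch of the star construction, under the same assumed representation: an NFA is a tuple `(K, sigma, Delta, s, F)` and `""` labels an $e$-arc. The default name `"s'"` for the new start state, the helper `accepts`, and the example `N_AB` (accepting $\{ab\}$) are hypothetical.

```python
def star_nfa(N, new_start="s'"):
    """Add a fresh final start state and loop final states back to s."""
    K, sigma, D, s, F = N
    assert new_start not in K, "the new start state must be fresh"
    D2 = D | {(new_start, "", s)} | {(q, "", s) for q in F}
    return K | {new_start}, sigma, D2, new_start, F | {new_start}

def accepts(N, w):
    """Set-based NFA simulation used only to try the construction."""
    _, _, Delta, s, F = N
    def closure(S):
        S = set(S)
        while True:
            T = S | {r for (p, a, r) in Delta if p in S and a == ""}
            if T == S:
                return S
            S = T
    cur = closure({s})
    for c in w:
        cur = closure({r for (p, a, r) in Delta if p in cur and a == c})
    return bool(cur & F)

# Hypothetical example: L(N_AB) = {ab}.
N_AB = ({"p0", "p1", "p2"}, {"a", "b"},
        {("p0", "a", "p1"), ("p1", "b", "p2")}, "p0", {"p2"})
N_STAR = star_nfa(N_AB)  # should accept (ab)*
```

The fresh state matters: making the old start state $s$ final instead could accept extra strings when $N$ has arcs returning to $s$.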
Theorem: If $A$ and $B$ are regular, so is $A\cup B$ (by NFA)
  • Idea: $\exists N_1(K_1,\Sigma,\Delta_1,s_1,F_1),N_2(K_2,\Sigma,\Delta_2,s_2,F_2)$ accepting $A,B$, respectively; construct $N_3(K_3,\Sigma,\Delta_3,s_3,F_3)$ accepting $A\cup B$
  • $K_3=K_1\cup K_2\cup\{s_3\}$
  • $\Delta_3=\Delta_1\cup\Delta_2\cup\{(s_3,e,s_1)\}\cup\{(s_3,e,s_2)\}$
  • $F_3=F_1\cup F_2$
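The NFA version of union needs only one new state and two $e$-arcs. A minimal sketch, again assuming tuples `(K, sigma, Delta, s, F)` with disjoint state sets and `""` for $e$; `accepts`, `N_A`, and `N_B` are hypothetical names.

```python
def union_nfa(N1, N2, new_start="s3"):
    """Fresh start state that branches by e-arcs into N1 and N2."""
    K1, sigma, D1, s1, F1 = N1
    K2, _, D2, s2, F2 = N2
    assert not (K1 & K2) and new_start not in (K1 | K2)
    D3 = D1 | D2 | {(new_start, "", s1), (new_start, "", s2)}
    return K1 | K2 | {new_start}, sigma, D3, new_start, F1 | F2

def accepts(N, w):
    """Set-based NFA simulation used only to try the construction."""
    _, _, Delta, s, F = N
    def closure(S):
        S = set(S)
        while True:
            T = S | {r for (p, a, r) in Delta if p in S and a == ""}
            if T == S:
                return S
            S = T
    cur = closure({s})
    for c in w:
        cur = closure({r for (p, a, r) in Delta if p in cur and a == c})
    return bool(cur & F)

# Hypothetical examples: L(N_A) = {a}, L(N_B) = b*.
N_A = ({"p0", "p1"}, {"a", "b"}, {("p0", "a", "p1")}, "p0", {"p1"})
N_B = ({"r0"}, {"a", "b"}, {("r0", "b", "r0")}, "r0", {"r0"})
N_UNION = union_nfa(N_A, N_B)  # should accept {a} ∪ b*
```

Compared with the product construction on DFAs, this uses $|K_1|+|K_2|+1$ states rather than $|K_1|\cdot|K_2|$.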
Regex: defined inductively over an alphabet $\Sigma$
  • Base case:
    • $\varnothing$ is a regex, $L(\varnothing)=\varnothing$
    • any symbol $a\in\Sigma$ is a regex, $L(a)=\{a\}$
  • Inductive case:
    • If $R_1$ and $R_2$ are regexes, so is $R_1\cup R_2$, $L(R_1\cup R_2)=L(R_1)\cup L(R_2)$
    • If $R_1$ and $R_2$ are regexes, so is $R_1\circ R_2$, $L(R_1\circ R_2)=L(R_1)\circ L(R_2)$
    • If $R$ is a regex, so is $R^*$, $L(R^*)=L(R)^*$
  • Precedence: $*>\circ>\cup$
  • Examples:
    • $\{e\}=L(\varnothing^*)$
    • $\{w\in\{a,b\}^*:w$ starts with $a$ and ends with $b\}=L(a(a\cup b)^*b)$
    • $\{w\in\{0,1\}^*:w$ has at least two occurrences of $0\}=L((0\cup1)^*0(0\cup1)^*0(0\cup1)^*)$
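The last two examples can be sanity-checked by brute force with Python's `re` module, writing $\cup$ as `|` and concatenation as juxtaposition. This is only a finite check over short strings, not a proof; the helper names `words` and `agrees` are illustrative.

```python
import re
from itertools import product

def words(sigma, max_len):
    """All strings over sigma of length at most max_len."""
    for n in range(max_len + 1):
        for t in product(sigma, repeat=n):
            yield "".join(t)

# The two regexes from the examples, in Python syntax.
STARTS_A_ENDS_B = re.compile(r"a(a|b)*b")
TWO_ZEROS = re.compile(r"(0|1)*0(0|1)*0(0|1)*")

def agrees(pattern, sigma, predicate, max_len=8):
    """Does the regex match exactly the words satisfying predicate?"""
    return all(bool(pattern.fullmatch(w)) == predicate(w)
               for w in words(sigma, max_len))
```

`fullmatch` is used because a formal regex describes whole strings, whereas `search` or `match` would also succeed on a prefix or substring.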
Theorem: A language is regular iff it is described by some regex
  • Idea
    • Regex $\to$ NFA: compose the constructions above
    • NFA $\to$ Regex: eliminate states
  • Given an NFA $N$
    • Convert $N$ into an equivalent NFA $N'$ such that
      • $N'$ has no arc entering the initial state, by adding a new initial state whose only arc, labeled $e$, goes to the original initial state
      • $N'$ has only one final state with no arc leaving it, by adding a new final state with arcs labeled $e$ from the original final states
    • Eliminate, one by one, all states except the initial state and the final state
Let $N=(K,\Sigma,\Delta,s,F)$ be an NFA. WLOG, assume that
  • $K=\{q_1,q_2,\dots,q_n\}$, $s=q_{n-1}$, $F=\{q_n\}$
  • $(p,a,q_{n-1})\notin\Delta$, $\forall p\in K,\forall a\in\Sigma\cup\{e\}$ (no arc enters the initial state)
  • $(q_n,a,p)\notin\Delta$, $\forall p\in K,\forall a\in\Sigma\cup\{e\}$ (no arc leaves the final state)
  • Subproblem: $\forall i,j\in[1,n]$ and $\forall k\in[0,n]$, define
    • $L_{ij}^k=\{w\in\Sigma^*:w$ drives $N$ from $q_i$ to $q_j$ without passing through any intermediate state with index greater than $k\}$
    • $R_{ij}^k$: a regex describing $L_{ij}^k$
  • Goal: $L(N)=L_{(n-1)n}^{n-2}$
  • Base case: $k=0$, $L_{ij}^0=\begin{cases}\{a:(q_i,a,q_j)\in\Delta\},&i\neq j\\\{a:(q_i,a,q_j)\in\Delta\}\cup\{e\},&i=j\end{cases}$
  • Recurrence: $k\ge1$, $L_{ij}^k=L_{ij}^{k-1}\cup\left(L_{ik}^{k-1}\circ\left(L_{kk}^{k-1}\right)^*\circ L_{kj}^{k-1}\right)$
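The base case and recurrence can be run as a small dynamic program that produces $R_{(n-1)n}^{n-2}$. The sketch below makes several assumptions of its own: states are the integers $1..n$, $\Delta$ is a set of triples `(i, a, j)` with `""` for $e$, regexes are emitted in Python `re` syntax (`|` for $\cup$), and `None` represents $\varnothing$. No simplification is attempted, so the output grows quickly with $n$.

```python
import re

def union(r1, r2):
    """Regex for the union; None is the empty language, "" is e."""
    if r1 is None:
        return r2
    if r2 is None or r1 == r2:
        return r1
    if r1 == "":
        return f"(?:{r2})?"
    if r2 == "":
        return f"(?:{r1})?"
    return f"(?:{r1}|{r2})"

def concat(*parts):
    """Concatenation; anything concatenated with the empty language is empty."""
    return None if any(r is None for r in parts) else "".join(parts)

def star(r):
    """Star; both the empty language and {e} star to {e}."""
    return "" if r in (None, "") else f"(?:{r})*"

def nfa_to_regex(n, Delta):
    """Regex for L(N), with start state n-1 and single final state n."""
    R = {}
    for i in range(1, n + 1):          # base case k = 0
        for j in range(1, n + 1):
            r = "" if i == j else None
            for (p, a, q) in sorted(Delta):
                if (p, q) == (i, j):
                    r = union(r, re.escape(a))
            R[i, j] = r
    for k in range(1, n - 1):          # recurrence for k = 1 .. n-2
        loop = star(R[k, k])
        R = {(i, j): union(R[i, j], concat(R[i, k], loop, R[k, j]))
             for i in range(1, n + 1) for j in range(1, n + 1)}
    return R[n - 1, n]

# Hypothetical NFA in the WLOG form with n = 4, accepting (ab)*.
DELTA = {(3, "", 1), (1, "a", 2), (2, "b", 1), (1, "", 4)}
REGEX = nfa_to_regex(4, DELTA)
```

The returned string is equivalent to $(ab)^*$ but not syntactically minimal; checking it with `re.fullmatch` on short words is a quick way to compare it against the intended language.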
To prove that a language is regular:
  • By definition: give a DFA, an NFA, or a regex for it
  • By closure properties: union, intersection, complement, concatenation, star
Pumping Theorem
  • A DFA has finitely many states, so it can distinguish only finitely many distinct patterns
  • Yet a DFA may accept infinitely many strings
  • Let $L$ be a regular language. There exists an integer $p\ge1$ s.t. any string $w\in L$ with $|w|\ge p$ can be divided into three pieces $w=xyz$ satisfying the following conditions:
    • $\forall i\ge0$, $xy^iz\in L$
    • $|y|>0$
    • $|xy|\le p$ (hence $|y|\le p$)
  • Proof: Let $M$ be a DFA that accepts $L$, and let $p$ be the number of states of $M$
    • If every string in $L$ has length less than $p$ (so $L$ is finite), the theorem holds vacuously
    • Otherwise, let $w=a_1a_2\dots a_n$ $(n\ge p)$ be any string in $L$ with $|w|\ge p$. Reading $w$, $M$ passes through a sequence of $n+1$ states $q_0,q_1,\dots,q_n$, where $q_0$ is the initial state and $q_n$ is a final state
    • By the Pigeonhole Principle, there exist $0\le i<j\le p$ such that $q_i=q_j$
    • Let $x=a_1\cdots a_i$, $y=a_{i+1}\cdots a_j$, $z=a_{j+1}\cdots a_n$. This split satisfies:
      • $\forall k\ge0$, $xy^kz\in L$, since reading $y$ takes $M$ from $q_i$ back to $q_i$
      • $|y|=j-i>0$
      • $|xy|=j\le p$
  • regular $\Rightarrow$ pumpable; equivalently, not pumpable $\Rightarrow$ not regular
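The proof is constructive: running the DFA on $w$ and finding the first repeated state among $q_0,\dots,q_p$ yields the split $w=xyz$. A minimal sketch, assuming $\delta$ is a dictionary keyed by `(state, symbol)`; the function names and the example DFA `EVEN_ONES` are hypothetical.

```python
def run(delta, s, w):
    """The state sequence q_0, q_1, ..., q_n visited while reading w."""
    states = [s]
    for a in w:
        states.append(delta[(states[-1], a)])
    return states

def pump_split(delta, s, K, w):
    """Split w = xyz as in the proof, with p = number of states."""
    p = len(K)
    assert len(w) >= p, "w must have length at least p"
    first_seen = {}
    for j, q in enumerate(run(delta, s, w)[:p + 1]):
        if q in first_seen:            # q_i = q_j with i < j <= p
            i = first_seen[q]
            return w[:i], w[i:j], w[j:]
        first_seen[q] = j
    raise AssertionError("unreachable: p + 1 states must repeat")

def dfa_accepts(delta, s, F, w):
    return run(delta, s, w)[-1] in F

# Hypothetical DFA: strings over {0, 1} with an even number of 1s.
EVEN_ONES = {("even", "0"): "even", ("even", "1"): "odd",
             ("odd", "0"): "odd", ("odd", "1"): "even"}
X, Y, Z = pump_split(EVEN_ONES, "even", {"even", "odd"}, "0110")
```

Here the loop is found immediately: the leading `0` keeps the DFA in `even`, so `Y` is that single symbol and can be repeated or dropped freely.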
Example: Show that $L=\{a^ib^i:i\ge0\}$ is not regular
  • Assume, for the sake of contradiction, that $L$ is regular
  • Let $p$ be the pumping length
  • Let $w=a^pb^p\in L$; then $w$ can be written as $w=xyz$ such that:
    • $\forall i\ge0$, $xy^iz\in L$
    • $|y|>0$
    • $|xy|\le p$
  • Since $|xy|\le p$, $xy$ lies within the leading $a^p$, so $y=a^k$ for some $k\ge1$. Therefore $xy^0z=a^{p-k}b^p\notin L$, contradicting the first condition
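For small values of $p$ the argument can be checked exhaustively: every split of $a^pb^p$ with $|y|>0$ and $|xy|\le p$ fails to stay in $L$ under pumping. This sketch is a finite illustration rather than a proof, and the names `in_L` and `surviving_splits` are illustrative.

```python
def in_L(w):
    """Membership in L = { a^i b^i : i >= 0 }."""
    i = len(w) // 2
    return len(w) % 2 == 0 and w == "a" * i + "b" * i

def surviving_splits(w, p, max_k=3):
    """Splits w = xyz with |y| > 0 and |xy| <= p that survive pumping
    for every exponent k = 0 .. max_k."""
    good = []
    for i in range(p):                  # |x| = i
        for j in range(i + 1, p + 1):   # |xy| = j <= p, |y| = j - i > 0
            x, y, z = w[:i], w[i:j], w[j:]
            if all(in_L(x + y * k + z) for k in range(max_k + 1)):
                good.append((x, y, z))
    return good
```

An empty result for `surviving_splits("a" * p + "b" * p, p)` matches the proof: already at exponent 0 the string loses some $a$'s and leaves $L$.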
Example: Show that $L=\{w\in\{a,b\}^*:w$ has an equal number of $a$'s and $b$'s$\}$ is not regular
  • Proof: Assume that $L$ is regular. Since $a^*b^*$ is regular and regular languages are closed under intersection, $L\cap a^*b^*=\{a^ib^i:i\ge0\}$ is regular, a contradiction.