Problems and languages
An alphabet: a finite set of symbols
- $\Sigma=\{0,1\}$
- $\Sigma=\{a,b,c,\dots,z\}$
A string: a finite sequence of symbols from some alphabet $\Sigma$
The length of a string $w$ is the number of symbols in $w$, denoted $|w|$
- If $|w|=0$, $w$ is the empty string $e$
$\Sigma^i$: the set of all strings of length $i$ over $\Sigma$
- $\Sigma=\{0,1\}$
- $\Sigma^0=\{e\}$
- $\Sigma^1=\{0,1\}$
- $\Sigma^2=\{00,01,10,11\}$
- $\cdots$
$\Sigma^+$: the set of all non-empty strings over $\Sigma$
$\Sigma^*$: the set of all strings over $\Sigma$
- $\Sigma^*=\Sigma^+\cup\{e\}$
Concatenation: $vw$
String exponentiation: $w^{i+1}=ww^i$, $w^0=e$, defined by induction
Reversal: $w=a_1a_2\cdots a_n$, $w^R=a_na_{n-1}\cdots a_1$
- $|w|=0$: $w=w^R=e$
- $|w|\ge1$: $w=ua$ $(u\in\Sigma^*,a\in\Sigma)$, $w^R=au^R$
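The inductive definitions above translate almost line by line into code. The following is a minimal sketch, assuming strings are Python `str` values and the empty string $e$ is `""`; the function names `sigma_power`, `power`, and `reverse` are made up for illustration.

```python
from itertools import product

def sigma_power(alphabet, i):
    """Return Sigma^i: all strings of length i over the alphabet."""
    return {"".join(t) for t in product(sorted(alphabet), repeat=i)}

def power(w, i):
    """String exponentiation: w^0 = e and w^(i+1) = w w^i."""
    return "" if i == 0 else w + power(w, i - 1)

def reverse(w):
    """Reversal: e^R = e, and (ua)^R = a u^R for a single symbol a."""
    if len(w) == 0:
        return ""
    u, a = w[:-1], w[-1]
    return a + reverse(u)
```

For example, `sigma_power({"0", "1"}, 2)` gives the four strings of $\Sigma^2$ listed above, and `sigma_power(alphabet, 0)` is `{""}` for any alphabet.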
A set of strings over some alphabet Σ\SigmaΣ is called a language
Any decision problem $\iff$ a problem about a certain language
A Deterministic Finite Automaton (DFA): a 5-tuple $(K,\Sigma,\delta,s,F)$
- $K$: a finite set of states
- $\Sigma$: a finite set of input symbols
- $\delta$: transition function, $K\times\Sigma\to K$
- $s\in K$: initial state
- $F\subseteq K$: a set of final states
A configuration of a DFA $M=(K,\Sigma,\delta,s,F)$: an element of $K\times\Sigma^*$
Yields in one step: $(q,w)\vdash_M(q',w')$ if $w=aw'$ $(a\in\Sigma)$ and $\delta(q,a)=q'$
- $(q,w)\vdash_M^*(q',w')$: yields in zero or more steps (the reflexive, transitive closure of $\vdash_M$)
$M$ accepts $w\in\Sigma^*$ if $(s,w)\vdash_M^*(q,e)$ for some $q\in F$
$L(M)$: the set of all $w\in\Sigma^*$ accepted by $M$; it is uniquely determined by $M$
$M$ accepts a language $L$ iff $L=L(M)$, that is:
- $L$ contains every string accepted by $M$
- every string in $L$ is accepted by $M$
A language is regular if it is accepted by some DFA
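A DFA run can be sketched as a loop that consumes one symbol per step, mirroring the "yields in one step" relation. This is an illustrative sketch only: the transition function is assumed to be a Python dict keyed by `(state, symbol)`, and the example machine `EVEN_ONES` (strings over $\{0,1\}$ with an even number of 1s) is hypothetical.

```python
def dfa_accepts(delta, start, finals, w):
    """Follow (q, aw') |- (delta(q, a), w') until the input is exhausted."""
    q = start
    for a in w:
        q = delta[(q, a)]
    # w is accepted iff the configuration (q, e) has q in F.
    return q in finals

# Hypothetical example: accept strings with an even number of 1s.
EVEN_ONES = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "odd", ("odd", "1"): "even",
}
```

Calling `dfa_accepts(EVEN_ONES, "even", {"even"}, "1001")` follows four transitions and ends in state `"even"`, so the string is accepted.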
Regular operations
- Union: $A\cup B=\{w:w\in A\lor w\in B\}$
- Intersection: $A\cap B=\{w:w\in A\land w\in B\}$
- Complement: $\overline{A}=\{w:w\in\Sigma^*-A\}$
- Concatenation: $A\circ B=\{ab:a\in A\land b\in B\}$
- Star: $A^*=\{w_1w_2\dots w_n:w_i\in A,n\ge 0\}$; taking $n=0$ gives $e\in A^*$
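For finite languages the operations above can be tried directly on Python sets: union and intersection are `|` and `&`. A small sketch of concatenation and star follows. Since $A^*$ is infinite for most $A$, the helper `star_up_to` (a name invented here) only returns the members of $A^*$ up to a length bound.

```python
def concat(A, B):
    """A o B = {ab : a in A and b in B}."""
    return {a + b for a in A for b in B}

def star_up_to(A, max_len):
    """Members of A* of length at most max_len; n = 0 contributes the empty string."""
    result = {""}
    frontier = {""}
    while frontier:
        # Extend every newly found string by one more factor from A.
        frontier = {w + a for w in frontier for a in A
                    if len(w + a) <= max_len} - result
        result |= frontier
    return result
```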
Theorem: If $A$ and $B$ are regular, so is $A\cup B$ (by DFA)
- Idea: there exist DFAs $M_1=(K_1,\Sigma,\delta_1,s_1,F_1)$ and $M_2=(K_2,\Sigma,\delta_2,s_2,F_2)$ accepting $A$ and $B$, respectively; construct $M_3=(K_3,\Sigma,\delta_3,s_3,F_3)$ accepting $A\cup B$
- $K_3=K_1\times K_2$
- $\delta_3:K_3\times\Sigma\to K_3$, with $\delta_3((q_1,q_2),a)=(\delta_1(q_1,a),\delta_2(q_2,a))=(q_1',q_2')$
- $s_3=(s_1,s_2)$
- $F_3=\{(q_1,q_2)\in K_1\times K_2:q_1\in F_1\lor q_2\in F_2\}$
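The product construction can be sketched as follows. A DFA is assumed to be a tuple `(K, delta, s, F)` with `delta` a dict keyed by `(state, symbol)`; this representation and the two example machines `ENDS_IN_0` and `EVEN_LEN` are assumptions made for illustration.

```python
from itertools import product

def union_dfa(M1, M2, sigma):
    """Product construction: M3 tracks the states of M1 and M2 in lockstep."""
    (K1, d1, s1, F1), (K2, d2, s2, F2) = M1, M2
    K3 = set(product(K1, K2))
    d3 = {((q1, q2), a): (d1[(q1, a)], d2[(q2, a)])
          for (q1, q2) in K3 for a in sigma}
    F3 = {(q1, q2) for (q1, q2) in K3 if q1 in F1 or q2 in F2}
    return K3, d3, (s1, s2), F3

def accepts(M, w):
    """Run a DFA given as (K, delta, s, F) on the string w."""
    K, d, s, F = M
    q = s
    for a in w:
        q = d[(q, a)]
    return q in F

# Hypothetical example machines over {0, 1}.
ENDS_IN_0 = ({"n", "z"},
             {("n", "0"): "z", ("n", "1"): "n",
              ("z", "0"): "z", ("z", "1"): "n"},
             "n", {"z"})
EVEN_LEN = ({0, 1},
            {(0, "0"): 1, (0, "1"): 1, (1, "0"): 0, (1, "1"): 0},
            0, {0})
M3 = union_dfa(ENDS_IN_0, EVEN_LEN, "01")
```

Replacing `or` with `and` in the definition of `F3` gives a DFA for $A\cap B$ instead, which is one way to see closure under intersection.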
Non-deterministic Finite Automaton (NFA)
- several choices for the next state
- may switch states without reading any input symbols
Definition of NFA: a 5-tuple $(K,\Sigma,\Delta,s,F)$
- $K$: a finite set of states
- $\Sigma$: a finite set of input symbols
- $\Delta$: transition relation, a subset of $K\times(\Sigma\cup\{e\})\times K$
- $s\in K$: initial state
- $F\subseteq K$: a set of final states
A configuration of an NFA $N=(K,\Sigma,\Delta,s,F)$: an element of $K\times\Sigma^*$
Yields in one step: $(q,w)\vdash_N(q',w')$ if $w=aw'$ $(a\in\Sigma\cup\{e\})$ and $(q,a,q')\in\Delta$
- $(q,w)\vdash_N^*(q',w')$: yields in zero or more steps
$N$ accepts $w\in\Sigma^*$ if $(s,w)\vdash_N^*(q,e)$ for some $q\in F$
$L(N)$: the set of all $w\in\Sigma^*$ accepted by $N$; it is uniquely determined by $N$
A DFA is a special case of an NFA: the function $(q,a)\mapsto\delta(q,a)$ can be converted to the relation $\{(q,a,\delta(q,a)):q\in K,a\in\Sigma\}$
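One way to decide whether an NFA accepts a string is to track the set of all states it could currently be in. The sketch below assumes $\Delta$ is a Python set of triples `(q, a, p)` with the empty string `""` standing for the label $e$; the example relation `A_STAR_B_STAR` (for $a^*b^*$) is hypothetical.

```python
def e_closure(delta, states):
    """All states reachable from `states` using only e-moves (label "")."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for (p, a, r) in delta:
            if p == q and a == "" and r not in closure:
                closure.add(r)
                stack.append(r)
    return closure

def nfa_accepts(delta, start, finals, w):
    """Accept iff some computation (s, w) |-* (q, e) ends with q in F."""
    current = e_closure(delta, {start})
    for a in w:
        moved = {r for (p, b, r) in delta if p in current and b == a}
        current = e_closure(delta, moved)
    return bool(current & finals)

# Hypothetical NFA for a*b*: loop on a, jump by an e-move, loop on b.
A_STAR_B_STAR = {(0, "a", 0), (0, "", 1), (1, "b", 1)}
```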
Theorem: For any NFA $N=(K,\Sigma,\Delta,s,F)$, there exists a DFA $M=(K',\Sigma,\delta,s',F')$ s.t. $L(N)=L(M)$
- $K'=\mathcal{P}(K)$
- $\forall q\in K$, $E(q)=\{p\in K:(q,e)\vdash_N^*(p,e)\}$
- $\forall Q\subseteq K$, $E(Q)=\displaystyle\bigcup_{q\in Q}E(q)$
- $\delta:K'\times\Sigma\to K'$, with $\delta(Q,a)=\displaystyle\bigcup_{q\in Q}E(\{p\in K:(q,a,p)\in\Delta\})$
- $s'=E(s)$
- $F'=\{Q\subseteq K:Q\cap F\neq\varnothing\}$
- Claim: $\forall p,q\in K$, $\forall w\in\Sigma^*$, $(p,w)\vdash_N^*(q,e)$ iff $(E(p),w)\vdash_M^*(Q,e)$ for some $Q$ containing $q$
- $N$ accepts $w\iff M$ accepts $w$:
  - $N$ accepts $w$ iff $(s,w)\vdash_N^*(q,e)$ for some $q\in F$
  - by the claim, iff $(E(s),w)\vdash_M^*(Q,e)$ for some $Q$ with $q\in Q$
  - note: such a $q\in Q\cap F$ exists $\Leftrightarrow Q\cap F\neq\varnothing\Leftrightarrow Q\in F'$, so this holds iff $M$ accepts $w$
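The subset construction can be sketched in code. One deliberate deviation from the definition above: instead of building all of $\mathcal{P}(K)$, this sketch only generates the subsets reachable from $s'$, which accepts the same language. $\Delta$ is assumed to be a set of triples with `""` for $e$, and all function names are illustrative.

```python
def e_closure(delta, states):
    """E(Q): states reachable from Q using only e-moves (label "")."""
    closure, stack = set(states), list(states)
    while stack:
        q = stack.pop()
        for (p, a, r) in delta:
            if p == q and a == "" and r not in closure:
                closure.add(r)
                stack.append(r)
    return closure

def subset_construction(delta, sigma, start, finals):
    """Return (K', delta', s', F') restricted to reachable subsets."""
    s_prime = frozenset(e_closure(delta, {start}))
    dfa_delta, todo, seen = {}, [s_prime], {s_prime}
    while todo:
        Q = todo.pop()
        for a in sigma:
            moved = {r for (p, b, r) in delta if p in Q and b == a}
            target = frozenset(e_closure(delta, moved))
            dfa_delta[(Q, a)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    f_prime = {Q for Q in seen if Q & finals}
    return seen, dfa_delta, s_prime, f_prime

def run(dfa_delta, s_prime, f_prime, w):
    """Run the resulting DFA on w."""
    Q = s_prime
    for a in w:
        Q = dfa_delta[(Q, a)]
    return Q in f_prime
```

The empty subset acts as a dead state: once no NFA state is possible, every further symbol leads back to it.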
Theorem: If $A$ and $B$ are regular, so is $A\circ B$
- Idea: there exist NFAs $N_1=(K_1,\Sigma,\Delta_1,s_1,F_1)$ and $N_2=(K_2,\Sigma,\Delta_2,s_2,F_2)$ accepting $A$ and $B$, respectively (assume $K_1\cap K_2=\varnothing$); construct $N_3=(K_3,\Sigma,\Delta_3,s_3,F_3)$ accepting $A\circ B$
- $K_3=K_1\cup K_2$
- $\Delta_3=\Delta_1\cup\Delta_2\cup\{(q,e,s_2):q\in F_1\}$
- $s_3=s_1$
- $F_3=F_2$
Theorem: If $A$ is regular, so is $A^*$
- Idea: there exists an NFA $N=(K,\Sigma,\Delta,s,F)$ accepting $A$; construct $N'=(K',\Sigma,\Delta',s',F')$ accepting $A^*$, where $s'\notin K$ is a new initial state
- $K'=K\cup\{s'\}$
- $\Delta'=\Delta\cup\{(s',e,s)\}\cup\{(q,e,s):q\in F\}$
- $F'=F\cup\{s'\}$
Theorem: If $A$ and $B$ are regular, so is $A\cup B$ (by NFA)
- Idea: there exist NFAs $N_1=(K_1,\Sigma,\Delta_1,s_1,F_1)$ and $N_2=(K_2,\Sigma,\Delta_2,s_2,F_2)$ accepting $A$ and $B$, respectively (assume $K_1\cap K_2=\varnothing$); construct $N_3=(K_3,\Sigma,\Delta_3,s_3,F_3)$ accepting $A\cup B$, where $s_3\notin K_1\cup K_2$ is a new initial state
- $K_3=K_1\cup K_2\cup\{s_3\}$
- $\Delta_3=\Delta_1\cup\Delta_2\cup\{(s_3,e,s_1)\}\cup\{(s_3,e,s_2)\}$
- $F_3=F_1\cup F_2$
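The three NFA constructions (concatenation, star, union) are each a few set unions, which the following sketch makes explicit. It assumes an NFA is a tuple `(K, Delta, s, F)` of Python sets with `""` as the label $e$, that the state sets of the two machines are disjoint, and that the caller supplies a fresh name for any new initial state (the `new_start` parameter is an invention of this sketch).

```python
def concat_nfa(N1, N2):
    """A o B: e-moves from every final state of N1 to the start of N2."""
    (K1, D1, s1, F1), (K2, D2, s2, F2) = N1, N2
    D3 = D1 | D2 | {(q, "", s2) for q in F1}
    return K1 | K2, D3, s1, set(F2)

def star_nfa(N, new_start):
    """A*: new accepting start state, plus e-moves from finals back to s."""
    K, D, s, F = N
    D2 = D | {(new_start, "", s)} | {(q, "", s) for q in F}
    return K | {new_start}, D2, new_start, F | {new_start}

def union_nfa(N1, N2, new_start):
    """A u B: new start state with e-moves to both original start states."""
    (K1, D1, s1, F1), (K2, D2, s2, F2) = N1, N2
    D3 = D1 | D2 | {(new_start, "", s1), (new_start, "", s2)}
    return K1 | K2 | {new_start}, D3, new_start, F1 | F2

def accepts(N, w):
    """Simulate the NFA by tracking the set of possible current states."""
    K, D, s, F = N

    def close(S):
        S = set(S)
        changed = True
        while changed:
            new = {r for (p, a, r) in D if p in S and a == ""} - S
            changed = bool(new)
            S |= new
        return S

    cur = close({s})
    for a in w:
        cur = close({r for (p, b, r) in D if p in cur and b == a})
    return bool(cur & F)

# Hypothetical building blocks: NA accepts {"a"}, NB accepts {"b"}.
NA = ({"a0", "a1"}, {("a0", "a", "a1")}, "a0", {"a1"})
NB = ({"b0", "b1"}, {("b0", "b", "b1")}, "b0", {"b1"})
```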
Regex: defined inductively over an alphabet $\Sigma$
- Base case:
  - $\varnothing$ is a regex, $L(\varnothing)=\varnothing$
  - any symbol $a\in\Sigma$ is a regex, $L(a)=\{a\}$
- Inductive case:
  - If $R_1$ and $R_2$ are regexes, so is $R_1\cup R_2$, $L(R_1\cup R_2)=L(R_1)\cup L(R_2)$
  - If $R_1$ and $R_2$ are regexes, so is $R_1\circ R_2$, $L(R_1\circ R_2)=L(R_1)\circ L(R_2)$
  - If $R$ is a regex, so is $R^*$, $L(R^*)=L(R)^*$
- Precedence: $*>\circ>\cup$
- Example:
  - $\{e\}=L(\varnothing^*)$
  - $\{w\in\{a,b\}^*:w$ starts with $a$ and ends with $b\}=L(a(a\cup b)^*b)$
  - $\{w\in\{0,1\}^*:w$ has at least two occurrences of $0\}=L((0\cup1)^*0(0\cup1)^*0(0\cup1)^*)$
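The last two examples can be sanity-checked with Python's `re` module, which writes $\cup$ as `|` and juxtaposition for $\circ$. The sketch below compares each regex against a direct description of the language on all strings up to length 8; this is a bounded check, not a proof, and the helper `strings_up_to` is made up for the purpose.

```python
import re
from itertools import product

def strings_up_to(alphabet, n):
    """Yield every string over the alphabet of length 0..n."""
    for i in range(n + 1):
        for t in product(alphabet, repeat=i):
            yield "".join(t)

starts_a_ends_b = re.compile(r"a(a|b)*b")
two_zeros = re.compile(r"(0|1)*0(0|1)*0(0|1)*")

# fullmatch requires the whole string to match, as L(R) membership does.
ok1 = all(
    (starts_a_ends_b.fullmatch(w) is not None)
    == (w.startswith("a") and w.endswith("b"))
    for w in strings_up_to("ab", 8)
)
ok2 = all(
    (two_zeros.fullmatch(w) is not None) == (w.count("0") >= 2)
    for w in strings_up_to("01", 8)
)
```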
Theorem: A language is regular iff it is described by some regex
- Idea
  - Regex $\to$ NFA: compose the constructions for union, concatenation, and star above
  - NFA $\to$ Regex: elimination of states
    - Given an NFA $N$
    - Convert $N$ into an equivalent NFA $N'$ such that
      - $N'$ has no arc entering the initial state, by adding a new initial state whose only arc is labeled $e$ and leads to the original initial state
      - $N'$ has only one final state with no arc leaving it, by adding a new final state with arcs labeled $e$ from the original final states
    - Eliminate, one by one, all states except the initial state and the final state
Let $N=(K,\Sigma,\Delta,s,F)$ be an NFA. WLOG, assume that
- $K=\{q_1,q_2,\dots,q_n\}$, $s=q_{n-1}$, $F=\{q_n\}$
- $(p,a,q_{n-1})\notin\Delta$, $\forall p\in K,\forall a\in\Sigma\cup\{e\}$
- $(q_n,a,p)\notin\Delta$, $\forall p\in K,\forall a\in\Sigma\cup\{e\}$
- Subproblem: $\forall i,j\in[1,n]$ and $\forall k\in[0,n]$, define
  - $L_{ij}^k=\{w\in\Sigma^*:w$ drives $N$ from $q_i$ to $q_j$ without passing through any intermediate state with index greater than $k\}$
  - $R_{ij}^k$: a regex describing $L_{ij}^k$
- Goal: $L(N)=L_{(n-1)n}^{n-2}$
- Base case: $k=0$, $L_{ij}^0=\begin{cases}\{a:(q_i,a,q_j)\in\Delta\}&,i\neq j \\ \{a:(q_i,a,q_j)\in\Delta\}\cup\{e\}&,i=j\end{cases}$
- Recurrence: $k\ge1$, $L_{ij}^k=L_{ij}^{k-1}\cup\left(L_{ik}^{k-1}\circ\left(L_{kk}^{k-1}\right)^*\circ L_{kj}^{k-1}\right)$
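A small worked instance of the recurrence, on a hypothetical three-state NFA chosen for illustration: take $n=3$, $s=q_2$, $F=\{q_3\}$, and $\Delta=\{(q_2,a,q_1),(q_1,b,q_1),(q_1,a,q_3)\}$. No arc enters $q_2$ and no arc leaves $q_3$, so the WLOG assumptions hold, and the goal is $L(N)=L_{23}^{1}$.
- Base case: $L_{23}^0=\varnothing$, $L_{21}^0=\{a\}$, $L_{11}^0=\{b\}\cup\{e\}$, $L_{13}^0=\{a\}$
- Recurrence with $k=1$:
$$L_{23}^1=L_{23}^0\cup\left(L_{21}^0\circ\left(L_{11}^0\right)^*\circ L_{13}^0\right)=\varnothing\cup\left(\{a\}\circ\{b,e\}^*\circ\{a\}\right)=L(ab^*a)$$
- So $R_{23}^1=ab^*a$, which matches what the machine does: read $a$, loop on $b$ at $q_1$, then read $a$.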
To prove that a language is regular:
- By definition: give a DFA, NFA, or regex for it
- By closure properties: union, intersection, complement, concatenation, star
Pumping Theorem
- A DFA has finitely many states, so it can distinguish only finitely many distinct patterns
- A DFA may nevertheless accept infinitely many strings
- Let $L$ be a regular language. There exists an integer $p\ge 1$ s.t. any string $w\in L$ with $|w|\ge p$ can be divided into three pieces $w=xyz$ satisfying the following conditions:
  - $\forall i\ge 0$, $xy^iz\in L$
  - $|y|>0$
  - $|xy|\le p$ (hence $|y|\le p$)
- Proof: Let $M$ be a DFA that accepts $L$, and let $p$ be the number of states of $M$
  - If every string in $L$ has length less than $p$ (in particular, $L$ is finite), the theorem holds vacuously
  - Otherwise, let $w=a_1a_2\dots a_n$ $(n\ge p)$ be any string in $L$ with $|w|\ge p$; while reading $w$, $M$ passes through a sequence of $n+1$ states $q_0,q_1,\dots,q_n$
  - By the pigeonhole principle, there exist $0\le i<j\le p$ such that $q_i=q_j$
  - Let $x=a_1\cdots a_i$, $y=a_{i+1}\cdots a_j$, $z=a_{j+1}\cdots a_n$. Since $y$ drives $M$ from $q_i$ back to $q_i$:
    - $\forall k\ge 0$, $xy^kz\in L$
    - $|y|=j-i>0$
    - $|xy|=j\le p$
- regular $\to$ pumping; not pumping $\to$ not regular
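The proof is constructive: running the DFA and stopping at the first repeated state yields the split $w=xyz$. A minimal sketch, assuming the transition function is a dict keyed by `(state, symbol)`; the function names and the example machine `EVEN_ONES` (even number of 1s) are hypothetical.

```python
def pumping_split(delta, start, w):
    """Split w = xyz at the first state that repeats on the DFA's run."""
    seen = {start: 0}  # state -> number of symbols read when first visited
    q = start
    for pos, a in enumerate(w, start=1):
        q = delta[(q, a)]
        if q in seen:
            i, j = seen[q], pos  # q_i = q_j with i < j
            return w[:i], w[i:j], w[j:]
        seen[q] = pos
    return None  # no repeat, which requires |w| < number of states

def dfa_accepts(delta, start, finals, w):
    q = start
    for a in w:
        q = delta[(q, a)]
    return q in finals

# Hypothetical 2-state DFA: strings with an even number of 1s.
EVEN_ONES = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "odd", ("odd", "1"): "even",
}
```

Because the first repeat occurs within the first $p$ symbols, the returned pieces satisfy $|y|>0$ and $|xy|\le p$, and repeating `y` any number of times keeps the run ending in the same state.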
Example: Show that $L=\{a^ib^i:i\ge 0\}$ is not regular
- Assume, for the sake of contradiction, that $L$ is regular
- Let $p$ be the pumping length
- Let $w=a^pb^p\in L$; then $w$ can be written as $w=xyz$ such that:
  - $\forall i\ge 0$, $xy^iz\in L$
  - $|y|>0$
  - $|xy|\le p$
- Since $|xy|\le p$, $y$ lies within the leading $a^p$, so $\exists k\ge 1$ s.t. $y=a^k$. Therefore $xy^0z=a^{p-k}b^p\notin L$, contradicting the first condition
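The argument can be checked by brute force for small values of $p$: enumerate every split of $a^pb^p$ allowed by the theorem and confirm that pumping down leaves $L$. This only tests specific values of $p$ and is no substitute for the proof; the helper names are invented for this sketch.

```python
def in_L(w):
    """Membership in L = {a^i b^i : i >= 0}."""
    half = len(w) // 2
    return len(w) % 2 == 0 and w == "a" * half + "b" * half

def splits(w, p):
    """All decompositions w = xyz with |y| > 0 and |xy| <= p."""
    for j in range(1, min(p, len(w)) + 1):  # j = |xy|
        for i in range(j):                  # i = |x|
            yield w[:i], w[i:j], w[j:]

def every_split_fails(p):
    """True if x y^0 z falls outside L for every allowed split of a^p b^p."""
    w = "a" * p + "b" * p
    return all(not in_L(x + z) for x, y, z in splits(w, p))
```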
Example: Show that $L=\{w\in\{a,b\}^*:w$ has an equal number of $a$'s and $b$'s$\}$ is not regular
- Proof: Assume that $L$ is regular. Since $a^*b^*$ is regular and regular languages are closed under intersection, $L\cap a^*b^*=\{a^ib^i:i\ge 0\}$ would be regular, a contradiction.
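This article introduces the basic concepts of formal languages and automata theory, including symbols, strings, and languages. It explains in detail the definitions of deterministic and non-deterministic finite automata and how to convert between them, and discusses the relationship between regular expressions and these automata.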