I have made DFA from a given regular expression to match the test string. There are some cases in which .*
occurs. ( for example .*ab
) . Let say now the machine is in state 1. In the DFA, .*
refers to the transition for all the characters to itself and another transition for a from the state 1 for 'a'. If test string contains 'a' then what could be the transition because from state 1, machine can go to two states that is not possible in DFA.
相关问题
- Improve converting string to readable urls
- Regex to match charset
- Regex subsequence matching
- Accommodate two types of quotes in a regex
- Set together letters and numbers that are ordinal
相关文章
- Optimization techniques for backtracking regex imp
- Regex to check for new line
- Allow only 2 decimal points entry to a textbox usi
- Comparing speed of non-matching regexp
- Regular expression to get URL in string swift with
- 请问如何删除之前和之后的非字母中文单字
- Lazy (ungreedy) matching multiple groups using reg
- when [:punct:] is too much [duplicate]
Matches with such regular expressions happen via backtracking. When there is an ambiguity about the next state, the evaluation takes the first choice and remembers it made the choice. If taking the first choice results in a failure to match, the evaluation backtracks to the last choice it made and tries the next available choice from that state.
I'm not sure such a mechanism maps to a strict DFA.
I start with fundamental with your example so that one can find it helpful
Any class of automata can have two forms:
In Deterministic model: we only have single choice (or say no choice) to move from one congratulation to next configuration.
In Deterministic model of Finite Automate (DFA): for every possible combination of state (Q) and language symbol (Σ), we always have unique next state.
Definition of transition function for DFA: δ:Q×Σ → Q
So, In DFA every possible move is definite from one state to next state.
Whereas,
In Non-Deterministic model: we can have more than one choice for next configuration.
And in Non-deterministaic model of Finite Automata (NFA): output is set of states for some combination of state (Q) and language symbol (Σ).
Definition of transition function for NFA: δ:Q×Σ → 2Q = ⊆ Q
In NFA, we can have more then one choice for next state. That is you calls ambiguity in transition NFA.
(your example)
Suppose language symbols are
Σ = {a, b}
and the language/regular expression is(a + b)*ab
. The finite automata for this language you down might be probably like below:How to process string in NFA?
I am considering automata model as an acceptor that accept a string if it belong to the language of automata.(Notice: we can have an automaton as a transducer), below is my answer with an above example
In above NFA, we find 5 tapular objects:
The exampled finite automata is an actually an NFA because in production rule
δ(q1, a) → { q1, q2 }
, if we geta
symbol while present state isq1
then next states can be eitherq1
orq2
(more than one choices). So when we process a string in NFA, we get extra path to travel wherever their is a symbola
to be process while current state isq1
.A string is accepted by an NFA, if there is some sequence of possible moves that will put the machine in a final state at the end of string processing. And the set of all string those have some path to reach to any final state in set
F
from initial state is called language of NFA:We can also write, "what is language defined by a NFA?" as:
when I was new, this was too complex to understand to me but its really not
L(nfa) says: all strings consists of language symbols = (w ⊆ Σ* ) are in language; if
(|)
the set of states get after processing ofw
form initial state (=δ*(q1, w) ) contains some states in the set of Final states (hence intersection with final states is not empty = δ*(q1, w) ∩ F ≠ ∅). So while processing a string in Σ*, we need to keep track of all the possible paths.Example-1: to process string
abab
though above NFS:Above diagram show: How to process a string
abab
in NFA?A halt: means string could not process completely so it can't be consider a accepted string in this path
String
abab
could process completely in two directions so δ*(q1, w) = { q1, q3}.and intersection of δ*(q1, w) with set of final states is {q3}:
In this way, string
ababa
is in language L(nfa).Example-2: String from Σ* is
abba
and following is how to process:For string
abba
set of reachable states is δ*(q1, w) = { q1, q2} and no state is final state in this set this implies => its intersection with F is ∅ a empty set, hence stringabba
is not an accepted string (and not in language).This is the way we process a string in Non-deterministic Finite Automata.
Some additional important notes:
In case of finite automata's both Deterministic and Non-Deterministic models are equally capable. Non-Deterministic model doesn't have extra capability to define a language. hence scope of NFA and DFA are same that is Regular Language. (this is not case for all class of automate for example scope of PDA !=NPDA)
Non-deterministic models are more useful for theoretical purpose, comparatively essay to draw. Whereas for implementation purpose we always desire deterministic model (minimized for efficiency). And fortunately in class of finite autometa every Non-deterministic model can be converted into an equivalent Deterministic one. We have algorithmic method to convert an NFA into DFA.
An information represented by a single state in DFA, can be represented by a combination of NFA states, hence number of states in NFA are less than their equivalent DFA. (proof are available numberOfStates(DFA)<= 2 power numberOfStates(NFA) as all set combinations are powerset)
The DFA for above regular language is as below:
Using this DFA you will always find a unique path from initial state to final state for any string in Σ* and instead of set you will gets to a single reachable final state and if that state belongs to set of final that input string is said to be accepted string (in language) otherwise not/
(your expression
.*ab
and(a + b)*ab
are same usually in theoretical science we don't use.
dot operator other then concatenation)