I am going through exercises regarding regular expressions, and I am really unsure on how to do this.
The regular expression is:
((a*)(b*))* ∪ (a*)
I am really bad at this, but I think that ((a*)(b*))*
can be simplified to (a ∪ b)*
But if this is right, than the last ∪ (a*)
is really just a repetition, so I figure the whole expression can be simplified to (a ∪ b)*.
Does this seem correct?
Edit: ∪ stands for union
What
((a*)(b*))*U(a*)
really means is (copied from here)This expression currently matches all of these sequences:
abUa bU U aabbUaa aaUaa aaU Uaa bbU ababUaa aabbaabbUaa
(look at here)There is no way to simplify this, without removing capturing groups and remaining order of letters.
EDIT: If
U
in your regex statement stands for "union", then this expression is invalid. There is no way to union anything in regex. There is onlyOR
and you need to use|
(pipe) for that. If you want to union((a*)(b*))*
and(a*)
then probably it will be((a*)(b*))*
, but it will still match anything likeabaab
.Still, capturing groups in your regex statement are useless, so something like
[ab]*
is enough to match any number ofa
's andb
's.You are right.
(a*b*)*
can match any string of a's and b's, so can(a U b)*
, therefore they are equivalent.(a U b)*
intersecta*
isa*
soa*
is a subset of(a U b)*
. Consequently, the whole expression can be simplified to(a U b)*
.