I'm building a chatbot that supports a subset of RiveScript, and I'm trying to build the pattern-matching parser with regular expressions. Which three regexes match the following three examples?
ex1: I am * years old
valid match:
- "I am 24 years old"
invalid match:
- "I am years old"
ex2: what color is [my|your|his|her] (bright red|blue|green|lemon chiffon) *
valid matches:
- "what color is lemon chiffon car"
- "what color is my some random text till the end of string"
ex3: [*] told me to say *
valid matches:
- "Bob and Alice told me to say hallelujah"
- "told me to say by nobody"
The wildcards mean any text that is not empty is acceptable.
In example 2, anything between [ ] is optional, anything between ( ) is an alternative, and the options or alternatives are separated by |.
In example 3, the [*] is an optional wildcard, meaning blank text can be accepted.
https://regex101.com/r/CuZuMi/4
https://regex101.com/r/CuZuMi/2
https://regex101.com/r/CuZuMi/3
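The linked patterns are not reproduced here, so what follows is only a sketch of how the three triggers might be written, using Python's re module. The names EX1, EX2, EX3 and matches are made up for illustration. One assumption to flag: the second valid match in example 2 contains no color at all, so the ( ) group is treated as optional here too. If that group must be present, drop the ? after it, and that second sentence will no longer match.

```python
import re

# Hypothetical names; one possible reading of the three triggers.

# ex1: "I am * years old" -> the wildcard must hold at least one character.
EX1 = re.compile(r"I am (.+) years old")

# ex2: each group carries its own trailing space, so skipping a group
# does not leave a doubled space behind.
# Assumption: the ( ) group is optional as well (see the note above).
EX2 = re.compile(
    r"what color is "
    r"(?:(?:my|your|his|her) )?"
    r"(?:(?:bright red|blue|green|lemon chiffon) )?"
    r"(.+)"
)

# ex3: "[*] told me to say *" -> optional leading wildcard plus its space.
EX3 = re.compile(r"(?:(.+) )?told me to say (.+)")


def matches(pattern, text):
    """Return True only when the whole text matches the pattern."""
    return pattern.fullmatch(text) is not None


print(matches(EX1, "I am 24 years old"))   # True
print(matches(EX1, "I am years old"))      # False
print(matches(EX2, "what color is lemon chiffon car"))
print(matches(EX3, "told me to say by nobody"))
```

fullmatch is used instead of adding ^ and $ anchors to each pattern; with re.match or re.search the anchors would be needed.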
I am using mostly two things:
(?:) non-capture groups, to group things together the way parentheses are used in math.
.* to match any character 0 or more times. The * could be replaced by {1,3} to match between 1 and 3 times.
You can replace * with + to match at least 1 character instead of 0. And the ? after a non-capture group makes that group optional.
These are good places for you to start:
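To see each of those building blocks on its own, here is a small sketch in Python. The helper name full and the sample strings are invented for the demonstration and have nothing to do with the linked patterns.

```python
import re


def full(pattern, text):
    """Return True when the whole text matches the pattern."""
    return re.fullmatch(pattern, text) is not None


# (?:...) groups a sub-pattern without creating a capture group,
# so only the second, capturing group shows up in groups().
grouped = re.fullmatch(r"(?:red|blue) (car|bike)", "blue bike")
print(grouped.groups())  # ('bike',)

# .* accepts an empty remainder, .+ needs at least one character.
print(full(r"say .*", "say "), full(r"say .+", "say "))  # True False

# {1,3} bounds the repetition to between one and three characters.
print(full(r"say .{1,3}", "say abc"), full(r"say .{1,3}", "say abcd"))  # True False

# A ? after the non-capture group makes the whole group optional.
print(full(r"(?:my )?car", "car"), full(r"(?:my )?car", "my car"))  # True True
```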