Could someone explain these two terms in an understandable way?
Greedy matching. The default behavior of regular expressions is to be greedy. That means a quantifier tries to match as much text as possible while still letting the overall pattern succeed, even when a smaller part would have been syntactically sufficient.
Example:
Instead of matching till the first occurrence of ‘>’, it extracted the whole string. This is the default greedy or ‘take it all’ behavior of regex.
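A minimal sketch of the greedy behavior described above, assuming Python's `re` module. The sample string and the pattern `<.*>` are illustrative choices, not necessarily the ones the answer originally used:

```python
import re

# Hypothetical sample string, chosen only for illustration.
s = "<html><head><title>Title</title>"

# Greedy: .* first runs to the end of the string, then backs off
# only as far as needed to find a closing '>'.
greedy = re.match(r"<.*>", s).group()
print(greedy)  # <html><head><title>Title</title>
```

Because the string happens to end in `>`, the greedy match covers all of it rather than stopping at `<html>`.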
Lazy matching, on the other hand, ‘takes as little as possible’. This can be effected by adding a `?` after the quantifier in the pattern.
Example:
If you want only the first match to be retrieved, use the search method instead.
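As a rough sketch of the lazy variant and of the `search` remark, again assuming Python's `re` module and a hypothetical sample string:

```python
import re

# Hypothetical sample string, chosen only for illustration.
s = "<html><head><title>Title</title>"

# Lazy: .*? stops at the first '>' it can, so each tag is matched separately.
all_tags = re.findall(r"<.*?>", s)
print(all_tags)  # ['<html>', '<head>', '<title>', '</title>']

# search() returns only the first match.
first_tag = re.search(r"<.*?>", s).group()
print(first_tag)  # <html>
```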
Source: Python Regex Examples
Greedy means the quantifier will consume as many matching characters as it can, until there are none left and it can go no further.
Lazy means it will stop as soon as it has the shortest match that satisfies the pattern you requested.
One common example that I often encounter is the `\s*-\s*?` part of the regex `([0-9]{2}\s*-\s*?[0-9]{7})`.
The first `\s*` is greedy because of the `*`: after the digits are encountered it consumes as many whitespace characters as possible, and then the engine looks for a dash character "-". The second `\s*?` is lazy because of the `*?`: it consumes as few whitespace characters as possible, starting with none, and takes another one only when the `[0-9]{7}` that follows cannot match yet.
As far as I know, most regex engines are greedy by default. Adding a question mark after a quantifier enables lazy matching, as @Andre S mentioned in a comment.
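To make the effect of the added question mark concrete, here is a small sketch that reuses the `\s*` / `\s*?` pair from the `[0-9]{2}\s*-\s*?[0-9]{7}` pattern above. The input string and the extra capture groups are assumptions added only to show what each quantifier consumed:

```python
import re

# Hypothetical input: two digits, spaces around a dash, then seven digits.
text = "12  -   3456789"

# Capture groups are added here (they are not in the original pattern)
# only to expose what each whitespace quantifier consumed.
m = re.search(r"[0-9]{2}(\s*)-(\s*?)[0-9]{7}", text)
before_dash = m.group(1)  # greedy \s*
after_dash = m.group(2)   # lazy \s*?
print(len(before_dash), len(after_dash))  # 2 3

# With nothing required after the quantifier, the difference shows up.
greedy_alone = re.match(r"\s*", "   ").group()
lazy_alone = re.match(r"\s*?", "   ").group()
print(repr(greedy_alone), repr(lazy_alone))  # '   ' ''
```

In the full pattern the lazy `\s*?` still ends up taking all three spaces, because the digits that must follow force it to expand. On its own it is satisfied with an empty match.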
Refer to the example below for what is greedy and what is lazy.
The result is:
I'm greeedy and I want 100000000 dollars. This is the most I can get.
I'm too lazy to get so much money, only 100 dollars is enough for me
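A sketch in Python that produces the two lines above. The input string and the patterns `100(0*)` and `100(0*?)` are assumptions picked to reproduce that output, not a confirmed copy of the answer's code:

```python
import re

# Hypothetical input, chosen so that both patterns have something to match.
money = "100000000999"

# Greedy: 0* takes every zero that follows "100".
greedy = re.search(r"100(0*)", money).group()
# Lazy: 0*? takes no zeros at all, since nothing after it requires more.
lazy = re.search(r"100(0*?)", money).group()

print(f"I'm greeedy and I want {greedy} dollars. This is the most I can get.")
print(f"I'm too lazy to get so much money, only {lazy} dollars is enough for me")
```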
Greedy will consume as much as possible. From http://www.regular-expressions.info/repeat.html we see the example of trying to match HTML tags with `<.+>`. Suppose you have the following: `<em>Hello World</em>`
You may think that `<.+>` (`.` means any non-newline character and `+` means one or more) would only match the `<em>` and the `</em>`, when in reality it will be very greedy, and go from the first `<` to the last `>`. This means it will match `<em>Hello World</em>` instead of what you wanted.
Making it lazy (`<.+?>`) will prevent this. By adding the `?` after the `+`, we tell it to repeat as few times as possible, so the first `>` it comes across is where we want to stop the matching.
I'd encourage you to download RegExr, a great tool that will help you explore Regular Expressions - I use it all the time.
Taken From www.regular-expressions.info
Greediness: A greedy quantifier first tries to repeat the token as many times as possible, and gradually gives up matches as the engine backtracks to find an overall match.
Laziness: A lazy quantifier first repeats the token as few times as required, and gradually expands the match as the engine backtracks through the regex to find an overall match.
Example:
test string: stackoverflow
greedy reg expression: `s.*o`
output: stackoverflo (everything up to the last "o")
lazy reg expression: `s.*?o`
output: stacko (everything up to the first "o")
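The same test string can be run through both expressions. This is a short sketch assuming Python's `re` module, and the variable names are only illustrative:

```python
import re

text = "stackoverflow"

# Greedy: .* grabs the rest of the string, then backtracks to the last 'o'.
greedy = re.search(r"s.*o", text).group()
# Lazy: .*? grows one character at a time until the first 'o' is reached.
lazy = re.search(r"s.*?o", text).group()

print(greedy)  # stackoverflo
print(lazy)    # stacko
```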