I'm using the Sublime Text 2 editor. I would like to use a regex to match all the characters between all h1 tags.
Right now I'm using this:
<h1>.+</h1>
It works fine if the h1 tag doesn't contain line breaks.
For example, for
<h1>Hello this is a hedaer</h1>
it works fine.
But it doesn't work if the tag looks like this:
<h1>
Hello this is a hedaer
</h1>
Can someone help me with the syntax?
By default, . matches every character except a newline. In this case you need the DOTALL option, which makes . match any character, including newlines. DOTALL can be specified inline as (?s). For example:
(?s)<h1>.+</h1>
However, you will see that this does not work as intended when the text contains more than one h1 tag. Quantifiers are greedy by default (here, the +), which means they try to consume as many characters as possible, so the match runs from the first <h1> to the last </h1>. You need to make the quantifier lazy (consume as few characters as possible) by adding an extra ? after it, giving +?:
(?s)<h1>.+?</h1>
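To see the difference between the three patterns outside the editor, here is a minimal sketch using Python's re module, which supports the same inline (?s) flag. The sample text and variable names are made up for illustration, and Sublime Text's own regex engine may differ in other details.

```python
import re

# Hypothetical sample text with two multi-line h1 elements.
html = "<h1>\nFirst header\n</h1>\n<p>Some content</p>\n<h1>\nSecond header\n</h1>"

# Without DOTALL, "." cannot cross the newline after <h1>, so nothing matches.
no_dotall = re.findall(r"<h1>.+</h1>", html)

# With DOTALL but a greedy "+", a single match runs from the first <h1>
# to the last </h1>, swallowing everything in between.
greedy = re.findall(r"(?s)<h1>.+</h1>", html)

# With DOTALL and a lazy "+?", each h1 element is matched separately.
lazy = re.findall(r"(?s)<h1>.+?</h1>", html)

print(no_dotall)
print(greedy)
print(lazy)
```

The greedy version returns one match covering the whole sample, while the lazy version returns one match per h1 element.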
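Alternatively, the regex can be <h1>[^<>]*</h1>. In this case you don't need to specify any option, because a negated character class matches newlines too. Note that it will not match an h1 that contains other tags.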
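As a rough check of the character-class alternative, the following sketch (again in Python, with invented sample strings) shows it matching a multi-line h1 without any flag, and failing on an h1 that contains another tag.

```python
import re

# The character class [^<>] matches anything except angle brackets,
# including newlines, so no DOTALL flag is needed.
pattern = re.compile(r"<h1>[^<>]*</h1>")

# Hypothetical inputs for illustration.
multi_line = "<h1>\nHello this is a header\n</h1>"
nested = "<h1>Hello <em>this</em> is a header</h1>"

multi_line_matches = pattern.findall(multi_line)
nested_matches = pattern.findall(nested)

print(multi_line_matches)
print(nested_matches)
```

The second list is empty because the inner <em> tag contains characters the class excludes; the lazy (?s) pattern is the safer choice if the headers may contain markup.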
Since this question is the top Google search result for a regex that finds all the characters between h1 tags, I thought I would give that answer as well, since that was what I was looking for:
(?s)(?<=<h1>)(.+?)(?=</h1>)
That regex, if used on a sample text like <h1>A title</h1> <p>Some content</p> <h1>Another title</h1>, will return only A title for the first match, without the surrounding tags.
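For reference, this is how the lookaround version behaves in Python's re module on the same sample text. This is only a sketch: Python requires a fixed-width lookbehind, which (?<=<h1>) satisfies, and other engines may have different restrictions.

```python
import re

text = "<h1>A title</h1> <p>Some content</p> <h1>Another title</h1>"

# The lookbehind and lookahead assert that the tags are present
# without including them in the match.
pattern = re.compile(r"(?s)(?<=<h1>)(.+?)(?=</h1>)")

# First match only, as a "find next" in an editor would give.
first = pattern.search(text).group(0)

# Every match in the text; findall returns the contents of the single group.
all_matches = pattern.findall(text)

print(first)
print(all_matches)
```

The first match is the text of the first header, and iterating over all matches yields the text of each header in turn.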