Regex select all text between tags

Posted 2019-01-04 07:05

What is the best way to select all the text between two tags, e.g. the text between all the 'pre' tags on a page?

14 answers
一纸荒年 Trace。
#2 · 2019-01-04 07:38
<pre>([\r\n\s]*(?!<\w+.*[\/]*>).*[\r\n\s]*|\s*[\r\n\s]*)<code\s+(?:class="(\w+|\w+\s*.+)")>(((?!<\/code>)[\s\S])*)<\/code>[\r\n\s]*((?!<\w+.*[\/]*>).*|\s*)[\r\n\s]*<\/pre>
Summer. ? 凉城
#3 · 2019-01-04 07:41

You can use "<pre>(.*?)</pre>" (replacing pre with whatever tag name you want) and extract the first group (for more specific instructions, specify a language). Note that this only works if your HTML is very simple and valid.
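
Since no language was specified, here is a minimal Python sketch of that pattern. The sample string and the helper name extract_pre are made up for illustration. One thing to watch: "." does not match newlines by default, so the sketch passes re.DOTALL to handle multi-line blocks.

```python
import re
from typing import List

# Hypothetical sample input; not from the original question.
html = "<pre>first</pre><p>skip</p><pre>second\nline</pre>"

# The lazy quantifier stops at the nearest </pre>;
# re.DOTALL lets "." match newline characters as well.
PRE_RE = re.compile(r"<pre>(.*?)</pre>", re.DOTALL)


def extract_pre(text: str) -> List[str]:
    """Return group 1 (the inner text) of every <pre>...</pre> pair."""
    return PRE_RE.findall(text)


print(extract_pre(html))  # ['first', 'second\nline']
```

This still breaks on tags with attributes (e.g. <pre class="x">) and on nested tags of the same name.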

As other commenters have suggested, if you're doing something complex, use an HTML parser.
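
For the parser route, here is a rough sketch using Python's standard-library html.parser module. The class name PreTextCollector and the helper pre_texts are hypothetical names chosen for this example, and a dedicated HTML library may well be a better fit for messy real-world pages.

```python
from html.parser import HTMLParser


class PreTextCollector(HTMLParser):
    """Collects the text found inside each top-level <pre> element."""

    def __init__(self):
        super().__init__()
        self._depth = 0    # how many <pre> elements are currently open
        self._buffer = []  # text pieces of the <pre> being read
        self.blocks = []   # finished text, one entry per <pre>

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag names, so <PRE> also arrives as "pre".
        if tag == "pre":
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "pre" and self._depth > 0:
            self._depth -= 1
            if self._depth == 0:
                self.blocks.append("".join(self._buffer))
                self._buffer = []

    def handle_data(self, data):
        # Keep text only while inside a <pre>; child tags are dropped.
        if self._depth > 0:
            self._buffer.append(data)


def pre_texts(html):
    parser = PreTextCollector()
    parser.feed(html)
    parser.close()
    return parser.blocks


print(pre_texts('<PRE class="x">a <b>bold</b> b</PRE><p>no</p><pre>c</pre>'))
```

Unlike the regex, this copes with attributes, upper-case tag names, and markup nested inside the element.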

Emotional °昔
#4 · 2019-01-04 07:43

To exclude the delimiting tags:

"(?<=<pre>)(.*?)(?=</pre>)"
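
A small Python sketch of the lookaround pattern above, so the whole match (not just a group) is the inner text. The sample string is invented for illustration. Keep in mind that Python's re module only accepts fixed-width lookbehinds, so this form cannot be adapted to an opening tag with variable attributes.

```python
import re

# Hypothetical sample input; not from the original question.
html = "<pre>alpha</pre> and <pre>beta</pre>"

# Lookbehind and lookahead assert the tags without consuming them,
# so group(0) already excludes <pre> and </pre>.
INNER_RE = re.compile(r"(?<=<pre>)(.*?)(?=</pre>)", re.DOTALL)

matches = [m.group(0) for m in INNER_RE.finditer(html)]
print(matches)  # ['alpha', 'beta']
```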
放我归山
#5 · 2019-01-04 07:46

This seems to be the simplest regular expression of all those I found:

(?:<TAG>)([\s\S]*)(?:<\/TAG>)
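  1. (?:<TAG>) matches the opening tag in a non-capturing group, so it is excluded from the captured text
  2. ([\s\S]*) captures any whitespace or non-whitespace characters, including newlines; it is greedy, so it runs to the last closing tag
  3. (?:<\/TAG>) matches the closing tag in a non-capturing group, so it is also excluded from the captured text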
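
To see how this pattern behaves when there is more than one TAG pair, here is a quick Python comparison of the expression as written against a lazy variant ([\s\S]*?). The lazy variant and the sample string are my additions, not part of the answer.

```python
import re

# Hypothetical sample input with two TAG pairs.
html = "<TAG>one</TAG> middle <TAG>two</TAG>"

# Pattern as given: greedy, so it runs to the LAST closing tag.
greedy = re.findall(r"(?:<TAG>)([\s\S]*)(?:<\/TAG>)", html)

# Lazy variant: stops at the nearest closing tag.
lazy = re.findall(r"(?:<TAG>)([\s\S]*?)(?:<\/TAG>)", html)

print(greedy)  # ['one</TAG> middle <TAG>two']
print(lazy)    # ['one', 'two']
```

With a single pair on the page the two give the same result; they only differ once the tag repeats.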
劫难
#6 · 2019-01-04 07:49

Try this:

(?<=\<any_tag\>)(\s*.*\s*)(?=\<\/any_tag\>)
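
A short Python check of what this pattern accepts, assuming any_tag is a literal placeholder tag name and using invented sample strings. Because "." does not match newlines without a DOTALL-style flag, the pattern covers one line of text with optional surrounding whitespace, but not content that spans several lines.

```python
import re

# Pattern from the answer, with any_tag kept as a placeholder tag name.
ANY_TAG_RE = re.compile(r"(?<=\<any_tag\>)(\s*.*\s*)(?=\<\/any_tag\>)")

# Hypothetical inputs: one line of text padded with newlines,
# and text that itself spans three lines.
one_line = "<any_tag>\n  hello\n</any_tag>"
multi_line = "<any_tag>a\nb\nc</any_tag>"

single = ANY_TAG_RE.search(one_line)
multi = ANY_TAG_RE.search(multi_line)

print(repr(single.group(1)))  # '\n  hello\n'
print(multi)                  # None
```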
ら.Afraid
#7 · 2019-01-04 07:51

I use this solution:

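// Note: this matches each tag itself (e.g. <pre>, </pre>), not the text between tags.
preg_match_all('/<((?!<)(.|\n))*?\>/si', $content, $new);
var_dump($new); // $new[0] holds the full matches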
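
For comparison, a rough Python port of this pattern. I swapped the capturing groups for non-capturing ones so that re.findall returns whole matches, and the sample string is invented. It suggests the expression picks out the tags themselves rather than the text between them.

```python
import re

# Hypothetical sample input; not from the original answer.
html = "<pre>text</pre>"

# Same shape as the PHP pattern: "<", then lazily any character
# that is not "<", up to the first ">".
TAG_RE = re.compile(r"<(?:(?!<)(?:.|\n))*?>", re.S | re.I)

tags = TAG_RE.findall(html)
print(tags)  # ['<pre>', '</pre>']
```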