Regular expression to match a block of text up to

I'm making a simple Textile parser and am trying to write a regular expression for "blockquote" but am having difficulty matching multiple new lines. Example:

bq. first line of quote
second line of quote
third line of quote

not part of the quote

It will be replaced with blockquote tags via preg_replace() so basically it needs to match everything between "bq." and the first double new line it comes across. The best I can manage is to get the first line of the quote. Thanks

Tags: php regex textile

5 Answers

地球回转人心会变

#2 · 2019-01-20 13:12

This accepted answer only captured the last character of the block for me. I ended up using this:

$text =~ /(?s)bq\.(.+?)\n\n/g


放荡不羁爱自由

#3 · 2019-01-20 13:16

Edit: Er, I misread the question; "bq." was significant.

echo preg_replace('/^bq\.(.+?)\n\n/s', '<blockquote>$1</blockquote>', $str, 1);

Data entered via web forms sometimes contains \r\n instead of just \n, which would make it:

echo preg_replace('/^bq\.(.+?)\r\n\r\n/s', '<blockquote>$1</blockquote>', $str, 1);

The question mark makes the match non-greedy (lazy), so the closing blockquote tag is added at the first double line break found and any later double line breaks are left alone. If that is not what you want, take the question mark out.
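The two variants above can probably be folded into one pattern. This is a minimal sketch, assuming input that uses either \n or \r\n consistently; the function name replaceFirstBlockquote is hypothetical and only there for illustration.

```php
<?php
// Hypothetical helper (not part of the answer above): wraps the first "bq." block
// in <blockquote> tags, accepting either \n\n or \r\n\r\n as the terminator.
function replaceFirstBlockquote(string $str): string
{
    // (?:\r?\n){2} matches two consecutive line breaks, each with an optional \r.
    return preg_replace('/^bq\.(.+?)(?:\r?\n){2}/s', '<blockquote>$1</blockquote>', $str, 1);
}

$unix    = "bq. first line\nsecond line\n\nnot part of the quote";
$windows = "bq. first line\r\nsecond line\r\n\r\nnot part of the quote";

echo replaceFirstBlockquote($unix), "\n";
echo replaceFirstBlockquote($windows), "\n";
```

Note that the pattern consumes the terminating blank line, so the text that follows ends up directly after the closing tag.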


劫难

#4 · 2019-01-20 13:17

Try this regex:

(?s)bq\.((?!(\r?\n){2}).)*+

meaning:

(?s)           # enable dot-all option
b              # match the character 'b'
q              # match the character 'q'
\.             # match the character '.'
(              # start capture group 1
  (?!          #   start negative look ahead
    (          #     start capture group 2
      \r?      #       match the character '\r' and match it once or none at all
      \n       #       match the character '\n'
    ){2}       #     end capture group 2 and repeat it exactly 2 times
  )            #   end negative look ahead
  .            #   match any character
)*+            # end capture group 1 and repeat it zero or more times, possessively

The \r?\n matches Windows, *nix and (newer) macOS line breaks. If you need to account for really old Mac computers, add the single \r to it: \r?\n|\r
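Because capture group 1 sits inside the repetition, it ends up holding only the last character matched, which is what another answer in this thread ran into. A minimal sketch of one way around that, assuming PHP's preg_* functions: make the repeated group non-capturing and wrap the whole run in an outer group. The variable names are illustrative.

```php
<?php
$text = "bq. first line of quote\nsecond line of quote\nthird line of quote\n\nnot part of the quote";

// Pattern as given above: group 1 is repeated once per character,
// so after the match it holds only the last character it matched.
preg_match('/(?s)bq\.((?!(\r?\n){2}).)*+/', $text, $asGiven);

// Adjusted for this sketch: the repeated group is non-capturing and an
// outer group captures the whole run of characters.
$wholeBlock = '/(?s)bq\.((?:(?!(?:\r?\n){2}).)*+)/';
$html = preg_replace($wholeBlock, '<blockquote>$1</blockquote>', $text);

echo $asGiven[1], "\n";
echo $html, "\n";
```

Since the look-ahead does not consume anything, the blank line after the quote stays in the output.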


家丑人穷心不美

#5 · 2019-01-20 13:21

Would this work?

'/(.+)\n\n/s'

I believe 's' stands for single line.


一纸荒年 Trace。

#6 · 2019-01-20 13:29

My instincts tell me something like...

preg_match("/^bq\. (.+?)\n\n/s", $input, $matches)

As the answer above says, the s flag after the closing / of the regex means that the . will also match newline characters. Without it, the . does not match a newline, so a match cannot run past the end of a line.

The question mark ? after the .+ denotes a non-greedy match, so the .+ won't match as much as it can; instead it matches the minimum possible, and the \n\n matches the first available double line break.
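To see the difference, here is a small sketch that runs the same pattern with and without the question mark. The sample input is made up for illustration.

```php
<?php
$input = "bq. quoted text\n\nmiddle paragraph\n\nlast paragraph";

// Lazy: .+? grows one character at a time and stops at the first blank line.
preg_match('/^bq\. (.+?)\n\n/s', $input, $lazy);

// Greedy: .+ grabs everything, then backtracks to the last blank line in the input.
preg_match('/^bq\. (.+)\n\n/s', $input, $greedy);

var_dump($lazy[1]);
var_dump($greedy[1]);
```

The lazy version captures only "quoted text", while the greedy version also swallows the middle paragraph.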

To what extent are you planning on supporting features of Textile? Because your RegEx can get pretty complicated, as Textile allows things like...

bq.. This is a block quote

This is still a block quote

or...

bq(funky). This is a block quote belonging to the class funky!

bq{color:red;}. Block quote with red text!

All of which your regex-replace technique won't be able to handle, methinks.
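That said, the class and style forms can at least be approximated with optional groups. The following is only a rough sketch under my own assumptions, not how a real Textile parser does it: the helper name convertBlockquotes is hypothetical, the modifiers are assumed to appear in the order (class){style}, and the extended bq.. form is not attempted at all.

```php
<?php
// Hypothetical helper: converts "bq.", "bq(class)." and "bq{style}." blocks.
// The extended "bq.." form is deliberately left untouched.
function convertBlockquotes(string $text): string
{
    // m: ^ matches at the start of every line; s: . also matches newlines.
    // The look-ahead ends a block at a double line break or at the end of the input.
    $pattern = '/^bq(?:\(([^)]+)\))?(?:\{([^}]+)\})?\. (.+?)(?=(?:\r?\n){2}|\z)/ms';

    return preg_replace_callback($pattern, function (array $m): string {
        $attrs = '';
        if ($m[1] !== '') { // optional (class) part
            $attrs .= ' class="' . htmlspecialchars($m[1]) . '"';
        }
        if ($m[2] !== '') { // optional {style} part
            $attrs .= ' style="' . htmlspecialchars($m[2]) . '"';
        }
        return '<blockquote' . $attrs . '>' . $m[3] . '</blockquote>';
    }, $text);
}

$sample = "bq(funky). This is a block quote belonging to the class funky!\n\n"
        . "bq{color:red;}. Block quote with red text!";

echo convertBlockquotes($sample), "\n";
```

Anything beyond this (nested markup, bq.., escaping of the quoted text itself) would likely push you towards a proper tokenizer rather than a single preg_replace.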
