I need help modifying a regular expression for PHP

Posted 2019-05-27 00:45

Question:

I'm modifying PHP Markdown (a PHP parser of the markup language which is used here on Stack Overflow) trying to implement points 1, 2 and 3 described by Jeff in this blog post. I've easily done the last two, but this one is proving very difficult:

  1. Removed support for intra-word emphasis like_this_example

In fact, in the "normal" Markdown implementation like_this_example would be rendered with the middle word emphasized (like<em>this</em>example). This is very undesirable; I want only a standalone _example_ to become an emphasized "example".

I looked in the source code and found the regex used to do the emphasis:

var $em_relist = array(
    ''  => '(?:(?<!\*)\*(?!\*)|(?<!_)_(?!_))(?=\S|$)(?![.,:;]\s)',
    '*' => '(?<=\S|^)(?<!\*)\*(?!\*)',
    '_' => '(?<=\S|^)(?<!_)_(?!_)',
    );
var $strong_relist = array(
    ''   => '(?:(?<!\*)\*\*(?!\*)|(?<!_)__(?!_))(?=\S|$)(?![.,:;]\s)',
    '**' => '(?<=\S|^)(?<!\*)\*\*(?!\*)',
    '__' => '(?<=\S|^)(?<!_)__(?!_)',
    );
var $em_strong_relist = array(
    ''    => '(?:(?<!\*)\*\*\*(?!\*)|(?<!_)___(?!_))(?=\S|$)(?![.,:;]\s)',
    '***' => '(?<=\S|^)(?<!\*)\*\*\*(?!\*)',
    '___' => '(?<=\S|^)(?<!_)___(?!_)',
    );

I tried opening it in RegexBuddy, but that wasn't enough; after half an hour of working on it I still don't know where to start. Any suggestions?

Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.

Answer 1:

I was able to grab only individual _enclosed_ words via:

$input = 'test of _this_ vs stuff_like_this...and here is _anothermatch_ and_another_fake_string';
$pattern = '#(?<=\s|^)(?<!_)(_[^_]*_)(?!_)#is';
preg_match_all($pattern, $input, $matches);
print_r($matches);

I'm not sure exactly how that would fit into the code above, though. You would probably need to pair it with the two patterns below to cover the two- and three-underscore cases:

// Two underscores (strong)
$strong_pattern = '#(?<=\s|^)(?<!_)(__[^_]*__)(?!_)#is';
// Three underscores (strong + emphasis)
$em_strong_pattern = '#(?<=\s|^)(?<!_)(___[^_]*___)(?!_)#is';
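
To connect these patterns to actual output, one option is to run them as replacements, longest delimiter first. The following is only a sketch under a few assumptions: the function name `render_underscore_emphasis` and the HTML tags are made up for illustration and are not part of PHP Markdown, and `[^_]+` is used in place of `[^_]*` so that empty spans are skipped.

```php
<?php
// Hypothetical helper, not part of PHP Markdown: converts whole-word
// underscore emphasis to HTML and leaves intra-word underscores alone.
function render_underscore_emphasis(string $text): string
{
    // Longest delimiter first, mirroring the em_strong / strong / em
    // split used by the arrays in the question.
    $rules = array(
        '#(?<=\s|^)(?<!_)___([^_]+)___(?!_)#' => '<strong><em>$1</em></strong>',
        '#(?<=\s|^)(?<!_)__([^_]+)__(?!_)#'   => '<strong>$1</strong>',
        '#(?<=\s|^)(?<!_)_([^_]+)_(?!_)#'     => '<em>$1</em>',
    );
    foreach ($rules as $pattern => $replacement) {
        $text = preg_replace($pattern, $replacement, $text);
    }
    return $text;
}

echo render_underscore_emphasis('test of _this_ vs stuff_like_this and __bold__'), "\n";
```

With this input, `_this_` and `__bold__` are wrapped in tags while `stuff_like_this` is left untouched, because the lookbehind requires whitespace or the start of the string before the opening underscore. A side effect of that lookbehind is that emphasis directly after punctuation, such as `(_this_)`, is not recognised.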


Answer 2:

I use RegexBuddy too. :)

You may want to try the following code:

<?php

$line1 = "like_this_example";
$line2 = "I want only _example_ to become example";
$pattern = '/\b_(?P<word>.*?)_\b/si';

if (preg_match($pattern, $line1, $matches))
{
  $result = $matches['word'];
  var_dump($result);
}

if (preg_match($pattern, $line2, $matches))
{
  $result = $matches['word'];
  var_dump($result);
}

?>
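
The reason this pattern skips like_this_example is that `_` counts as a word character in PCRE, so `\b` only holds before an underscore that follows a non-word character or the start of the string. Below is a minimal sketch of turning the match into markup with `preg_replace`; the `<em>` output and the `$lines` array are assumptions for illustration and do not come from PHP Markdown.

```php
<?php
// Same pattern as above. \b fails between a letter and "_" because both
// are word characters, so an intra-word underscore never starts a match.
$pattern = '/\b_(?P<word>.*?)_\b/s';

$lines = array(
    'like_this_example',
    'I want only _example_ to become example',
);

foreach ($lines as $line) {
    // $1 is the first capture group, which is also the one named "word".
    echo preg_replace($pattern, '<em>$1</em>', $line), "\n";
}
```

The first line is printed unchanged and the second becomes `I want only <em>example</em> to become example`. Because the check is a word boundary rather than whitespace, this form also accepts emphasis next to punctuation, for instance `(_example_)`.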