Regex extract variables from [shortcode]

Posted 2019-01-19 08:27

Question:

After migrating some content from WordPress to Drupal, I've got some shortcodes that I need to convert:

String content:

Irrelevant tekst... [sublimevideo class="sublime" poster="http://video.host.com/_previews/600x450/sbx-60025-00-da-ANA.png" src1="http://video.host.com/_video/H.264/LO/sbx-60025-00-da-ANA.m4v" src2="(hd)http://video.host.com/_video/H.264/HI/sbx-60025-00-da-ANA.m4v" width="560" height="315"] ..more irrelevant text.

I need to find all variables within the shortcode [sublimevideo ...] and turn them into an array:

Array (
    class => "sublime"
    poster => "http://video.host.com/_previews/600x450/sbx-60025-00-da-ANA.png"
    src1 => "http://video.host.com/_video/H.264/LO/sbx-60025-00-da-ANA.m4v"
    src2 => "(hd)http://video.host.com/_video/H.264/HI/sbx-60025-00-da-ANA.m4v"
    width => "560"
    height => "315"
)

And preferably handle multiple instances of the shortcode.

I guess it can be done with preg_match_all() but I've had no luck.

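Answer 1:

This will give you what you want for the first [sublimevideo ...] shortcode in the string. It splits the attributes on spaces, so it assumes that no attribute value contains a space.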

$data = 'Irrelevant tekst... [sublimevideo class="sublime" poster="http://video.host.com/_previews/600x450/sbx-60025-00-da-ANA.png" src1="http://video.host.com/_video/H.264/LO/sbx-60025-00-da-ANA.m4v" src2="(hd)http://video.host.com/_video/H.264/HI/sbx-60025-00-da-ANA.m4v" width="560" height="315"] ..more irrelevant text.';

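$dat = array();
// Capture everything between "[sublimevideo " and the closing "]".
preg_match("/\[sublimevideo (.+?)\]/", $data, $dat);

// The last element is capture group 1: the raw attribute string.
$dat = array_pop($dat);
$dat = explode(" ", $dat);
$params = array();
foreach ($dat as $d) {
    // Limit of 2 so an "=" inside a value (e.g. a URL query string) is kept.
    list($opt, $val) = explode("=", $d, 2);
    $params[$opt] = trim($val, '"');
}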

print_r($params);
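
The question also asks for multiple instances of the shortcode. A minimal sketch of that, swapping preg_match() for preg_match_all() and keeping the same space-splitting approach, might look like the following. The function name extractSublimeVideoParams is made up for illustration, and the sketch still assumes that attribute values contain no spaces.

```php
<?php
// Hypothetical helper (name not from the original answer): returns one
// attribute array per [sublimevideo ...] shortcode found in $content.
// Assumes attribute values contain no spaces, as in the question's sample.
function extractSublimeVideoParams(string $content): array
{
    $all = array();
    // preg_match_all() finds every occurrence; $found[1] holds the
    // raw attribute string (capture group 1) of each shortcode.
    preg_match_all('/\[sublimevideo (.+?)\]/', $content, $found);
    foreach ($found[1] as $args) {
        $params = array();
        foreach (explode(' ', $args) as $pair) {
            // Limit of 2 keeps any "=" inside a value intact.
            list($opt, $val) = explode('=', $pair, 2);
            $params[$opt] = trim($val, '"');
        }
        $all[] = $params;
    }
    return $all;
}

$content = 'a [sublimevideo class="sublime" width="560"] b '
         . '[sublimevideo class="other" width="320"] c';
print_r(extractSublimeVideoParams($content));
```

Each element of the returned array corresponds to one shortcode, in the order the shortcodes appear in the content.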

In anticipation of the next challenge you will face when processing shortcodes: you can use preg_replace_callback() to replace each shortcode with its resultant markup.

$data = 'Irrelevant tekst... [sublimevideo class="sublime" poster="http://video.host.com/_previews/600x450/sbx-60025-00-da-ANA.png" src1="http://video.host.com/_video/H.264/LO/sbx-60025-00-da-ANA.m4v" src2="(hd)http://video.host.com/_video/H.264/HI/sbx-60025-00-da-ANA.m4v" width="560" height="315"] ..more irrelevant text.';

function processShortCode($matches){
    // $matches[1] is the shortcode name, $matches[2] its raw attribute string.
    $dat = explode(" ", $matches[2]);
    $params = array();
    foreach ($dat as $d){
        // Limit of 2 so an "=" inside a value is kept.
        list($opt, $val) = explode("=", $d, 2);
        $params[$opt] = trim($val, '"');
    }
    switch($matches[1]){
        case "sublimevideo":
            // Return the resultant markup for the shortcode here.
            return print_r($params, true);
        default:
            // Leave unrecognised bracketed text untouched instead of deleting it.
            return $matches[0];
    }
}
$data = preg_replace_callback("/\[(\w+) (.+?)]/", "processShortCode", $data);
echo $data;
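
The callback above returns print_r() output as a placeholder. As a rough sketch of what returning real markup could look like, the function below builds an HTML5 video tag from the parsed parameters. The function name renderSublimeVideo and the choice of a plain video element are assumptions for illustration only; the markup your Drupal site actually needs may differ.

```php
<?php
// Hypothetical renderer: the <video> markup is an illustration only,
// not the markup SublimeVideo or Drupal actually requires.
function renderSublimeVideo(array $params): string
{
    // Escape values before placing them inside HTML attributes.
    $src    = htmlspecialchars($params['src1'] ?? '', ENT_QUOTES);
    $poster = htmlspecialchars($params['poster'] ?? '', ENT_QUOTES);
    // Dimensions are cast to integers; missing ones become 0.
    $width  = (int) ($params['width'] ?? 0);
    $height = (int) ($params['height'] ?? 0);

    return '<video src="' . $src . '" poster="' . $poster . '"'
         . ' width="' . $width . '" height="' . $height . '" controls></video>';
}

echo renderSublimeVideo(array(
    'src1'   => 'http://video.host.com/_video/H.264/LO/sbx-60025-00-da-ANA.m4v',
    'poster' => 'http://video.host.com/_previews/600x450/sbx-60025-00-da-ANA.png',
    'width'  => '560',
    'height' => '315',
));
```

Inside processShortCode(), the "sublimevideo" case could then return renderSublimeVideo($params) instead of the print_r() output.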


Answer 2:

You could use the following RegEx to match the variables:

$regex = '/(\w+)\s*=\s*"(.*?)"/';

I would suggest first matching the sublimevideo shortcode and capturing its contents as a string with the following RegEx:

$pattern = '/\[sublimevideo(.*?)\]/';

To get the correct array keys I used this code:

// $string is the string content you specified
preg_match_all($regex, $string, $matches);

// $matches[1] holds the attribute names, $matches[2] the values.
$sublimevideo = array();
for ($i = 0; $i < count($matches[1]); $i++) {
    $sublimevideo[$matches[1][$i]] = $matches[2][$i];
}

This returns the following array (the one that you requested):

Array
(
    [class] => sublime
    [poster] => http://video.host.com/_previews/600x450/sbx-60025-00-da-ANA.png
    [src1] => http://video.host.com/_video/H.264/LO/sbx-60025-00-da-ANA.m4v
    [src2] => (hd)http://video.host.com/_video/H.264/HI/sbx-60025-00-da-ANA.m4v
    [width] => 560
    [height] => 315
)
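
The snippet above runs $regex over the whole string, so attributes from several shortcodes would end up in one array. A sketch of the two-step approach suggested earlier, applying $pattern first and then $regex to each captured shortcode, could look like this. The function name parseSublimeVideoShortcodes is hypothetical.

```php
<?php
$pattern = '/\[sublimevideo(.*?)\]/';  // whole shortcode; group 1 = attribute text
$regex   = '/(\w+)\s*=\s*"(.*?)"/';    // one name="value" pair

// Hypothetical helper (name not from the original answer): returns one
// attribute array per shortcode found in $string.
function parseSublimeVideoShortcodes(string $string, string $pattern, string $regex): array
{
    $result = array();
    // Step 1: isolate the attribute text of every shortcode.
    preg_match_all($pattern, $string, $shortcodes);
    foreach ($shortcodes[1] as $attributeText) {
        // Step 2: pull the name="value" pairs out of that text.
        preg_match_all($regex, $attributeText, $matches);
        // array_combine() pairs names (group 1) with values (group 2).
        $result[] = array_combine($matches[1], $matches[2]);
    }
    return $result;
}

$string = '[sublimevideo class="sublime" width="560"] text '
        . '[sublimevideo class="big video" width="640"]';
print_r(parseSublimeVideoShortcodes($string, $pattern, $regex));
```

Because the values are matched between quotes rather than split on spaces, a value such as "big video" survives intact.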


Answer 3:

As described in this answer, I'd suggest letting WordPress do the work for you using the get_shortcode_regex() function.

$pattern = get_shortcode_regex();
preg_match_all("/$pattern/", $wp_content, $matches);

This will give you an array that is easy to work with and shows the various shortcodes and their associated attributes in your content. It isn't the most obvious array format, so print it and take a look so you know how to manipulate the data you need.