Regex for HTML attributes in PHP

Posted 2019-03-04 06:15

Question:

I am trying to parse a string of HTML tag attributes in PHP. There are three possible cases:

attribute="value"  // the quoted value can contain anything, including escaped quotes
attribute          // no value
attribute=value    // unquoted value, alphanumeric characters only

Can someone help me find a regex that captures the attribute name in the first group and the attribute value (if present) in the second?

Answer 1:

Give this a try and see if it is what you want to extract from the tags.

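// Match each attribute in one of three forms: name="value", name=value, or a bare name.
// The leading space keeps the tag name itself out of the matches.
preg_match_all('/( \\w+="\\w+"| \\w+=\\w+| \\w+)/i', 
    $content, 
    $result, 
    PREG_PATTERN_ORDER);
// Keep only the full matches, one string per attribute.
$result = $result[0];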

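The regex matches each attribute, skips the tag name, and collects the full matches in an array that you can loop over. Each match is the whole attribute, including its leading space, so the name and value are not split into separate groups, and a quoted value is matched only if it consists of word characters.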
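To get the name in the first group and the value in the second, as the question asks, one option is a branch-reset group. The following is only a sketch under a few assumptions: quotes inside a value are escaped with a backslash, names consist of word characters and hyphens, and there is no whitespace around the equals sign. The function name parseAttributes and the sample input are made up for illustration.

```php
<?php
// Hypothetical helper: returns [name => value], with null for bare attributes.
function parseAttributes(string $attrs): array
{
    // Group 1: attribute name. Group 2: value, quoted or unquoted.
    // (?| ... ) is a branch-reset group, so both value forms land in group 2.
    // The quoted branch accepts any character except " and \, or a backslash escape.
    $pattern = '/([\w-]+)(?:=(?|"((?:[^"\\\\]|\\\\.)*)"|(\w+)))?/s';

    preg_match_all($pattern, $attrs, $matches, PREG_SET_ORDER);

    $result = [];
    foreach ($matches as $m) {
        // Trailing unmatched groups are omitted, so a bare attribute has no $m[2].
        $result[$m[1]] = $m[2] ?? null;
    }
    return $result;
}

// Sample input covering all three cases from the question.
$parsed = parseAttributes('class="btn \"big\"" disabled id=main');
print_r($parsed);
```

The captured value keeps its backslashes, so it still needs unescaping (for example with stripslashes) if the raw text is wanted.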



Answer 2:

Never use regular expressions to process HTML, especially if you're writing a library and don't know what your input will look like. Use a parser instead; take a look at SimpleXML, for example.
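As a rough illustration of the parser approach: SimpleXML expects well-formed XML, so unquoted or valueless attributes would fail to load. This sketch therefore swaps in PHP's DOMDocument, which has an HTML mode. It assumes the input is a complete tag rather than a bare attribute string, and the sample tag is made up.

```php
<?php
// Made-up sample tag with an unquoted, a valueless, and a quoted attribute.
$html = '<input type=text disabled value="a &quot;quoted&quot; word">';

// Collect parser warnings internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// loadHTML wraps the fragment in html/body, so look the element up by tag name.
$input = $doc->getElementsByTagName('input')->item(0);

$attrs = [];
foreach ($input->attributes as $attr) {
    // Each item is a DOMAttr; entities in the value are already decoded.
    $attrs[$attr->name] = $attr->value;
}
print_r($attrs);
```

The value reported for a valueless attribute such as disabled depends on the underlying parser, so it is safer to test for its presence than to compare its value.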