I am trying to parse a string of HTML tag attributes in PHP. There are 3 possible cases:
attribute="value" //inside the quotes there can be anything, including escaped quotes
attribute //without a value
attribute=value //without quotes, so the value contains only alphanumeric characters
Can someone help me find a regex that captures the attribute name in the first group and the attribute value (if present) in the second?
Give this a try and see if it is what you want to extract from the tags.
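// Match each attribute (including its leading space) in one of three forms:
//   name="value"  (the value may contain spaces and backslash-escaped quotes)
//   name=value    (unquoted, word characters only)
//   name          (no value)
preg_match_all('/ \\w+="(?:[^"\\\\]|\\\\.)*"| \\w+=\\w+| \\w+/',
    $content,
    $result,
    PREG_PATTERN_ORDER);
$result = $result[0]; // keep only the full matches, one per attribute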
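The regex pulls out each attribute, skips the tag name (every match must start with a space), and puts the matches in an array so you can loop over the attributes one by one. Each element is the full attribute text, so you still need to split it on the first = to separate the name from the value.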
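If you want the name and the value in separate capture groups, as the question asks, something along these lines might work. This is only a sketch: parse_attributes is a made-up helper name, it assumes the input is just the attribute part of a tag (no tag name), and it leaves backslash escapes inside quoted values untouched. It uses a branch-reset group, (?|...), so the quoted and unquoted value both land in group 2.

```php
<?php
// Hypothetical helper (not from the original answer): returns an array
// mapping attribute name => value, or null when the attribute has no value.
function parse_attributes(string $attrs): array
{
    // Group 1: attribute name.
    // Group 2: value, either quoted (escaped quotes allowed) or unquoted.
    // (?|...) is a branch-reset group, so both alternatives share group 2.
    $pattern = '/([\w-]+)(?:=(?|"((?:[^"\\\\]|\\\\.)*)"|(\w+)))?/';

    preg_match_all($pattern, $attrs, $matches, PREG_SET_ORDER);

    $result = [];
    foreach ($matches as $m) {
        // Group 2 is absent from the match when the attribute has no value.
        $result[$m[1]] = $m[2] ?? null;
    }
    return $result;
}

$parsed = parse_attributes('href="a \"quoted\" b" disabled class=foo');
var_dump($parsed);
```

The returned array has one key per attribute, so you can loop over it with foreach ($parsed as $name => $value). Anything outside the three cases in the question (single-quoted values, for example) is not handled.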
Never ever use regular expressions for processing HTML, especially if you're writing a library and don't know what your input will look like. Take a look at SimpleXML, for example.
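As a rough illustration of the parser route, here is a sketch that uses DOMDocument rather than SimpleXML. I swapped it in because valueless attributes are not well-formed XML, while DOMDocument::loadHTML tolerates them. It assumes the dom extension is enabled and that you have the whole tag, not only the attribute string. tag_attributes is a made-up helper name.

```php
<?php
// Hypothetical helper: parses an HTML fragment and returns the attributes
// of the first element with the given tag name as name => value.
function tag_attributes(string $html, string $tagName): array
{
    $doc = new DOMDocument();

    // Collect parser warnings internally instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $element = $doc->getElementsByTagName($tagName)->item(0);
    if ($element === null) {
        return []; // no such tag in the fragment
    }

    $attributes = [];
    foreach ($element->attributes as $attr) {
        // $attr is a DOMAttr; valueless attributes still appear as keys.
        $attributes[$attr->name] = $attr->value;
    }
    return $attributes;
}

var_dump(tag_attributes('<input type="text" name=q disabled>', 'input'));
```

The parser takes care of quoting rules, so there is no pattern to maintain when the input gets messier.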