Here is what I need to be able to do:
I need to match the following tag:
<SPAN style="TEXT-DECORATION: underline">text sample</SPAN>
I need to replace the span with an HTML3-compliant tag but keep the text in between. After replacement, the final tag should look like this:
<u>text sample</u>
I'm just not good with regular expressions and can't seem to come up with the answer.
Thank you in advance.
DO NOT USE REGULAR EXPRESSIONS TO PARSE HTML
do not use regular expressions to parse HTML
do not use regular expressions to parse HTML
do not use regular expressions to parse HTML
do not use regular expressions to parse HTML
do not use regular expressions to parse HTML
Do you need more clarification?
Use DOMDocument::loadHTML() ;)
For the basic example that you've given.
will do the trick. The regex pattern is quite simple: it is exactly the text you're looking for (with the quotes and '/' escaped), plus a (.+?) that captures every character up to the first closing SPAN tag. This assumes that your code is consistently formatted; you could append an 'i' to the end of $pattern to make it case-insensitive.
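The snippet this answer refers to is not shown here, so the following is only a sketch of what the description suggests, assuming PHP's preg_replace(). The variable names $html, $replacement and $result are placeholders, and a single-quoted string is used so that only the '/' needs escaping.

```php
<?php
// Assumed reconstruction from the description above; not the answerer's original code.
$html = '<SPAN style="TEXT-DECORATION: underline">text sample</SPAN>';

// Literal opening tag, a lazy capture group, then the literal closing tag.
// Inside a single-quoted PHP string only the regex delimiter '/' has to be escaped.
$pattern = '/<SPAN style="TEXT-DECORATION: underline">(.+?)<\/SPAN>/';

// $1 refers to whatever (.+?) captured, i.e. the text between the tags.
$replacement = '<u>$1</u>';

$result = preg_replace($pattern, $replacement, $html);
echo $result, PHP_EOL; // prints <u>text sample</u>
```

Changing the final delimiter to '/i' would make the match case-insensitive, as mentioned above.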
Note that this isn't really the right way of doing it.
You'll need several lines like this:
etc. However, if there's any possibility that the tags won't exactly match those regular expressions (which is usually the case, except for very simple machine-generated HTML), doing this with regular expressions becomes fiendishly complicated, and you'd be better off using a parser of some kind.
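The "several lines" are not shown here. As a rough illustration of the idea, one pattern per inline style could be passed to preg_replace() in a single call. Only the underline rule comes from the question; the bold and italic rules, and the names $rules and $converted, are assumptions made up for this sketch.

```php
<?php
// Hypothetical style-to-tag rules; only the underline case appears in the question.
$rules = [
    '/<span style="text-decoration: underline">(.+?)<\/span>/i' => '<u>$1</u>',
    '/<span style="font-weight: bold">(.+?)<\/span>/i'          => '<b>$1</b>',
    '/<span style="font-style: italic">(.+?)<\/span>/i'         => '<i>$1</i>',
];

$html = '<SPAN style="TEXT-DECORATION: underline">one</SPAN> and '
      . '<SPAN style="FONT-WEIGHT: bold">two</SPAN>';

// preg_replace() accepts arrays and applies each pattern/replacement pair in order.
$converted = preg_replace(array_keys($rules), array_values($rules), $html);
echo $converted, PHP_EOL; // prints <u>one</u> and <b>two</b>
```

Every variation in spacing, quoting or attribute order would need yet another rule, which is where this approach stops scaling.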
Regular expressions are not designed for tag manipulation.
If you have any form of nesting going on, it gets messy.
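To make the nesting problem concrete, here is a small sketch that assumes a pattern like the one discussed in this thread. The nested input and the names $nested and $broken are invented for the example.

```php
<?php
$pattern = '/<span style="text-decoration: underline">(.+?)<\/span>/i';

// An underlined span that itself contains another span.
$nested = '<span style="text-decoration: underline">outer <span class="x">inner</span> tail</span>';

// The lazy group stops at the FIRST </span>, which belongs to the inner span.
$broken = preg_replace($pattern, '<u>$1</u>', $nested);
echo $broken, PHP_EOL;
// prints <u>outer <span class="x">inner</u> tail</span>
```

The result closes the u element before the inner span is closed and leaves a stray closing span tag behind, so the markup is no longer well formed.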
However, given the very simple example provided you could perhaps do this:
But this is flawed in many ways, and you are much better off using a tool designed for manipulating tags instead.
Have a look at DOMDocument->loadHTML() and related functions.
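As a starting point, a minimal sketch of the parser-based approach might look like the following. It assumes the DOM extension is enabled and that libxml is recent enough to support the LIBXML_HTML_NOIMPLIED and LIBXML_HTML_NODEFDTD flags. The surrounding paragraph in $html and the variable names are invented for the example.

```php
<?php
$html = '<p>Some <SPAN style="TEXT-DECORATION: underline">text sample</SPAN> here</p>';

$doc = new DOMDocument();
// These flags ask libxml not to add <html>/<body> wrappers or a doctype.
$doc->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

// Copy the node list to an array first: the list is live and changes as nodes are replaced.
foreach (iterator_to_array($doc->getElementsByTagName('span')) as $span) {
    // Normalise case and spaces so "TEXT-DECORATION: underline" and similar variants match.
    $style = str_replace(' ', '', strtolower($span->getAttribute('style')));
    if (strpos($style, 'text-decoration:underline') === false) {
        continue; // leave unrelated spans alone
    }

    // Move every child of the span into a new <u>, then swap the elements.
    $u = $doc->createElement('u');
    while ($span->firstChild) {
        $u->appendChild($span->firstChild);
    }
    $span->parentNode->replaceChild($u, $span);
}

$out = $doc->saveHTML();
echo $out;
```

Because the children are moved as nodes rather than as text, nested markup inside the span survives the conversion, which is the case the regular expression approach handles badly.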