I need to look inside a string of HTML and change every <img> tag whose src attribute is a relative address so that it uses an absolute URL. So this:
<img src="puppies.jpg">
needs to become:
<img src="http://sitename.com/path/puppies.jpg">
while ignoring <img> tags whose src attribute is already absolute.
I'm using PHP and assume that I'll need to run this through preg_replace(). Help! And thanks!
This is not a job for a regular expression. It's a job for an XML/DOM parser.
I'd give DOMDocument a shot.
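$DOM = new DOMDocument;
$DOM->loadHTML($html);

$imgs = $DOM->getElementsByTagName('img');
foreach ($imgs as $img) {
    $src = $img->getAttribute('src');
    // Prefix only relative addresses: skip anything that has a scheme
    // (http:, https:, data:, ...) or starts with // (protocol-relative).
    if (parse_url($src, PHP_URL_SCHEME) === null && strpos($src, '//') !== 0) {
        $img->setAttribute('src', "http://sitename.com/path/$src");
    }
}

$html = $DOM->saveHTML();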
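One thing to watch for: loadHTML() builds a full document, so saveHTML() returns the fragment wrapped in a doctype, <html> and <body>. Below is a minimal sketch of one way around that, wrapping the fragment in a <div> and serializing only its children. The function name absolutizeImgSrc and its $base parameter are made up for illustration, character-encoding handling is left out, and the URL is built by plain prefixing rather than full relative-URL resolution.

```php
<?php
// Hypothetical helper: the name and parameters are illustrative, not from the thread.
function absolutizeImgSrc(string $html, string $base): string
{
    $dom = new DOMDocument();

    // Collect libxml warnings (HTML5 tags, sloppy markup) instead of emitting them.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment so it has a single, known container element.
    $dom->loadHTML('<div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    foreach ($dom->getElementsByTagName('img') as $img) {
        $src = $img->getAttribute('src');
        // Skip empty values, anything with a scheme (http:, https:, data:)
        // and protocol-relative URLs such as //cdn.example/a.png.
        if ($src === '' || preg_match('#^(?:[a-z][a-z0-9+.-]*:|//)#i', $src)) {
            continue;
        }
        // Plain prefixing, as in the loop above; not full URL resolution.
        $img->setAttribute('src', rtrim($base, '/') . '/' . ltrim($src, '/'));
    }

    // The wrapper is the first <div> in document order; emit only its children
    // so that no doctype, <html> or <body> ends up in the result.
    $wrapper = $dom->getElementsByTagName('div')->item(0);
    $out = '';
    foreach ($wrapper->childNodes as $child) {
        $out .= $dom->saveHTML($child);
    }
    return $out;
}

// Usage with the example from the question:
echo absolutizeImgSrc('<p><img src="puppies.jpg"></p>', 'http://sitename.com/path/');
```

The scheme check is the only place a regular expression appears, and it runs on the attribute value rather than on the HTML itself.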
"This is not a job for a regular expression. It's a job for an XML/DOM parser."
Nope, it's not a job for a parser either. If you just want to add a prefix to each src attribute, it's best to use simple string functions and not even think about XML, regex or DOM parsing:
$str = str_replace('<img src="', '<img src="http://prefix', $str);
You can clean up the wrongly prefixed links (the ones that were already absolute) afterwards. Note that this line only catches values that start with http://:
$str = str_replace('<img src="http://prefixhttp://', '<img src="http://', $str);
Do not blow up your code with regexp/dom if you can avoid it.
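For completeness, here is a rough sketch of the same two-step idea wrapped in a function, with the cleanup extended to https:// and protocol-relative (//) addresses. The function name prefixImgSrc and the $prefix parameter are hypothetical. Like the one-liners above, it only touches tags written exactly as <img src=" (lowercase, one space, src as the first attribute, double quotes); anything else is left unchanged.

```php
<?php
// Hypothetical helper: the name and parameters are illustrative, not from the thread.
function prefixImgSrc(string $html, string $prefix): string
{
    $needle = '<img src="';

    // Step 1: put the prefix in front of every src that directly follows "<img ".
    $html = str_replace($needle, $needle . $prefix, $html);

    // Step 2: remove the prefix again where the original value was already absolute.
    foreach (['http://', 'https://', '//'] as $start) {
        $html = str_replace($needle . $prefix . $start, $needle . $start, $html);
    }

    return $html;
}

// Usage: the first tag gets the prefix, the second is restored by the cleanup step.
echo prefixImgSrc(
    '<img src="puppies.jpg"><img src="https://other.example/a.png">',
    'http://sitename.com/path/'
);
```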