Possible Duplicate:
How to extract img src, title and alt from html using php?
Hi,
I have found a solution to get the first image from a string:
preg_match('~<img[^>]*src\s?=\s?[\'"]([^\'"]*)~i',$string, $matches);
But I can't manage to get all the images from the string.
One more thing: if an image has alternative text (the alt attribute), how can I get that too and save it to another variable?
Thanks in advance,
Ilija
Don't do this with regular expressions. Instead, parse the HTML. Take a look at Parse HTML With PHP And DOM. This is a standard feature in PHP 5 (the DOM extension is enabled by default). The logic for getting images is roughly:
$dom = new DOMDocument;
// Set parser options before loading; they have no effect afterwards.
$dom->preserveWhiteSpace = false;
$dom->loadHTML($html);
$images = $dom->getElementsByTagName('img');
foreach ($images as $image) {
    echo $image->getAttribute('src'), "\n";
}
This should be trivial to adapt to reading other attributes of each image, such as alt.
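To also capture the alt text the question asks about, the same loop can collect both attributes into an array. This is only a minimal sketch, assuming the markup is already in a string; the function name extractImages and the sample HTML are made up for illustration, and it assumes PHP 7 or later for the type declarations.

```php
<?php
// Hypothetical helper: returns one ['src' => ..., 'alt' => ...] entry per <img>.
function extractImages(string $html): array
{
    // Real-world HTML is rarely valid; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    $dom->loadHTML($html);

    // Discard the collected warnings and restore the previous error mode.
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $images = [];
    foreach ($dom->getElementsByTagName('img') as $img) {
        $images[] = [
            'src' => $img->getAttribute('src'),
            // getAttribute() returns an empty string when the attribute is missing.
            'alt' => $img->getAttribute('alt'),
        ];
    }
    return $images;
}

// Sample input, invented for illustration.
$html = '<p><img src="a.png" alt="First"><img src="b.jpg"></p>';
foreach (extractImages($html) as $image) {
    echo $image['src'], ' | ', $image['alt'], "\n";
}
```

Returning an array rather than echoing inside the loop keeps the src and alt values available as separate variables for whatever comes next.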
This is what I tried, but I can't get it to print the value of src:
$dom = new domDocument;
/*** load the html into the object ***/
$dom->loadHTML($html);
/*** discard white space ***/
$dom->preserveWhiteSpace = false;
/*** the table by its tag name ***/
$images = $dom->getElementsByTagName('img');
/*** loop over the table rows ***/
foreach ($images as $img)
{
/*** get each column by tag name ***/
$url = $img->getElementsByTagName('src');
/*** echo the values ***/
echo $url->nodeValue;
echo '<hr />';
}
EDIT: I solved this problem:
$dom = new DOMDocument;
/*** discard white space (must be set before loading) ***/
$dom->preserveWhiteSpace = false;
/*** load the html into the object ***/
$dom->loadHTML($string);
$images = $dom->getElementsByTagName('img');
foreach ($images as $img)
{
    /*** src and alt are attributes, so read them with getAttribute() ***/
    $url = $img->getAttribute('src');
    $alt = $img->getAttribute('alt');
    echo "Title: $alt<br>$url<br>";
}
Note that regular expressions are a poor fit for parsing anything built from nested, matched delimiters, and HTML tags are exactly that.
You'd be better off using the DOMDocument class.
You assume that you can parse HTML using regular expressions. That may work for some sites, but not for all of them. Since you are presumably dealing with only a subset of all web pages, it would help to know what that subset is; there may be a much easier way to parse that HTML from PHP.
Look at preg_match_all to get all matches.
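If you do stay with the regex, preg_match_all takes the same pattern as preg_match but collects every match instead of stopping at the first. Here is a minimal sketch that reuses the pattern from the question; the sample string is made up for illustration, and the caveats above about unusual markup still apply.

```php
<?php
// Sample input, invented for illustration.
$string = '<img src="one.png" alt="1"> text <IMG class="x" src=\'two.jpg\'>';

// Same pattern as in the question; only the function changes.
$count = preg_match_all('~<img[^>]*src\s?=\s?[\'"]([^\'"]*)~i', $string, $matches);

// With the default PREG_PATTERN_ORDER flag, $matches[1] holds every captured src value.
foreach ($matches[1] as $src) {
    echo $src, "\n";
}
```

The return value is the number of full matches found, so checking $count is a simple way to tell "no images" apart from a failed call, which returns false.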