I would like to create a page where all images which reside on my website are listed with title and alternative representation.
I have already written a small program to find and load all the HTML files, but now I am stuck on how to extract src, title and alt from this HTML:
<img src="/image/fluffybunny.jpg" title="Harvey the bunny" alt="a cute little fluffy bunny" />
I guess this should be done with some regex, but since the order of the attributes may vary, and I need all of them, I don't really know how to parse this in an elegant way (I could do it the hard way, character by character, but that's painful).
You can use simplehtmldom, which supports most of the jQuery selectors. An example is given below.
I used preg_match to do it.
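A minimal sketch of the preg_match approach, assuming the input string holds a single `<img>` tag and that attribute values are double-quoted. The function name `get_img_attributes` is made up for illustration. It runs one pattern per attribute, so the order of the attributes in the tag does not matter:

```php
<?php
// Hypothetical helper: pull src, title and alt out of a string that holds
// a single <img> tag, using one preg_match call per attribute.
function get_img_attributes(string $imgTag): array
{
    $result = [];
    foreach (['src', 'title', 'alt'] as $name) {
        // The lookbehind stops "src" from matching inside names like "data-src".
        // Assumption: values are double-quoted; single-quoted or unquoted
        // values are not handled by this pattern.
        $pattern = '/(?<![\w-])' . $name . '\s*=\s*"([^"]*)"/i';
        $result[$name] = preg_match($pattern, $imgTag, $matches) === 1
            ? $matches[1]
            : null; // attribute not present
    }
    return $result;
}

$tag = '<img src="/image/fluffybunny.jpg" title="Harvey the bunny" alt="a cute little fluffy bunny" />';
print_r(get_img_attributes($tag));
```

This is only suitable for small, well-formed snippets like the one in the question; for whole pages, an HTML parser is the safer choice.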
In my case, I had a string containing exactly one <img> tag (and no other markup) that I got from WordPress, and I was trying to get the src attribute so I could run it through timthumb.
To grab the title or the alt instead, you could simply use
$pattern = '/title="([^"]*)"/';
to grab the title or
$pattern = '/alt="([^"]*)"/';
to grab the alt. Sadly, my regex isn't good enough to grab all three (alt/title/src) in one pass, though.
Here is THE solution, in PHP:
Just download QueryPath, and then do as follows:
That's it, you're done!
Here's a PHP function I cobbled together from all of the above info for a similar purpose, namely adjusting the width and height attributes of image tags on the fly. It's a bit clunky, perhaps, but it seems to work dependably:
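A minimal sketch of that kind of function, not the original code. It assumes the goal is to scale existing width and height attributes by a factor, and that their values are double-quoted whole numbers. The name `scale_img_dimensions` and the `$factor` parameter are made up for illustration:

```php
<?php
// Hypothetical helper: scale the width and height attributes of every <img>
// tag in $html by $factor. Tags without those attributes are left unchanged.
function scale_img_dimensions(string $html, float $factor): string
{
    return preg_replace_callback(
        // Assumption: no attribute value contains a literal ">".
        '/<img\b[^>]*>/i',
        function (array $tag) use ($factor): string {
            // Inside one tag, rewrite only double-quoted numeric width/height.
            return preg_replace_callback(
                '/(?<![\w-])(width|height)\s*=\s*"(\d+)"/i',
                function (array $m) use ($factor): string {
                    return $m[1] . '="' . (int) round($m[2] * $factor) . '"';
                },
                $tag[0]
            ) ?? $tag[0]; // fall back to the untouched tag on a regex error
        },
        $html
    ) ?? $html; // fall back to the untouched input on a regex error
}

echo scale_img_dimensions('<img src="/image/fluffybunny.jpg" width="200" height="100" />', 0.5);
```

Doing the replacement in two steps (find each tag, then edit attributes inside it) keeps width and height attributes on other elements, such as tables, untouched.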
Here is my solution for retrieving only the images from the content of any WordPress post or other HTML content.
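A minimal sketch of one way to do this, not the original code. It assumes PHP's bundled DOM extension (DOMDocument) is enabled; the function name `list_images` is made up for illustration. It returns src, title and alt for every `<img>` in a piece of HTML, regardless of attribute order:

```php
<?php
// Hypothetical helper: list src, title and alt for every <img> found in an
// HTML document or fragment (for example, the content of a post).
function list_images(string $html): array
{
    if (trim($html) === '') {
        return []; // loadHTML() rejects empty input
    }

    $doc = new DOMDocument();
    // Real-world HTML is rarely valid, so collect parser warnings quietly
    // instead of printing them, then restore the previous setting.
    $previous = libxml_use_internal_errors(true);
    // Note: without a charset declaration, loadHTML() assumes ISO-8859-1.
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $images = [];
    foreach ($doc->getElementsByTagName('img') as $img) {
        $images[] = [
            'src'   => $img->getAttribute('src'),
            'title' => $img->getAttribute('title'), // '' when missing
            'alt'   => $img->getAttribute('alt'),   // '' when missing
        ];
    }
    return $images;
}

print_r(list_images('<p>Hello</p><img src="/image/fluffybunny.jpg" title="Harvey the bunny" alt="a cute little fluffy bunny" />'));
```

Because the parser handles quoting and attribute order, this tends to hold up better than a regex on full pages.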