I would like to create a page where all images which reside on my website are listed with title and alternative representation.
I already wrote a little program to find and load all HTML files, but now I am stuck on how to extract src, title and alt from this HTML:
<img src="/image/fluffybunny.jpg" title="Harvey the bunny" alt="a cute little fluffy bunny" />
I guess this should be done with some regex, but since the order of the attributes may vary, and I need all of them, I don't really know how to parse this in an elegant way (I could do it the hard way, char by char, but that's painful).
For a single element you can use this minified solution using DOMDocument. It handles both ' and " quotes and also validates the HTML. Best practice is to use existing libraries rather than rolling your own solution with regular expressions.
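A minimal sketch of the DOMDocument approach, using the tag from the question as input. The variable names ($html, $images) are only illustrative, and the libxml error suppression is an assumption about how you would want to treat imperfect real-world markup:

```php
<?php
// Sample tag taken from the question.
$html = '<img src="/image/fluffybunny.jpg" title="Harvey the bunny" alt="a cute little fluffy bunny" />';

$doc = new DOMDocument();
// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$images = [];
foreach ($doc->getElementsByTagName('img') as $img) {
    // getAttribute() returns an empty string when the attribute is missing.
    $images[] = [
        'src'   => $img->getAttribute('src'),
        'title' => $img->getAttribute('title'),
        'alt'   => $img->getAttribute('alt'),
    ];
}

print_r($images);
```

Because the parser does the work, attribute order and quote style should not matter here.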
If it's XHTML, as your example is, you only need SimpleXML.
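Roughly like this, assuming the tag is well-formed XML as in the question (SimpleXML throws an exception on markup that is not). The $xhtml and $attrs names are just placeholders for illustration:

```php
<?php
// Well-formed XHTML tag from the question (note the self-closing slash).
$xhtml = '<img src="/image/fluffybunny.jpg" title="Harvey the bunny" alt="a cute little fluffy bunny" />';

$img = new SimpleXMLElement($xhtml);

$attrs = [];
foreach ($img->attributes() as $name => $value) {
    // Cast each SimpleXMLElement attribute to a plain string.
    $attrs[$name] = (string) $value;
}

print_r($attrs);
```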
Output:
You can write a regexp to get all img tags (<img[^>]*>), and then use a simple explode: $res = explode("\"", $tag). The resulting array alternates between attribute names and attribute values. If you delete the leading <img before the explode, you get an array in the form of name, value, name, value and so on, so the order of the attributes is irrelevant and you only use the ones you need.
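The regexp-plus-explode idea could be sketched as follows. This assumes every attribute value is wrapped in double quotes and contains no > character; tags using single quotes or unquoted values would need a real parser. The wrapping <p> and the variable names are made up for the example:

```php
<?php
$html = '<p><img src="/image/fluffybunny.jpg" title="Harvey the bunny" alt="a cute little fluffy bunny" /></p>';

// Step 1: collect every img tag.
preg_match_all('/<img[^>]*>/i', $html, $matches);

$images = [];
foreach ($matches[0] as $tag) {
    // Step 2: drop the leading "<img" so the first piece is an attribute name.
    $tag = preg_replace('/^<img\s*/i', '', $tag);

    // Step 3: split on double quotes, giving name=, value, name=, value, ..., tail.
    $parts = explode('"', $tag);

    $attrs = [];
    for ($i = 0; $i + 1 < count($parts); $i += 2) {
        // Strip whitespace and the trailing "=" from the attribute name.
        $name = trim($parts[$i], " =\t\n\r");
        $attrs[$name] = $parts[$i + 1];
    }
    $images[] = $attrs;
}

print_r($images);
```

Keying the result by attribute name is what makes the order in the tag irrelevant.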
Maybe this will give you the right answers:
RE this solution:
How do you get the tag and attribute from multiple files/urls?
Doing this didn't work for me: