Can someone help me get this function to work? The function should accept $HTMLstr
-- a whole page of HTML stuffed into a string that already contains a meta description in the form of:
<meta name="description" content="This will be replaced"/>
along with $content
which is the string that should replace "This will be replaced". I thought I was close with this function, but it doesn't quite work.
function HTML_set_meta_description ($HTMLstr, $content) {
$newHTML = preg_replace('/<meta name="description"(.*)"\/>/is', "<meta name=\"description\" content=\"$content\"/>", $HTMLstr);
return ($newHTML);
}
Thanks for any help!
Edit: Here's the working function.
function HTML_set_meta_description ($HTMLstr, $content) {
// assumes meta format is exactly <meta name="description" content="This will be replaced"/>
$newHTML = preg_replace('/<meta name="description" content="(.*)"\/>/i','<meta name="description" content="' . $content . '" />', $HTMLstr);
return ($newHTML);
}
Using DOMDocument is recommended, as another answer already says. However, if you're struggling with the regular expression, I can help with that. You might try this instead:
return preg_replace('/<meta name="description" content="(.*)"\/>/i','<meta name="description" content="Something replaced" />', $HTMLstr);
Unless you know that the <meta> tag will be provided in a consistent format (which is difficult to know unless you actually have control over the HTML), you will have a very tough time constructing a working regex. Take these examples:
<meta content="content" name="description">
<meta content = 'content' name = 'description' />
<meta name= 'description' content ="content"/>
These are all valid, but the regex that would handle them would be very complex. Something like:
@<meta\s+name\s*=\s*('|")description\1\s+content\s*=\s*('|")(.*?)\2\s*/?>@
...and that doesn't even account for the attributes being in a different order. There may have been something else I didn't think of as well.
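To make the attribute-order problem concrete, here is a minimal sketch that runs a pattern along these lines against the three layouts quoted above. The pattern is my assumption of what was intended (with `=` after `content` and optional whitespace before the closing `/>`), and `$variants` and `$matched` are just illustrative names.

```php
<?php
// The three layouts quoted above: attribute order, quoting and spacing all vary.
$variants = [
    '<meta content="content" name="description">',
    "<meta content = 'content' name = 'description' />",
    "<meta name= 'description' content =\"content\"/>",
];

// Assumed pattern: expects name first, then content; \1 and \2 match the opening quote.
$pattern = '@<meta\s+name\s*=\s*(\'|")description\1\s+content\s*=\s*(\'|")(.*?)\2\s*/?>@';

$matched = [];
foreach ($variants as $tag) {
    // preg_match() returns 1 on a match, 0 on no match
    $matched[] = preg_match($pattern, $tag) === 1;
}

// Only the third layout (name before content) matches.
var_export($matched);
```

So even with flexible quoting and spacing, two of the three valid tags slip through, which is the point about attribute order.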
On the other hand, using a parser such as DOMDocument may be very expensive, especially if your HTML is large. If you can depend on a consistent format for the <meta> tag, you want to use .*? instead of .* to capture the content. .*? makes the search reluctant (lazy), so it stops at the first closing quote rather than the last; there are likely to be many other quotes throughout the HTML document.
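A small sketch of the difference, assuming a hypothetical page where two tags ending in `"/>` sit on the same line (the sample HTML and variable names are made up for illustration):

```php
<?php
// Hypothetical input: two self-closing meta tags on a single line.
$html = '<meta name="description" content="Old text"/><meta name="keywords" content="a, b"/>';

$greedy = '/<meta name="description" content="(.*)"\/>/i';   // .* runs to the LAST "/> on the line
$lazy   = '/<meta name="description" content="(.*?)"\/>/i';  // .*? stops at the FIRST "/>
$replacement = '<meta name="description" content="New text"/>';

$greedyResult = preg_replace($greedy, $replacement, $html);
$lazyResult   = preg_replace($lazy, $replacement, $html);

echo $greedyResult, "\n";
// <meta name="description" content="New text"/>
echo $lazyResult, "\n";
// <meta name="description" content="New text"/><meta name="keywords" content="a, b"/>
```

The greedy version swallows the keywords tag along with the description; the lazy one leaves it alone.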
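function HTML_set_meta_description ($HTMLstr, $content) {
    $dom = new DOMDocument;
    // Parse the page; attribute order, quoting and spacing no longer matter
    $dom->loadHTML($HTMLstr);
    foreach ($dom->getElementsByTagName("meta") as $tag) {
        // Compare the whole name case-insensitively, so names that merely contain "description" are skipped
        if (strcasecmp($tag->getAttribute("name"), "description") === 0) {
            $tag->setAttribute("content", $content);
        }
    }
    return $dom->saveHTML();
}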
I know you asked about preg_replace and I'm late to answer, but take a look at this. Is this what you are looking for?
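<?php
// Builds the meta description tag; falls back to a placeholder when no content is given.
function meta_desc( $content = null ){
    $desc = 'This will be replaced';
    if( $content ){
        $desc = $content;
    }
    // Keep the tag on one line, with no stray spaces inside the attribute value
    return '<meta name="description" content="' . $desc . '"/>';
}
?>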
Trust me, it's faster than that. I think you should use this function.