I've been getting an error message for the following piece of code (I'm trying to get the content inside the 'article' tags on a certain web page):
function getTextFromLink($url) {
    $html = new DOMDocument();
    $html->loadHTML($url);
    $text = $html->getElementsByTagName('article')->item(0)->textContent;
    return $text;
}
It says that I'm trying to get the property of a non-object on the line with
$text = $html->getElementsByTagName('article')->item(0)->textContent;
I'm fairly new to PHP and the DOM; what am I missing here?
You have two problems in your code:
The obvious problem is that $html->getElementsByTagName('article')->item(0) is not an object. Specifically, it is null, since the HTML you're parsing doesn't actually contain any article elements. You could've figured this out yourself by following Devon's advice and viewing the value of $html->getElementsByTagName('article')->item(0) using var_dump().
Now, why doesn't your HTML contain any article elements? Well, the real problem turns out to be that the loadHTML() method will load HTML from a string and parse it. That is to say, when you call $html->loadHTML($url), PHP will parse the contents of the string variable $url as HTML code, and give you a DOMDocument representing the result. Given that you named the variable $url, I'm pretty sure that's not what you want.
What you actually want to use instead is probably loadHTMLFile(), which actually loads HTML code from a named file (or, apparently, URL), rather than from a PHP string.
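Putting both fixes together, here is a minimal sketch of how the function might look. It keeps the name getTextFromLink() from the question. The nullable return type and the libxml_use_internal_errors() handling are my own additions, not something the question asked for. I'm also assuming that loading from a URL depends on your PHP configuration allowing URL streams (typically the allow_url_fopen setting).

```php
<?php
// Sketch: return the text of the first <article> element found at $url
// (a local file path or a URL), or null if it cannot be loaded or the
// page has no <article> element.
function getTextFromLink(string $url): ?string
{
    $html = new DOMDocument();

    // Real-world pages often contain markup that libxml complains about
    // (including HTML5 tags such as <article>). Collect those errors
    // internally instead of emitting PHP warnings, then restore the
    // previous setting.
    $previous = libxml_use_internal_errors(true);
    $loaded = $html->loadHTMLFile($url);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($loaded === false) {
        return null; // the file or URL could not be read or parsed
    }

    // item(0) returns null when nothing matches, so check before
    // reading ->textContent to avoid the "non-object" error.
    $article = $html->getElementsByTagName('article')->item(0);
    if ($article === null) {
        return null;
    }

    return $article->textContent;
}
```

With this version, a page without any article elements gives you null instead of an error, so the caller can decide what to do, for example by checking if (getTextFromLink($url) === null) before using the result.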