I'm importing an RSS feed with SimpleXMLElement in PHP. I'm having trouble with the title and description. For some reason, the website I get the feed from puts the title and description in <![CDATA[...]]>:
<item>
<title><![CDATA[...title...]]></title>
<link>...url...</link>
<description><![CDATA[...title...]]></description>
<pubDate>...date...</pubDate>
<guid>...link...</guid>
</item>
When I do a var_dump() on the SimpleXMLElement, I get (for this part):
[2]=>
object(SimpleXMLElement)#5 (5) {
["title"]=>
object(SimpleXMLElement)#18 (0) {
}
["link"]=>
string(95) "...link..."
["description"]=>
object(SimpleXMLElement)#19 (0) {
}
["pubDate"]=>
string(31) "...date..."
["guid"]=>
string(48) "...link..."
}
How can I get the value in <![CDATA[...]]> to read the title and description from the feed?
SimpleXML reads CDATA nodes absolutely fine. The only problem you're having is that print_r, var_dump, and similar functions don't give an accurate representation of SimpleXML objects, because they are not implemented fully in PHP.
If you run echo $myNode->description you will see the content of the CDATA section just fine. The reason is that when you ask for a SimpleXMLElement to be converted to a string, it automatically combines all the text and CDATA content for you - but until you do, it remembers the distinction.
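As a minimal sketch of that behaviour, the snippet below parses a made-up item shaped like the one in the question and echoes its CDATA-wrapped elements. The feed text, the URL, and the variable names are placeholders for illustration, not the asker's real data.

```php
<?php
// Hypothetical item mirroring the structure shown in the question.
$xml = <<<XML
<item>
  <title><![CDATA[Example title]]></title>
  <link>http://example.com/post</link>
  <description><![CDATA[Example <b>description</b>]]></description>
</item>
XML;

$item = new SimpleXMLElement($xml);

// echo triggers string conversion, which merges the element's
// text and CDATA content into one string.
echo $item->title, "\n";        // Example title
echo $item->description, "\n";  // Example <b>description</b>
```

The markup inside the description comes back as literal text, because CDATA content is character data and is not parsed as child elements.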
As a general case, to extract the string content of any element or attribute in SimpleXML, cast to string with (string)$myNode. This also prevents other issues, such as functions complaining about getting an object when they were expecting a string, or failure to serialize when saving to a session.
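One way to apply the cast across a whole feed is sketched below. It assumes the usual rss/channel/item layout, which the question does not show in full, and the feed contents and array keys are invented for the example.

```php
<?php
// Hypothetical feed; the rss/channel wrapper is an assumption,
// the item element names follow the question.
$xml = <<<XML
<rss><channel>
  <item>
    <title><![CDATA[First title]]></title>
    <description><![CDATA[First description]]></description>
  </item>
  <item>
    <title><![CDATA[Second title]]></title>
    <description><![CDATA[Second description]]></description>
  </item>
</channel></rss>
XML;

$feed = new SimpleXMLElement($xml);

$items = [];
foreach ($feed->channel->item as $item) {
    // Cast each node so the array holds plain strings, not SimpleXMLElement objects.
    $items[] = [
        'title'       => (string) $item->title,
        'description' => (string) $item->description,
    ];
}

// Plain strings survive a serialize/unserialize round trip,
// which is what storing them in a session relies on.
$restored = unserialize(serialize($items));
var_dump($restored === $items); // bool(true)
```

Casting at the point of extraction keeps SimpleXMLElement objects from leaking into the rest of the code, where string functions or session storage would trip over them.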
See also my previous answer at https://stackoverflow.com/a/13830559/157957