SimpleXML keeps returning content on CDATA element

Posted 2020-05-08 17:46

Question:

This is yet another question about CDATA returning "content". I've seen many answers and tried them all, but my output is still only the literal string "content".

In more detail:

I have an XML file (containing many NewsItem elements):

<NewsML>
<NewsItem>    
    <NewsComponent>               
        <ContentItem>                      
            <DataContent>                
                <body>                    
                    <body.content>                        
                        <![CDATA[<p>This is what I am trying to retrieve</p>]]>
                    </body.content>
                </body>
            </DataContent>
        </ContentItem>
    </NewsComponent>
</NewsItem>
</NewsML>

I am trying to get the content of body.content.

Here is my code:

$xml = simplexml_load_file('path/to/my/xml.xml',null,LIBXML_NOCDATA);

if(count($xml->children()) > 0){
    foreach($xml->children() as $element){
        $description = (string)$element->NewsComponent->ContentItem->DataContent->body->body.content;
        echo $description;
    }
}
echo '<pre>';
print_r($xml);
echo '</pre>';

My echo returns: content

even though the content does appear in the print_r output of my XML, as shown here:

SimpleXMLElement Object
(
    [NewsItem] => Array
    (
        [0] => SimpleXMLElement Object
            (

                [NewsComponent] => SimpleXMLElement Object
                    (                            

                        [ContentItem] => Array
                            (
                                [0] => SimpleXMLElement Object
                                    (

                                        [DataContent] => SimpleXMLElement Object
                                            (
                                                [body] => SimpleXMLElement Object
                                                    (
                                                        [body.content] => This is what I am trying to retrieve

                                                    )

                                            )

                                    )

                            )

                    )

            )

I tried both with and without the (string) cast on the element.

I also tried using

$xml = simplexml_load_file('path/to/my/xml.xml',null,LIBXML_NOCDATA);
vs
$xml = simplexml_load_file('path/to/my/xml.xml',"SimpleXMLElement",LIBXML_NOCDATA);
vs
$xml = simplexml_load_file('path/to/my/xml.xml');

Answer 1:

For element names that are not valid PHP identifiers (such as body.content), you must use the brace notation with a quoted string:

$element->NewsComponent->ContentItem->DataContent->body->{'body.content'};
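
As a minimal sketch of that notation in context, the snippet below loads a copy of the question's sample XML from an inline string (an assumption made here so the example is self-contained; the original reads from a file) and collects the body.content text of each NewsItem. The variable names $source and $descriptions are illustrative only.

```php
<?php
// Inline copy of the question's sample document (assumption: the real
// file has the same shape, with a closing </NewsML> root tag).
$source = <<<'XML'
<NewsML>
<NewsItem>
    <NewsComponent>
        <ContentItem>
            <DataContent>
                <body>
                    <body.content>
                        <![CDATA[<p>This is what I am trying to retrieve</p>]]>
                    </body.content>
                </body>
            </DataContent>
        </ContentItem>
    </NewsComponent>
</NewsItem>
</NewsML>
XML;

$xml = simplexml_load_string($source, 'SimpleXMLElement', LIBXML_NOCDATA);

$descriptions = [];
foreach ($xml->NewsItem as $item) {
    // Braces plus a quoted string let PHP treat "body.content" as one name.
    $node = $item->NewsComponent->ContentItem->DataContent->body->{'body.content'};
    // trim() drops the indentation that surrounds the CDATA section.
    $descriptions[] = trim((string) $node);
}

echo $descriptions[0], PHP_EOL;
```

The trim() call is optional; without it the string keeps the line breaks and spaces that surround the CDATA section in the file.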


Answer 2:

I think your example returns 'content' because you are concatenating an element that does not exist

$element->NewsComponent->ContentItem->DataContent->body->body

with the string 'content'. In PHP the dot is the concatenation operator, so content is read as a bare constant name. Since no constant with that name exists, older PHP versions raise a notice or warning and assume you meant the string 'content'; PHP 8 throws an Error instead.

So you need another way to select an element whose name contains a dot.

(This problem does not appear to be related to CDATA.)
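
To illustrate the explanation above, here is a small sketch using a reduced, hypothetical document with only the body and body.content elements. The word 'content' is quoted so the snippet also runs on PHP 8, where a bare content would throw an Error.

```php
<?php
// Reduced, hypothetical document: just <body> with a <body.content> child.
$body = simplexml_load_string('<body><body.content>text</body.content></body>');

// <body> has no child named <body>, so this yields an empty element
// that casts to an empty string.
$missing = (string) $body->body;

// What the original expression effectively computes: '' . 'content'.
$asParsed = $missing . 'content';

// What was intended: the element literally named "body.content".
$intended = (string) $body->{'body.content'};

var_dump($missing, $asParsed, $intended);
```

The second value is the literal string 'content' regardless of what the XML contains, which matches the output reported in the question.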