PHP DOMDocument : How to parse xml/rss Tags with C

I have the below RSS to parse, something like:

<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:x-wr="http://www.w3.org/2002/12/cal/prod/Apple_Comp_628d9d8459c556fa#" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:x-example="http://www.example.com/rss/x-example" xmlns:x-microsoft="http://schemas.microsoft.com/x-microsoft" xmlns:xCal="urn:ietf:params:xml:ns:xcal" version="2.0">
    <channel>
        <item>
            <title>About Apples</title>
            <author>David K. Lowie</title>
            <x-trumba:customfield name="description">This is the description about apples</xCal:customfield>
            <x-trumba:customfield name="category">Fruits,Food,Apple</xCal:customfield>
        </item>
        <item>
            <title>About Oranges</title>
            <author>Marry L. Jones</title>
            <x-trumba:customfield name="description">This is the description about oranges</xCal:customfield>
            <x-trumba:customfield name="category">Fruits,Food,Orange</xCal:customfield>
        </item>
    </channel>
</rss>

In PHP, I only know how to read the first two nodes, like this:

$rss = new DOMDocument();
$rss->load( "http://www.example.com/books.rss" );

foreach( $rss->getElementsByTagName("item") as $node ) {
    echo $node->getElementsByTagName("title")->item(0)->nodeValue;
    echo $node->getElementsByTagName("author")->item(0)->nodeValue;
}

But these are the nodes I have a problem with:

<x-trumba:customfield name="description">This is the description about apples</xCal:customfield>
<x-trumba:customfield name="category">Fruits,Food,Apple</xCal:customfield>

Please help:

How can I parse the last nodes, such as <x-trumba:customfield name="description">?

(I can't change the RSS source since it's not under my control.)

Please kindly help.

Your XML is not well-formed: the 'x-trumba' prefix is not defined, the closing tags of the customfield elements use the 'xCal' prefix (which refers to urn:ietf:params:xml:ns:xcal), and the 'author' elements are closed with </title>.

So replacing the prefix of the opening tags with 'xCal' and fixing the closing tags for 'author' makes the XML valid.
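
Since the feed itself cannot be changed, the same two repairs could be applied to the raw text before it is parsed. The following is only a minimal sketch: repairFeed() is a hypothetical helper, the sample string is a shortened copy of the feed from the question, and the string replacements assume the feed is always broken in exactly this way.

```php
<?php
// Hypothetical helper: applies the two repairs described above to the raw feed text.
function repairFeed(string $xml): string
{
    // Use the declared 'xCal' prefix in place of the undeclared 'x-trumba' prefix.
    $xml = str_replace('<x-trumba:customfield', '<xCal:customfield', $xml);
    // Close each <author> element with </author> instead of </title>.
    return preg_replace('~<author>([^<]*)</title>~', '<author>$1</author>', $xml);
}

// Shortened copy of the broken feed, kept only to make the sketch self-contained.
$broken = <<<'XML'
<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:xCal="urn:ietf:params:xml:ns:xcal" version="2.0">
    <channel>
        <item>
            <title>About Apples</title>
            <author>David K. Lowie</title>
            <x-trumba:customfield name="description">This is the description about apples</xCal:customfield>
        </item>
    </channel>
</rss>
XML;

$rss = new DOMDocument();
// loadXML() returns false when the document cannot be parsed.
$loaded = $rss->loadXML(repairFeed($broken));
$author = $rss->getElementsByTagName('author')->item(0)->nodeValue;
```

For the real feed, the raw text would first have to be fetched as a string (for example with file_get_contents()) and passed through the helper instead of calling load() directly.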

Then it is possible to register the xCalendar namespace and use Xpath to fetch the custom field contents:

$rss = new DOMDocument();
$rss->load( "http://www.example.com/books.rss" );
$xpath = new DOMXPath($rss);
// Map a prefix of our own choosing ('x') to the xCal namespace URI.
$xpath->registerNamespace('x', 'urn:ietf:params:xml:ns:xcal');

foreach( $xpath->evaluate("//item") as $item ) {
    // string() casts the first matching node to its text content.
    echo $xpath->evaluate('string(title)', $item), "\n";
    echo $xpath->evaluate('string(x:customfield[@name="description"])', $item), "\n";
}

Output:

About Apples
This is the description about apples
About Oranges
This is the description about oranges

The XPath expression uses a condition ([@name="description"]) to filter the customfield element nodes.
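
The same condition works for the other custom field by changing the attribute value to "category". Here is a small sketch of that, run against an inline copy of the corrected feed (with 'xCal' prefixes and proper closing tags) so that it does not depend on the remote URL. Splitting the value on commas is an assumption based on the sample data, not something the feed format guarantees.

```php
<?php
// Corrected copy of the feed from the question, inlined so the sketch is self-contained.
$xml = <<<'XML'
<?xml version="1.0" encoding="utf-8"?>
<rss xmlns:xCal="urn:ietf:params:xml:ns:xcal" version="2.0">
    <channel>
        <item>
            <title>About Apples</title>
            <author>David K. Lowie</author>
            <xCal:customfield name="description">This is the description about apples</xCal:customfield>
            <xCal:customfield name="category">Fruits,Food,Apple</xCal:customfield>
        </item>
        <item>
            <title>About Oranges</title>
            <author>Marry L. Jones</author>
            <xCal:customfield name="description">This is the description about oranges</xCal:customfield>
            <xCal:customfield name="category">Fruits,Food,Orange</xCal:customfield>
        </item>
    </channel>
</rss>
XML;

$rss = new DOMDocument();
$rss->loadXML($xml);
$xpath = new DOMXPath($rss);
// The prefix registered here only has to match the namespace URI, not the prefix used in the feed.
$xpath->registerNamespace('x', 'urn:ietf:params:xml:ns:xcal');

$categories = [];
foreach ($xpath->evaluate('//item') as $item) {
    $title = $xpath->evaluate('string(title)', $item);
    // Select the customfield whose name attribute is "category".
    $raw = $xpath->evaluate('string(x:customfield[@name="category"])', $item);
    // Assumption: categories are comma-separated, as in the sample data.
    $categories[$title] = array_map('trim', explode(',', $raw));
}

print_r($categories);
```

Keying the result by title is only for illustration; if titles can repeat in the real feed, a plain list of items would be safer.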
