PHP DOMDocument: how to get an element?

I am trying to read a website's content, but I have a problem. I want to get elements such as images and links, and I want the elements themselves, not just their content. For instance, for a link I want the entire element, not only its text.

How can I do this?

<?php

    $ch = curl_init();

    curl_setopt($ch, CURLOPT_URL, "http://www.link.com");
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);

    $output = curl_exec($ch);

    $dom = new DOMDocument;
    @$dom->loadHTML($output);

    $items = $dom->getElementsByTagName('a');

    for($i = 0; $i < $items->length; $i++) {
        echo $items->item($i)->nodeValue . "<br />";
    }

    curl_close($ch);
?>

Tags: php html parsing curl domdocument

2 Answers

爷的心禁止访问

Reply #2 · 2020-04-19 07:02

I'm assuming you just copy-pasted some example code and didn't bother trying to learn how it actually works...

Anyway, the ->nodeValue part takes the element and returns the text content (because the element has a single text node child - if it had anything else, I don't know what nodeValue would give).

So, just remove the ->nodeValue and you have your element.
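
To make the difference concrete, here is a minimal sketch. The inline HTML string and the variable names are assumptions for illustration, standing in for the cURL output in the question. Note that a DOMElement cannot be echoed directly; to print its markup, one option is to pass it to saveHTML(), whose node argument requires PHP 5.3.6 or later.

```php
<?php
// Hypothetical sample markup standing in for the fetched page.
$html = '<html><body><p><a href="http://example.org"><b>link</b> text</a></p></body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);

$links = array();
foreach ($dom->getElementsByTagName('a') as $a) {
    // $a is the DOMElement itself, so its attributes are directly available.
    $links[] = array(
        'href'   => $a->getAttribute('href'),
        // nodeValue is the concatenated text of all descendant text nodes.
        'text'   => $a->nodeValue,
        // saveHTML($node) serializes just this element (PHP >= 5.3.6).
        'markup' => $dom->saveHTML($a),
    );
}

print_r($links);
```

Keeping the element object around, rather than its nodeValue, is what gives access to attributes such as href or src.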


对你真心纯属浪费

Reply #3 · 2020-04-19 07:04

You appear to be asking for the serialized HTML of a DOMElement, e.g. a string containing <a href="http://example.org">link text</a>? (Please make your question clearer.)

$url = 'http://example.com';
$dom = new DOMDocument();
$dom->loadHTMLFile($url);

$anchors = $dom->getElementsByTagName('a');

foreach ($anchors as $a) {
    if (version_compare(PHP_VERSION, '5.3.6', '>=')) {
        // Best solution, but only works with PHP >= 5.3.6
        $htmlstring = $dom->saveHTML($a);
    } else {
        // Otherwise you need to serialize to XML and then fix the self-closing elements
        $htmlstring = saveHTMLFragment($a);
    }
    echo $htmlstring, "\n";
}


function saveHTMLFragment(DOMElement $e) {
    $selfclosingelements = array('></area>', '></base>', '></basefont>',
        '></br>', '></col>', '></frame>', '></hr>', '></img>', '></input>',
        '></isindex>', '></link>', '></meta>', '></param>', '></source>',
    );
    // This is not 100% reliable because it may output namespace declarations.
    // But otherwise it is extra-paranoid to work down to at least PHP 5.1
    $html = $e->ownerDocument->saveXML($e, LIBXML_NOEMPTYTAG);
    // in case any empty elements are expanded, collapse them again:
    $html = str_ireplace($selfclosingelements, '>', $html);
    return $html;
}

However, note that what you are doing is dangerous because it could potentially mix encodings. It is better to have your output as another DOMDocument and use importNode() to copy the nodes you want. Alternatively, use an XSL stylesheet.
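
The importNode() approach mentioned above could look roughly like this. It is a sketch under assumptions: the inline source markup, the wrapping div, and the variable names are hypothetical and only there to keep the example self-contained.

```php
<?php
// Hypothetical source markup; in practice this would be the fetched page.
$source = new DOMDocument();
$source->loadHTML('<html><body><a href="http://example.org">one</a><img src="a.png"></body></html>');

// Separate output document with an explicit encoding.
$out = new DOMDocument('1.0', 'UTF-8');
$list = $out->appendChild($out->createElement('div'));

foreach ($source->getElementsByTagName('a') as $a) {
    // importNode() with true makes a deep copy that belongs to $out.
    $list->appendChild($out->importNode($a, true));
}

// Serialize only the container holding the copied nodes.
$result = $out->saveHTML($list);
echo $result, "\n";
```

Because every copied node ends up in one output document, serialization happens once and in one encoding, instead of concatenating strings produced from different sources.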
