I am trying to read a website's content, but I have a problem: I want to get elements such as images and links, but I want the elements themselves, not just their text content. For instance, for a link I want to get the entire element, markup and all.
How can I do this?
<?php
// Fetch the page with cURL
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://www.link.com");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$output = curl_exec($ch);
curl_close($ch);

// Parse the HTML; @ suppresses warnings from malformed markup
$dom = new DOMDocument;
@$dom->loadHTML($output);

// This prints only the text content of each <a>, not the element itself
$items = $dom->getElementsByTagName('a');
for ($i = 0; $i < $items->length; $i++) {
    echo $items->item($i)->nodeValue . "<br />";
}
?>
I'm assuming you just copy-pasted some example code and didn't bother trying to learn how it actually works...
Anyway, the ->nodeValue part takes the element and returns its text content (because the element has a single text node child; if it had anything else, I don't know what nodeValue would give). So just remove ->nodeValue and you have your element, as in the sketch below.
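For example, here is a minimal sketch that prints each element's full markup. It assumes PHP 5.3.6 or later, where DOMDocument::saveHTML() accepts an optional node argument, and reuses the placeholder URL from the question:

<?php
// Fetch the page, as in the question
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://www.link.com");
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
$output = curl_exec($ch);
curl_close($ch);

$dom = new DOMDocument;
@$dom->loadHTML($output);

foreach ($dom->getElementsByTagName('a') as $a) {
    // saveHTML($a) serializes just this element, e.g. '<a href="...">link text</a>';
    // htmlspecialchars() makes the tags visible when the output is viewed in a browser
    echo htmlspecialchars($dom->saveHTML($a)) . "<br />";
}
?>

On PHP versions older than 5.3.6, $dom->saveXML($a) should behave similarly, though it produces XML-style rather than HTML-style serialization.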
You appear to be asking for the serialized HTML of a DOMElement? E.g. you want a string containing
<a href="http://example.org">link text</a>
? (Please make your question clearer.) However, note that what you are doing is dangerous, because it could potentially mix encodings. It is better to have your output as another DOMDocument and use importNode() to copy the nodes you want. Alternatively, use an XSL stylesheet.
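A rough sketch of that importNode() approach (the $output sample string and the <div> wrapper are just for illustration; in practice $output would be the HTML fetched with cURL above):

<?php
// Sample input standing in for the fetched HTML
$output = '<p><a href="http://example.org">link text</a> and <a href="http://example.net">another</a></p>';

$source = new DOMDocument;
@$source->loadHTML($output);

// Build the output as a separate document with one known encoding
$target = new DOMDocument('1.0', 'utf-8');
$root = $target->appendChild($target->createElement('div'));

foreach ($source->getElementsByTagName('a') as $a) {
    // The second argument (true) imports the element together with
    // all of its descendants, not just the empty start tag
    $root->appendChild($target->importNode($a, true));
}

echo $target->saveHTML();
?>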