PHP DOMXPath query using the innerHTML/nodeValue o

Posted 2019-07-12 18:20

Question:

Can you please help me with the correct syntax to use when you want to check the innerHTML/nodeValue of an element?

I have no problem with the name; however, the age is inside a plain div element. What is the correct syntax to use in place of "NOT SURE WHAT TO PUT HERE" below?

$html is a page from the internet

The person's name is in a span like:

<span class="fullname">John Smith</span>

The person's age is in a div like:

<div>Age: 28</div>

I have the following PHP:

<?php
$dom = new DOMDocument();
@$dom->loadHTML($html);
$finder = new DOMXPath($dom);

//Full Name
$findName = "fullname";
$queryName = $finder->query("//span[contains(@class, '$findName')]");
$name = $queryName->item(0)->nodeValue;

//Age
$findAge = "Age: ";
$queryAge = $finder->query("//div[NOT SURE WHAT TO PUT HERE]");
$age = substr($queryAge->item(0)->nodeValue, 5);
?>

Answer 1:

Try

$queryAge = $finder->query("//div[starts-with(., '$findAge')]");

I've had limited success with starts-with() because of whitespace at the start of the element's text, so you may have to resort to:

$queryAge = $finder->query("//div[contains(., '$findAge')]");
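If whitespace is what defeats starts-with(), the XPath 1.0 function normalize-space() may be worth trying before falling back to contains(). It trims leading and trailing whitespace and collapses internal runs to a single space before the comparison. The following is only a sketch: the sample markup (an age div with a leading newline and spaces) is an assumption, not the asker's real page.

```php
<?php
// Sample markup is an assumption: the age div carries leading whitespace.
$html = '<html><body><span class="fullname">John Smith</span>'
      . "<div>\n   Age: 28\n</div></body></html>";

$dom = new DOMDocument();
@$dom->loadHTML($html);
$finder = new DOMXPath($dom);

$findAge = "Age: ";

// Plain starts-with() sees the leading newline and spaces, so it does not match here.
$plain = $finder->query("//div[starts-with(., '$findAge')]");

// normalize-space() trims and collapses whitespace before the comparison.
$normalized = $finder->query("//div[starts-with(normalize-space(.), '$findAge')]");

$age = null;
if ($normalized->length > 0) {
    // nodeValue still contains the raw whitespace, so trim it before cutting the label off.
    $text = trim($normalized->item(0)->nodeValue);
    $age = substr($text, strlen($findAge));
}
```

Checking ->length before calling item(0) also avoids an error when the page has no matching div.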

If there's a chance of false positives (i.e., other divs with "Age: " in them), you might be able to avoid that by using a more specific path (if known), e.g.:

$queryAge = $finder->query("//div[@id='something']//div[contains(., '$findAge')]");
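One source of false positives to keep in mind: in a predicate, "." is the string value of the div including all of its descendants, so any div that wraps the age div matches contains(., ...) as well. Testing the div's own text nodes with text() is one way to narrow this down. This is a sketch under assumptions: the id "profile", the class "row" and the nesting are hypothetical, made up for illustration.

```php
<?php
// Hypothetical markup: the id "profile", the class "row" and the nesting are assumptions.
$html = '<html><body><div id="profile"><div class="row">'
      . '<span class="fullname">John Smith</span><div>Age: 28</div>'
      . '</div></div></body></html>';

$dom = new DOMDocument();
@$dom->loadHTML($html);
$finder = new DOMXPath($dom);

$findAge = "Age: ";

// "." covers the text of all descendants, so the wrapping divs match too.
$broad = $finder->query("//div[contains(., '$findAge')]");

// text() looks only at the div's own text nodes, so wrappers are skipped.
$narrow = $finder->query("//div[text()[contains(., '$findAge')]]");

$age = null;
if ($narrow->length > 0) {
    $age = substr($narrow->item(0)->nodeValue, strlen($findAge));
}
```

With the broad query, item(0) is the outermost wrapper in document order, so its nodeValue starts with the name rather than "Age: " and the substr() call returns the wrong text.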