I am trying to filter HTML tables by matching their id attribute against a regex. What am I doing wrong? The code I am trying to implement:
$this->xpath = new DOMXPath($this->dom);
$this->xpath->registerNamespace("php", "http://php.net/xpath");
$this->xpath->registerPHPFunctions();
foreach($this->xpath->query("//table[php:function('preg_match', '/post\d+/', @id)]") as $key => $row)
{
}
The error I get: preg_match expects second param to be a string, array given.
An attribute is still a full node according to DOM (it has a namespace etc.), so @id reaches PHP as a node-set rather than a string. Use:
//table[php:function('preg_match', '/post\d+/', string(@id))]
Now, we need a boolean return, because XPath treats a numeric predicate result as a position test rather than as true/false, so:
function booleanPregMatch($match, $string) {
    // preg_match returns 1 on a match, 0 on no match; turn that into a real boolean.
    return preg_match($match, $string) > 0;
}
$xpath->registerPHPFunctions();
foreach ($xpath->query("//table[@id and php:function('booleanPregMatch', '/post\d+/', string(@id))]") as $key => $row) {
    echo $row->ownerDocument->saveXML($row);
}
BTW: for more complex issues, you can of course sneakily check what's happening with this:
//table[php:function('var_dump',@id)]
It's a shame we don't have XPath 2.0 functions available, but if you can handle this requirement with a more unreliable starts-with, I'd always prefer that over importing PHP functions.
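A minimal sketch of the starts-with route, using made-up sample markup and ids. The first query is the loose check. The second is one possible way to tighten it in plain XPath 1.0 with translate(). Note that it is anchored (the whole id must be "post" plus digits), which is stricter than the unanchored /post\d+/ from the question.

```php
<?php
// Hypothetical sample markup; the ids are invented for illustration.
$html = '<html><body>'
      . '<table id="post12"><tr><td>a</td></tr></table>'
      . '<table id="postscript"><tr><td>b</td></tr></table>'
      . '<table id="other"><tr><td>c</td></tr></table>'
      . '</body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);
$xpath = new DOMXPath($dom);

// Loose check: any id beginning with "post" (this also matches "postscript").
$loose = [];
foreach ($xpath->query("//table[starts-with(@id, 'post')]") as $table) {
    $loose[] = $table->getAttribute('id');
}

// Stricter check: "post" followed by at least one character, all of them digits.
// translate() deletes every digit; an empty remainder means digits only.
$strictQuery = "//table[starts-with(@id, 'post')"
             . " and string-length(@id) > 4"
             . " and translate(substring(@id, 5), '0123456789', '') = '']";
$strict = [];
foreach ($xpath->query($strictQuery) as $table) {
    $strict[] = $table->getAttribute('id');
}

print_r($loose);
print_r($strict);
```

No PHP functions need to be registered for either query, which is the main appeal of this variant.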
What am I doing wrong?
The XPath expression @id (the second parameter of preg_match) returns an array, but preg_match expects a string. Convert it to a string first: string(@id).
In addition, you need to compare the output to 1, because preg_match returns 1 when a match is found:
foreach($xpath->query("//table[@id and 1 = php:function('preg_match', '/post\d+/', string(@id))]") as $key => $row)
{
var_dump($key, $row, $row->ownerDocument->saveXml($row));
}
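For reference, here is a self-contained sketch of that query running against a small sample document. The markup and ids are invented for illustration, and passing 'preg_match' to registerPHPFunctions is an optional restriction that exposes only that one function.

```php
<?php
// Hypothetical sample markup; only the ids matter here.
$html = '<html><body>'
      . '<table id="post1"><tr><td>first</td></tr></table>'
      . '<table id="sidebar"><tr><td>second</td></tr></table>'
      . '<table><tr><td>no id</td></tr></table>'
      . '<table id="post42"><tr><td>third</td></tr></table>'
      . '</body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);

$xpath = new DOMXPath($dom);
$xpath->registerNamespace('php', 'http://php.net/xpath');
// Expose only the one PHP function the expression needs.
$xpath->registerPHPFunctions('preg_match');

// string(@id) hands preg_match a string; "1 = ..." turns its int result into a boolean.
$query = "//table[@id and 1 = php:function('preg_match', '/post\\d+/', string(@id))]";

$ids = [];
foreach ($xpath->query($query) as $table) {
    $ids[] = $table->getAttribute('id');
}
print_r($ids);
```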
Explanation: what happens here?
An XPath expression will by default return a node-list (more precisely, a node-set). If you pass such an expression to a PHP function, the set is represented as an array. You can easily test that by using var_dump:
$xpath->query("php:function('var_dump', //table)");
array(1) {
[0]=>
object(DOMElement)#3 (0) {
}
}
The same goes for the XPath expression @id in the context of each table element:
$xpath->query("//table[php:function('var_dump', @id)]");
array(1) {
[0]=>
object(DOMAttr)#3 (0) {
}
}
You can change that into a string-typed result by making use of the XPath string function:
A node-set is converted to a string by returning the string-value of the node in the node-set that is first in document order. If the node-set is empty, an empty string is returned.
$xpath->query("//table[php:function('var_dump', string(@id))]");
string(4) "test"
(the table has id="test")
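Since a node-set arrives in PHP as an array of DOM nodes, another option is to accept that array in your own callback instead of converting inside the XPath expression. The following is only a sketch: idMatches is a hypothetical helper name and the sample markup is invented.

```php
<?php
// Hypothetical helper: receives the node-set for @id as an array of DOMAttr nodes.
function idMatches(string $pattern, array $nodes): bool
{
    // An empty array means the table has no id attribute at all.
    return isset($nodes[0]) && preg_match($pattern, $nodes[0]->nodeValue) === 1;
}

// Hypothetical sample markup.
$html = '<html><body>'
      . '<table id="post7"><tr><td>a</td></tr></table>'
      . '<table id="test"><tr><td>b</td></tr></table>'
      . '<table><tr><td>c</td></tr></table>'
      . '</body></html>';

$dom = new DOMDocument();
$dom->loadHTML($html);

$xpath = new DOMXPath($dom);
$xpath->registerNamespace('php', 'http://php.net/xpath');
$xpath->registerPHPFunctions('idMatches');

$ids = [];
// @id is passed unconverted, so the callback sees an array.
foreach ($xpath->query("//table[php:function('idMatches', '/post\\d+/', @id)]") as $table) {
    $ids[] = $table->getAttribute('id');
}
print_r($ids);
```

Because the callback returns a real boolean and handles the empty array itself, the expression needs neither the "@id and" guard nor the comparison to 1.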