Find and append hrefs of a certain class

Posted 2019-07-15 19:24

Question:

I've been searching for a solution to this but haven't found quite the right thing yet.

The situation is this: I need to find all links on a page with a given class (say class="tracker") and then append query string values to the end of their hrefs, so that when a user loads the page, those links are updated with some dynamic information.

I know how this can be done with JavaScript, but I'd really like to adapt it to run server-side instead. I'm quite new to PHP. From the looks of it, XPath might be what I'm looking for, but I haven't found a suitable example to get started with. Is there anything like GetElementByClass?

Any help would be greatly appreciated!

Shadowise

Answer 1:

Is there anything like GetElementByClass?

Here is an implementation I whipped up...

function getElementsByClassName(DOMDocument $domNode, $className) {
    $elements = $domNode->getElementsByTagName('*');
    $matches = array();
    foreach($elements as $element) {
        if ( ! $element->hasAttribute('class')) {
            continue;
        }
        $classes = preg_split('/\s+/', $element->getAttribute('class'));
        if ( ! in_array($className, $classes)) {
            continue;
        }
        $matches[] = $element;
    }
    return $matches;
}
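
As a usage sketch, the helper could be called like this. The function is repeated in compact form only so that the snippet runs on its own; the sample markup and the $hrefs variable are made up for illustration.

```php
<?php
// Compact copy of the helper above, repeated so this snippet is self-contained.
function getElementsByClassName(DOMDocument $domNode, $className) {
    $matches = array();
    foreach ($domNode->getElementsByTagName('*') as $element) {
        // getAttribute() returns '' when the attribute is missing.
        $classes = preg_split('/\s+/', trim($element->getAttribute('class')));
        if (in_array($className, $classes, true)) {
            $matches[] = $element;
        }
    }
    return $matches;
}

$dom = new DOMDocument;
// Illustrative markup: only the first link carries the exact class "tracker".
$dom->loadHTML('<body><a href="/a" class="tracker big">a</a><a href="/b" class="trackers">b</a><p class="tracker">p</p></body>');

$hrefs = array();
foreach (getElementsByClassName($dom, 'tracker') as $element) {
    // The helper matches any element, so keep only anchors here.
    if ($element->tagName === 'a') {
        $hrefs[] = $element->getAttribute('href');
    }
}
print_r($hrefs);
```

Because the helper matches elements of any tag, the caller filters on tagName when only links are wanted.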

This version doesn't rely on the helper function above.

$str = '<body>
    <a href="">a</a>
        <a href="http://example.com" class="tracker">a</a>
        <a href="http://example.com?hello" class="tracker">a</a>
    <a href="">a</a>
</body>
    ';

$dom = new DOMDocument;

$dom->loadHTML($str);

$anchors = $dom->getElementsByTagName('body')->item(0)->getElementsByTagName('a');

foreach($anchors as $anchor) {

    if ( ! $anchor->hasAttribute('class')) {
        continue;
    }

    $classes = preg_split('/\s+/', $anchor->getAttribute('class'));

    if ( ! in_array('tracker', $classes)) {
        continue;
    }

    $href = $anchor->getAttribute('href');

    $url = parse_url($href);

    $attach = 'stackoverflow=true';

    if (isset($url['query'])) {
        $href .= '&' . $attach;
    } else {
        $href .= '?' . $attach;
    }

    $anchor->setAttribute('href', $href);
}

echo $dom->saveHTML();

Output

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body>
    <a href="">a</a>
        <a href="http://example.com?stackoverflow=true" class="tracker">a</a>
        <a href="http://example.com?hello&amp;stackoverflow=true" class="tracker">a</a>
    <a href="">a</a>
</body></html>
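
One case the loop above does not cover is an href that ends in a #fragment: appending after it would place the new parameter inside the fragment. A minimal sketch of one way to handle that, using a hypothetical helper named appendQuery() (not part of the original answer):

```php
<?php
// Hypothetical helper: appends a query-string pair before any #fragment.
function appendQuery($href, $attach) {
    $fragment = '';
    $hashPos = strpos($href, '#');
    if ($hashPos !== false) {
        // Split off the fragment so the parameter lands in front of it.
        $fragment = substr($href, $hashPos);
        $href = substr($href, 0, $hashPos);
    }
    // Use '&' when a query string is already present, '?' otherwise.
    $separator = (strpos($href, '?') === false) ? '?' : '&';
    return $href . $separator . $attach . $fragment;
}

echo appendQuery('http://example.com', 'stackoverflow=true'), "\n";
echo appendQuery('http://example.com?hello', 'stackoverflow=true'), "\n";
echo appendQuery('http://example.com/page#top', 'stackoverflow=true'), "\n";
```

Inside the loop, the if/else on $url['query'] could then be replaced by a single call such as $anchor->setAttribute('href', appendQuery($href, $attach)).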


Answer 2:

I need to find all links on a page with a given class (say class="tracker") [...] I'm quite new to PHP, but from the looks of it, XPath might be what I'm looking for but I haven't found a suitable example to get started with. Is there anything like GetElementByClass?

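This XPath 1.0 expression selects every a element whose class attribute contains tracker as a whole, whitespace-separated token: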

//a[contains(
       concat(' ',normalize-space(@class),' '),
       ' tracker '
    )
]
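
In PHP, an expression like this can be passed to DOMXPath::query(). A minimal sketch, assuming the page is already available as an HTML string; the sample markup and the $labels variable are made up for illustration.

```php
<?php
$dom = new DOMDocument;
// Illustrative markup: "atrackerb" must not match, "foo  tracker" must.
$dom->loadHTML('<body>
    <a href="http://example.com" class="tracker foo">a</a>
    <a href="http://example.com" class="atrackerb">b</a>
    <a href="http://example.com" class="foo  tracker">c</a>
</body>');

$xpath = new DOMXPath($dom);
// Pad the normalized class list with spaces so " tracker " only matches a whole token.
$expression = "//a[contains(concat(' ', normalize-space(@class), ' '), ' tracker ')]";

$labels = array();
foreach ($xpath->query($expression) as $anchor) {
    $labels[] = $anchor->textContent;
}
print_r($labels);
```

The concat/normalize-space padding is what keeps partial matches such as atrackerb out, so no extra check is needed in PHP.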


Answer 3:

A bit shorter, using XPath:

$dom = new DomDocument();
$dom->loadXml('<?xml version="1.0" encoding="UTF-8" ?>
<root>
    <a href="somlink" class="tracker foo">label</a>
    <a href="somlink" class="foo">label</a>
    <a href="somlink">label</a>
    <a href="somlink" class="atrackerb">label</a>
    <a href="somlink">label</a>
    <a href="somlink" class="tracker">label</a>
    <a href="somlink" class="tracker">label</a>
</root>');

$xpath = new DomXPath($dom);

foreach ($xpath->query('//a[contains(@class, "tracker")]') as $node) {
    if (preg_match('/\btracker\b/', $node->getAttribute('class'))) {
        $node->setAttribute(
            'href',
            $node->getAttribute('href') . '#some_extra'
        );
    }

}

header('Content-Type: text/xml; charset=UTF-8');
echo $dom->saveXml();