preg_replace to modify SRC and HREF urls coming fr

Posted 2019-07-22 11:51

I need to rewrite the URLs in a page fetched with cURL so that images and links point to the correct absolute addresses. My PHP cURL code is:

<?php

function getPage($url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, 1);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);

    $result = curl_exec($ch);
    curl_close($ch);

    if (!preg_match('/src="https?:\/\/"/', $result))
    $result = preg_replace('/src="(.*)"/', "src=\"http://support.prophpbb.com/\\1\"", $result);
    if (!preg_match('/href="https?:\/\/"/', $result))
    $result = preg_replace('/href="(.*)"/', "href=\"http://support.prophpbb.com/\\1\"", $result);
    return $result;
}

$result = getPage('http://support.prophpbb.com/');

print_r ($result);

?>

This code works for relative links, but for links that are already absolute it duplicates the prefix.

A relative link is replaced correctly (before, then after):

<img src="./uploads/support/images/1355955233.png" alt="" title="" />
<img src="http://support.prophpbb.com/./uploads/support/images/1355955233.png" alt="" title="" />

But a link that is already correct gets broken (before, then after):

<img src="http://support.prophpbb.com/styles/subsilverPlus/theme/images/icon_mini_faq.gif" width="12" height="13" alt="*" />
<img src="http://support.prophpbb.com/http://support.prophpbb.com/styles/subsilverPlus/theme/images/icon_mini_faq.gif" width="12" height="13" alt="*" />

Can anybody help me, please?

1 answer
你好瞎i
Post #2 · 2019-07-22 12:35

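Try this regular expression in preg_replace. It optionally matches a leading http://host/ and keeps only the path after it (capture group 3), so the prefix is no longer duplicated: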

<?php

function getPage($url) {
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, 1);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 30);

    $result = curl_exec($ch);
    curl_close($ch);

    if (!preg_match('/src="https?:\/\/"/', $result)) {
        $result = preg_replace('/src="(http:\/\/([^\/]+)\/)?([^"]+)"/', "src=\"http://support.prophpbb.com/\\3\"", $result);
    }
    if (!preg_match('/href="https?:\/\/"/', $result)) {
        $result = preg_replace('/href="(http:\/\/([^\/]+)\/)?([^"]+)"/', "href=\"http://support.prophpbb.com/\\3\"", $result);
    }
    return $result;
}

$result = getPage('http://support.prophpbb.com/');

print_r ($result);
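
One caveat with the pattern above: it rewrites every http:// URL to support.prophpbb.com, including links to other hosts, and it does not recognise https://. A possible alternative, offered only as a sketch, is to skip values that are already absolute by using a negative lookahead. The function name absolutizeUrls and its $base parameter are hypothetical and do not appear in the original post. Like the original code, it resolves every relative path against the site root and assumes double-quoted attributes.

```php
<?php

// Hypothetical helper (not from the original post): prefix only relative
// src/href values with $base and leave absolute URLs untouched.
function absolutizeUrls(string $html, string $base): string
{
    // Normalise the base so it ends with exactly one slash.
    $base = rtrim($base, '/') . '/';

    // (?<![\w-])  do not match inside names such as data-src
    // (?!...)     negative lookahead: skip values that start with a scheme
    //             (http:, https:, mailto:, ...), with "//", or with "#"
    // (?:\./|/)?  drop a leading "./" or "/" so the result has no "/./"
    $pattern = '~(?<![\w-])(src|href)="(?![a-z][a-z0-9+.-]*:|//|#)(?:\./|/)?([^"]*)"~i';

    $result = preg_replace_callback(
        $pattern,
        function (array $m) use ($base): string {
            return $m[1] . '="' . $base . $m[2] . '"';
        },
        $html
    );

    // preg_replace_callback returns null on a PCRE error; fall back to the input.
    return $result ?? $html;
}

// Usage with the two examples from the question.
$html = '<img src="./uploads/support/images/1355955233.png" alt="" title="" />' . "\n"
      . '<img src="http://support.prophpbb.com/styles/subsilverPlus/theme/images/icon_mini_faq.gif" width="12" height="13" alt="*" />' . "\n";

echo absolutizeUrls($html, 'http://support.prophpbb.com/');
```

With this sketch the first image should come out as http://support.prophpbb.com/uploads/support/images/1355955233.png and the second should stay exactly as it was. For anything beyond simple pages, parsing the HTML with DOMDocument is likely to be more robust than a regular expression.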