Curl and relative path in <head>

Posted 2019-09-13 20:55


迷人小祖宗


Question:

I'm using this script to scrape a website:

<?php
$url = "http://www.nu.nl";

$ch = curl_init($url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
$curl_scraped_page = curl_exec($ch);
curl_close($ch);

echo $curl_scraped_page;
?>

The output resolves the JavaScript and CSS files in the head section against the wrong domain. So I tried to fix it with:

$url = preg_replace("/<head>/i", "<head><base href='$url' />", $url, 1);

It doesn't work. Any ideas why? I can't spot anything.


Answer 1:

What about using the right variables? $curl_scraped_page holds the page and $url holds the URL, but you passed $url to preg_replace as the subject, so the replacement ran on the URL string instead of the HTML.

$curl_scraped_page = preg_replace("/<head>/i", "<head><base href='$url' />", $curl_scraped_page, 1);
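The one-line fix above only matches a bare <head> tag and pastes $url into the markup unescaped. A slightly more defensive variant might look like the sketch below. The helper name inject_base_href is hypothetical (it is not part of the original post), and the sample markup is made up for illustration. It assumes the page has a single opening <head> tag, possibly with attributes.

```php
<?php
// Hypothetical helper (not from the original post): inserts a <base> tag
// right after the first opening <head> tag, with or without attributes.
function inject_base_href(string $html, string $baseUrl): string
{
    // Escape the URL so quotes or ampersands cannot break the attribute.
    $base = '<base href="' . htmlspecialchars($baseUrl, ENT_QUOTES) . '" />';

    // A callback avoids "$" or "\" in the URL being read as a backreference.
    // \b keeps the pattern from matching tags such as <header>.
    $result = preg_replace_callback(
        '/<head\b[^>]*>/i',
        function (array $m) use ($base): string {
            return $m[0] . $base;
        },
        $html,
        1
    );

    // preg_replace_callback returns null on a regex error; fall back to the input.
    return $result ?? $html;
}

// Made-up sample markup standing in for the scraped page.
$page = '<html><head lang="nl"><link href="/style.css"></head><body></body></html>';
echo inject_base_href($page, 'http://www.nu.nl/');
```

In the original script this would be called as inject_base_href($curl_scraped_page, $url) just before the echo. If the page contains no <head> tag at all, the HTML is returned unchanged.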

Tags: php curl relative-path
