Get meta description, title and image from URL li

Published 2020-03-01 19:00

Question:

My code is:

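        // Fetch a page and return the text of its <title> tag, or false on failure.
        function getTitle($Url){
            $str = @file_get_contents($Url);
            // file_get_contents() returns false when the request fails.
            if($str !== false && strlen($str)>0){
                // "i" ignores case, "s" lets "." span newlines, [^>]* allows attributes.
                if(preg_match("/<title[^>]*>(.*?)<\/title>/is",$str,$title)){
                    return trim($title[1]);
                }
            }
            return false;
        }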
        // Fetch a page and return its meta description, or false on failure.
        function getMetas($Url){
            $str = @file_get_contents($Url);
            if($str !== false && strlen($str)>0){
                // Accepts either quote style and extra attributes;
                // still assumes name= appears before content=.
                $pattern = '/<meta[^>]*name=["\']description["\'][^>]*content=(["\'])(.*?)\1/is';
                if(preg_match($pattern,$str,$desc)){
                    return $desc[2];
                }
            }
            return false;
        }

        //Example:
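        $url = isset($_POST['url']) ? $_POST['url'] : '';
        // Escape the fetched text before echoing it into the page.
        echo htmlspecialchars((string) getTitle($url));
        echo "<br><br>";
        echo htmlspecialchars((string) getMetas($url));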

This does not show a result for every URL, for example http://google.com.

Answer 1:

Why are you using regular expressions to parse the <meta> tags?

PHP has a built-in function for parsing meta information: get_meta_tags().

Illustration:

<?php
$tags = get_meta_tags('http://www.stackoverflow.com/');
echo "<pre>";
print_r($tags);

OUTPUT:

Array
(
    [twitter:card] => summary
    [twitter:domain] => stackoverflow.com
    [og:type] => website
    [og:image] => http://cdn.sstatic.net/stackoverflow/img/apple-touch-icon@2.png?v=fde65a5a78c6
    [og:title] => Stack Overflow
    [og:description] => Q&A for professional and enthusiast programmers
    [og:url] => http://stackoverflow.com/
)

As you can see, the title, image and description you want are all parsed.
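
One caveat: as far as I know, get_meta_tags() only reads <meta> tags that carry a name attribute, so it will not return the <title> element or Open Graph tags written with property="og:...". If a page uses those, a DOM parser is one way to fill the gap. Below is a minimal sketch, assuming the DOM extension is enabled; the function name extractPageMeta() and the sample HTML are made up for illustration, and it takes an HTML string so the fetching step stays separate.

```php
<?php
// Hypothetical helper (not part of the answer above): pull the title,
// description and image out of an HTML string with DOMDocument.
function extractPageMeta(string $html): array
{
    $meta = ['title' => null, 'description' => null, 'image' => null];
    if (trim($html) === '') {
        return $meta; // nothing to parse
    }

    $doc = new DOMDocument();
    // Real-world HTML is often malformed; collect parser warnings silently.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $titles = $doc->getElementsByTagName('title');
    if ($titles->length > 0) {
        $meta['title'] = trim($titles->item(0)->textContent);
    }

    foreach ($doc->getElementsByTagName('meta') as $tag) {
        // Classic tags use name="", Open Graph tags usually use property="".
        $key = strtolower($tag->getAttribute('name') ?: $tag->getAttribute('property'));
        $content = $tag->getAttribute('content');

        if ($key === 'og:title' && $meta['title'] === null) {
            $meta['title'] = $content;
        } elseif (($key === 'description' || $key === 'og:description') && $meta['description'] === null) {
            $meta['description'] = $content;
        } elseif ($key === 'og:image' && $meta['image'] === null) {
            $meta['image'] = $content;
        }
    }
    return $meta;
}

// Sample input, invented for this example.
$sample = '<html><head><title> Example Page </title>'
        . '<meta name="description" content="A short summary">'
        . '<meta property="og:image" content="http://example.com/a.png">'
        . '</head><body></body></html>';
$result = extractPageMeta($sample);
```

To use it on a live page, fetch the HTML first (for example with file_get_contents($url)), check that the result is not false, and pass the string in.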



Answer 2:

I know the question is 1.5 years old, but if you are still looking for a solution, you can use https://urlmeta.org. It's a free API for extracting URL metadata.



Answer 3:

You can check whether a URL already starts with http:// or https://, and add a scheme if it does not:

$url='stackoverflow.com';
$http_check='http://';
$https_check='https://';
// Prepend http:// only when the URL starts with neither scheme
// (stripos() makes the comparison case-insensitive).
if(stripos($url,$http_check)!==0 && stripos($url,$https_check)!==0){
   $url=$http_check.$url;
}

Then you can use get_meta_tags() as in the answer above:

<?php
$tags = get_meta_tags($url);
echo "<pre>";
print_r($tags);
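
Since the URL in the question comes straight from $_POST, it may be worth validating it before fetching anything. The following is only a sketch of one way to do that, assuming you want to allow http and https and nothing else; normalizeHttpUrl() is a hypothetical name, not a PHP built-in.

```php
<?php
// Hypothetical helper: return a usable http(s) URL, or null if the
// input cannot be turned into one.
function normalizeHttpUrl(string $url): ?string
{
    $url = trim($url);
    if ($url === '') {
        return null;
    }
    // Add a default scheme only when the input has no scheme at all.
    if (preg_match('#^[a-z][a-z0-9+.\-]*://#i', $url) !== 1) {
        $url = 'http://' . $url;
    }
    if (filter_var($url, FILTER_VALIDATE_URL) === false) {
        return null;
    }
    // Reject anything that is not http or https (ftp://, file://, ...).
    $scheme = strtolower((string) parse_url($url, PHP_URL_SCHEME));
    return in_array($scheme, ['http', 'https'], true) ? $url : null;
}

$plain  = normalizeHttpUrl('stackoverflow.com');
$secure = normalizeHttpUrl('https://stackoverflow.com');
$ftp    = normalizeHttpUrl('ftp://example.com');
$broken = normalizeHttpUrl('not a url');
```

If the result is not null, pass it to get_meta_tags(); that function returns false when the page cannot be read, so check for that before printing.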