I am using the Twitter and Facebook APIs to pull posts that may contain URLs shortened by bit.ly, TinyURL, or similar services. I need to expand them in real time to get the original URL, then pull content from that URL into my app.
Answer 1:
You can use CURL to expand a short URL.
Try this:
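function traceUrl($url, $hops = 0)
{
    // Stop once the redirect chain reaches the configured limit.
    if ($hops >= MAX_URL_HOPS)
    {
        throw new Exception('TOO_MANY_HOPS');
    }
    $ch = curl_init($url);
    curl_setopt($ch, CURLOPT_HEADER, 1);         // include response headers in the output
    curl_setopt($ch, CURLOPT_NOBODY, 1);         // send a HEAD request and skip the body
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1); // return the response instead of printing it
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0); // disables certificate checks; drop this line if you can
    $r = curl_exec($ch);
    curl_close($ch);
    // A Location header means the URL redirects, so follow it.
    if (is_string($r) && preg_match('/^Location:\s*(?P<url>.*)$/im', $r, $match))
    {
        // trim() strips the trailing "\r" that the header line carries.
        return traceUrl(trim($match['url']), $hops + 1);
    }
    return rtrim($url);
}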
You can call this function like so: traceUrl('http://bit.ly/example'). The function is recursive, so it will also resolve short URLs that point to other short URLs (if that ever happens). Make sure you define the MAX_URL_HOPS constant; I use define('MAX_URL_HOPS', 5);.
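For comparison, the same hop-limited idea can be written as a loop instead of recursion. The sketch below is in JavaScript and is only an illustration: getLocation is a hypothetical callback (not part of any library) that returns the Location header for a URL, or null when there is no redirect. It is synchronous here to keep the example short, whereas a real HTTP lookup in Node.js would be asynchronous. The sketch also resolves relative Location values against the current URL, which the PHP function above does not do.

```javascript
// Follow a redirect chain iteratively, giving up after maxHops lookups.
// `getLocation` is a hypothetical callback supplied by the caller: it takes
// a URL and returns the Location header value, or null if there is none.
function traceUrl(url, getLocation, maxHops = 5) {
  let current = url;
  for (let hops = 0; hops < maxHops; hops++) {
    const location = getLocation(current);
    if (location === null) {
      return current; // no redirect: this is the expanded URL
    }
    // Location may be relative, so resolve it against the current URL.
    current = new URL(location, current).href;
  }
  throw new Error('TOO_MANY_HOPS');
}
```

Passing the lookup in as a callback keeps the hop-counting logic separate from the HTTP client, so it can be exercised with a plain lookup table.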
- Christian
Answer 2:
You can just use PHP and cURL to connect to the URL and read the Location header from the response.
Here is what comes back -
> $ curl -I http://bit.ly/2V6CFi
> HTTP/1.1 301 Moved
> Server: nginx/0.7.67
> Date: Tue, 21 Dec 2010 01:58:47 GMT
> Content-Type: text/html; charset=utf-8
> Connection: keep-alive
> Set-Cookie: _bit=4d1009d7-00298-02f7f-c6ac8fa8;domain=.bit.ly;expires=Sat Jun 18 21:58:47 2011;path=/; HttpOnly
> Cache-control: private; max-age=90
> Location: http://www.google.com/
> MIME-Version: 1.0
> Content-Length: 284
So you can look for the Location header in the response to see where the page actually goes.
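As a rough illustration of that lookup, here is a small JavaScript sketch that pulls the Location value out of a raw header block like the one above. The function name extractLocation is made up for this example, and it assumes the headers arrive as one string with one header per line.

```javascript
// Extract the Location header from a raw HTTP response header block.
// Returns null when the response carries no Location header.
// `extractLocation` is a hypothetical helper name used for illustration.
function extractLocation(rawHeaders) {
  for (const line of rawHeaders.split(/\r?\n/)) {
    const idx = line.indexOf(':');
    if (idx === -1) {
      continue; // status line or blank line, not a header
    }
    // Header names are case-insensitive, so compare in lower case.
    const name = line.slice(0, idx).trim().toLowerCase();
    if (name === 'location') {
      return line.slice(idx + 1).trim();
    }
  }
  return null;
}
```

Splitting on the first colon only matters here because the header value is itself a URL containing colons.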
Answer 3:
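With Node.js you can use the request module.
var request = require('request');
var shortUrl = 'the url that is shortened';
// Send a HEAD request and follow every redirect to the final destination.
request({method: 'HEAD', url: shortUrl, followAllRedirects: true},
    function (err, response, body) {
        if (err) {
            console.error(err);
            return;
        }
        // The href of the last request made is the expanded URL.
        console.log(response.request.href);
    });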
Answer 4:
I found a PHP library that does just that; it may be useful. Check it out: https://launchpad.net/longurl