I maintain a local intranet site that among other things, displays movie poster images from IMDB.com. Until recently, I simply had a perl script download the images I needed and save them to the local server. But that became a HUGE space-hog, so I thought I could simply point my site directly to IMDB servers, since my traffic is very minimal.
The result was that some images would display, while others wouldn't. And images that were displayed, would sometimes disappear after a few refreshes. The images existed on the IMDB servers, they just wouldn't display on my page.
It seems unlikely to me that IMDB would somehow block this kind of access, but is that possible? Is there something that needs to be configured on my end?
I'm out of ideas - it just doesn't make sense to me.
I'm serving my pages with mod_perl and HTML::Mason, if that's relevant.
Thanks,
Ryan
Apache/2.2.14 (Unix) mod_ssl/2.2.14 OpenSSL/0.9.8l DAV/2 mod_perl/2.0.4 Perl/v5.10.0
Absolutely they would block that kind of access. You're using their bandwidth, which they have to pay for, for your web site. Sites will often look at the referrer, see that its not coming from their site, and either block or throttle access. Likely you're seeing this as an intermittent problem because IMDB is allowing you some amount of use of their images.
To find out more, look at the HTTP logs on your client. Either by using a browser plugin or by scripting it. Look at the HTTP response codes and you'll probably see some 4xx or 5xx responses.
I would suggest either caching the images in a cache that expires unused images, that will balance accesses with space, or perhaps getting a paid IMDB account. You may be able to get an API key to use to fetch images indicating you are a paying customer.
IMDB sure could be preventing your 'bandwidth theft' by checking the "referer". More info here: http://www.thesitewizard.com/archive/bandwidththeft.shtml
Why is it intermittent? Maybe they only implement this on some of the servers in their web farm.
Just to add to the existing answers, what you're doing is called "hotlinking", and people who run websites don't like it very much. Google for "hotlink blocking".