movie id tt0438097 can be found at http://www.imdb.com/title/tt0438097/
What's the url for its poster image?
As I'm sure you know, the actual url for that image is
http://ia.media-imdb.com/images/M/MV5BMTI0MDcxMzE3OF5BMl5BanBnXkFtZTcwODc3OTYzMQ@@._V1._SX100_SY133_.jpg
You're going to be hard pressed to figure out how it's generated though and they don't seem to have a publicly available API.
Screenscraping is probably your best bet.
The picture seems to generally be inside a div with class=photo and the name of the a tag is poster.
The image itself is just inside the a tag.
Check out http://www.imdbapi.com/. It returns the poster URL as a string.
For example, check http://www.imdbapi.com/?i=&t=inception and you'll get the poster address: "Poster":"http://ia.media-imdb.com/images/M/MV5BMjAxMzY3NjcxNF5BMl5BanBnXkFtZTcwNTI5OTM0Mw@@._V1._SX320.jpg"
Update: It seems the site owner had a dispute with IMDb's legal staff. As mentioned on the original site, the new site's address is http://www.omdbapi.com/
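To make the "returns the poster URL as a string" part concrete, here is a minimal sketch of pulling the Poster field out of a response body. The helper name posterFromOmdbJson is made up for illustration, and treating "N/A" as "no poster" is an assumption about the service's behaviour, not something confirmed above. It only parses a JSON string; fetching it is left out.

```javascript
// Hypothetical helper (not part of any library): read the "Poster" field
// from a JSON response body like the one quoted above.
function posterFromOmdbJson(jsonText) {
  var data = JSON.parse(jsonText);
  var poster = data.Poster;
  // Assumption: a missing poster shows up as an absent field, an empty
  // string, or the literal "N/A".
  if (typeof poster !== 'string' || poster === '' || poster === 'N/A') {
    return null;
  }
  return poster;
}

// Sample body trimmed down to the one field from the answer above.
var omdbSample = '{"Poster":"http://ia.media-imdb.com/images/M/MV5BMjAxMzY3NjcxNF5BMl5BanBnXkFtZTcwNTI5OTM0Mw@@._V1._SX320.jpg"}';
console.log(posterFromOmdbJson(omdbSample));
```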
The URL is a random string as far as I can tell.
It can still be easily retrieved. It is the only img inside the anchor named poster.
So, if you are reading the source, simply search for <a name="poster" and it will be the text following the first src=" from there.
However, you will need to keep the screen scraping code updated because that will probably change.
You should also be aware that the images are copyrighted, so be careful to only use the image under a good "fair use" rationale.
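The search described above can be sketched as a small string function. This is only an illustration under the assumption that the page still contains <a name="poster" followed by an img with a double-quoted src; the helper name extractPosterSrc and the sample markup are made up, and the function works on HTML you have already downloaded.

```javascript
// Hypothetical helper: find <a name="poster" and return the value of the
// first src="..." that follows it, or null if either is missing.
function extractPosterSrc(html) {
  var anchorAt = html.indexOf('<a name="poster"');
  if (anchorAt === -1) {
    return null;
  }
  var srcKey = 'src="';
  var srcAt = html.indexOf(srcKey, anchorAt);
  if (srcAt === -1) {
    return null;
  }
  var start = srcAt + srcKey.length;
  var end = html.indexOf('"', start);
  return end === -1 ? null : html.slice(start, end);
}

// Made-up markup in the shape the answers describe.
var posterSample = '<div class="photo"><a name="poster" href="/x">' +
  '<img src="http://example.com/poster.jpg"></a></div>';
console.log(extractPosterSrc(posterSample));
```

Plain string searching like this is brittle by design, which is exactly why the scraping code needs to be kept up to date when the page layout changes.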
If a thumb is enough, you can use the Facebook Graph API: http://graph.facebook.com/?ids=http://www.imdb.com/title/tt0438097/
Gets you a thumbnail: http://profile.ak.fbcdn.net/hprofile-ak-ash2/50289_117058658320339_650214_s.jpg
I know that it is way too late, but in my project I used this:-
omdbapi works, but I found out you cannot really use these images (because of screen scraping and they are blocked anyway if you use them in an img tag)
The best solution is to use tmdb.org:
1. Use your IMDb ID in this API URL:
https://api.themoviedb.org/3/find/tt0111161?api_key=___YOURAPIKEY___&external_source=imdb_id
2. Retrieve the JSON response and select the poster_path attribute:
"poster_path":"/9O7gLzmreU0nGkIB6K3BsJbzvNv.jpg"
3. Prepend this path with "http://image.tmdb.org/t/p/original", and you will have the poster URL that you can use in an img tag :-)
4. You can even change sizes like this:
http://image.tmdb.org/t/p/original/9O7gLzmreU0nGkIB6K3BsJbzvNv.jpg
http://image.tmdb.org/t/p/w150/9O7gLzmreU0nGkIB6K3BsJbzvNv.jpg
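Steps 2 to 4 can be sketched as one small function that works on the already-parsed JSON. The function name tmdbPosterUrl is made up, and the assumption that the find response lists movies under a movie_results array should be checked against the actual response you get back.

```javascript
// Base URL taken from step 3 above; the size segment is appended to it.
var TMDB_IMAGE_BASE = 'http://image.tmdb.org/t/p/';

// Hypothetical helper: build a poster URL from a parsed find response.
// Assumption: matches are listed under "movie_results" and each entry
// may carry a "poster_path" string.
function tmdbPosterUrl(findResponse, size) {
  var results = (findResponse && findResponse.movie_results) || [];
  for (var i = 0; i < results.length; i++) {
    var path = results[i].poster_path;
    if (typeof path === 'string' && path !== '') {
      // size is e.g. "original" or "w150", as in step 4.
      return TMDB_IMAGE_BASE + (size || 'original') + path;
    }
  }
  return null;
}

// Sample response reduced to the attribute shown in step 2.
var tmdbSample = { movie_results: [{ poster_path: '/9O7gLzmreU0nGkIB6K3BsJbzvNv.jpg' }] };
console.log(tmdbPosterUrl(tmdbSample, 'w150'));
```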
You can use the imdb-cli tool to download a movie's poster, e.g.
omdbtool -t "Ice Age: The Meltdown" | wget `sed -n '/^poster/{n;p;}'`
Be aware, though, that the terms of service explicitly forbid screen scraping. You can download the IMDB database as a set of text files, but as I understand it, the IMDB movie ID is nowhere to be found in these text files.
You can use the Trakt API: make a search request with the IMDb ID, and the JSON result returned by Trakt contains links to two images of that movie (poster and fan art). See http://trakt.tv/api-docs/search-movies
I've done something similar using phantomjs and wget. This bit of phantomjs accepts a search query and returns the first result's movie poster url. You could easily change it to your needs.
var system = require('system');
if (system.args.length === 1) {
    console.log('Usage: moviePoster.js <movie name>');
    phantom.exit();
}
var formattedTitle = encodeURIComponent(system.args[1]).replace(/%20/g, "+");
var page = require('webpage').create();
page.open('http://m.imdb.com/find?q=' + formattedTitle, function() {
    var url = page.evaluate(function() {
        return 'http://www.imdb.com' + $(".title").first().find('a').attr('href');
    });
    page.close();
    page = require('webpage').create();
    page.open(url, function() {
        var url = page.evaluate(function() {
            return 'http://www.imdb.com' + $("#img_primary").find('a').attr('href');
        });
        page.close();
        page = require('webpage').create();
        page.open(url, function() {
            var url = page.evaluate(function() {
                return $(".photo").first().find('img').attr('src');
            });
            console.log(url);
            page.close();
            phantom.exit();
        });
    });
});
I download the image using wget for many movies in a directory using this bash script. The mp4 files have names that the IMDB likes, and that's why the first search result is nearly guaranteed to be correct. Names like "Love Exposure (2008).mp4".
for file in *.mp4; do
    title="${file%.mp4}"
    # Skip movies that already have a poster next to them
    if [ ! -f "${title}.jpg" ]
    then
        wget "$(phantomjs moviePoster.js "$title")" -O "${title}.jpg"
    fi
done
Then minidlna uses the movie poster when it builds the thumbnail database, because it has the same name as the video file.
$Movies = Get-ChildItem -path "Z:\MOVIES\COMEDY" | Where-Object {$_.Extension -eq ".avi" -or $_.Extension -eq ".mp4" -or $_.Extension -eq ".mkv" -or $_.Extension -eq ".flv" -or $_.Extension -eq ".xvid" -or $_.Extension -eq ".divx"} | Select-Object Name, FullName | Sort Name
#Grab all the extension types and filter the ones I ONLY want

$COMEDY = ForEach($Movie in $Movies)
{
    $Title = $($Movie.Name)
    #Remove the file extension
    $Title = $Title.split('.')[0]

    #Changing the case to all lower
    $Title = $Title.ToLower()

    #Replace a space w/ %20 for the search structure
    $searchTitle = $Title.Replace(' ','%20')

    #Fetching search results
    $moviesearch = Invoke-WebRequest "http://www.imdb.com/search/title?title=$searchTitle&title_type=feature"

    #Moving html elements into variable
    $titleclassarray = $moviesearch.AllElements | where Class -eq 'title' | select -First 1

    #Checking if result contains movies
    try
    {
        $titleclass = $titleclassarray[0]
    }
    catch
    {
        Write-Warning "No movie found matching that title http://www.imdb.com/search/title?title=$searchTitle&title_type=feature"
    }

    #Parsing HTML for movie link
    $regex = "<\s*a\s*[^>]*?href\s*=\s*[`"']*([^`"'>]+)[^>]*?>"
    $linksFound = [Regex]::Matches($titleclass.innerHTML, $regex, "IgnoreCase")

    #Fetching the first result from
    $titlelink = New-Object System.Collections.ArrayList
    foreach($link in $linksFound)
    {
        $trimmedlink = $link.Groups[1].Value.Trim()
        if ($trimmedlink.Contains('/title/'))
        {
            [void] $titlelink.Add($trimmedlink)
        }
    }
    #Fetching movie page
    $movieURL = "http://www.imdb.com$($titlelink[0])"

    #Grabbing the URL for the Movie Poster
    $MoviePoster = ((Invoke-WebRequest -Uri $movieURL).Images | Where-Object {$_.title -like "$Title Poster"} | Where src -like "http:*").src

    $MyVariable = "<a href=" + '"' + $($Movie.FullName) + '"' + " " + "title='$Title'" + ">"
    $ImgLocation = "<img src=" + '"' + "$MoviePoster" + '"' + " width=" + '"' + "225" + '"' + " height=" + '"' + "275" + '"' + " border=" + '"' + "0" + '"' + " alt=" + '"' + $Title + '"' + "></a>" + " " + " " + " "+ " " + " " + " "+ " " + " " + " "

    Write-Output $MyVariable, $ImgLocation

}
$COMEDY | Out-File z:\db\COMEDY.htm

$after = Get-Content z:\db\COMEDY.htm

#adding a back button to the Index
$before = Get-Content z:\db\before.txt

#adding the back button prior to the poster images content
Set-Content z:\db\COMEDY.htm -value $before, $after
Those poster images don't appear to have any correlation to the title page, so you'll have to retrieve the title page first, and then retrieve the img element for the page. The good news is that the img tag is wrapped in an a tag with name="poster". You didn't say what kind of tools you are using, but this is basically a screen scraping operation.
Here is my program to generate a human-readable HTML summary page for the movies listed on a company's IMDb page. Change the initial URL to your liking and it generates an HTML file where you can see each title, summary, score and thumbnail. It needs phantomjs:
npm install -g phantomjs
Here is the script, save it to imdb.js
var system = require('system');
var page = require('webpage').create();
page.open('http://www.imdb.com/company/co0026841/?ref_=fn_al_co_1', function() {
    console.log('Fetching movies list');
    var movies = page.evaluate(function() {
        var list = $('ol li');
        var json = []
        $.each(list, function(index, listItem) {
            var link = $(listItem).find('a');
            json.push({link: 'http://www.imdb.com' + link.attr('href')});
        });
        return json;
    });
    page.close();
    console.log('Found ' + movies.length + ' movies');
    fetchMovies(movies, 0);
});
function fetchMovies(movies, index) {
    if (index == movies.length) {
        console.log('Done');
        console.log('Generating HTML');
        genHtml(movies);
        phantom.exit();
        return;
    }
    var movie = movies[index];
    console.log('Requesting data for '+ movie.link);
    var page = require('webpage').create();
    page.open(movie.link, function() {
        console.log('Fetching data');
        var data = page.evaluate(function() {
            var title = $('.title_wrapper h1').text().trim();
            var summary = $('.summary_text').text().trim();
            var rating = $('.ratingValue strong').attr('title');
            var thumb = $('.poster img').attr('src');
            if (title == undefined || thumb == undefined) {
                return null;
            }
            return { title: title, summary: summary, rating: rating, thumb: thumb };
        });
        if (data != null) {
            movie.title = data.title;
            movie.summary = data.summary;
            movie.rating = data.rating;
            movie.thumb = data.thumb;
            console.log(movie.title)
            console.log('Request complete');
        } else {
            // splice (not slice) removes the entry in place, so stepping the
            // index back makes the next call look at the following movie
            movies.splice(index, 1);
            index -= 1;
            console.log('No data found');
        }
        page.close();
        fetchMovies(movies, index + 1);
    });
}
function genHtml(movies) {
    var fs = require('fs');
    var path = 'movies.html';
    var content = Array();
    movies.forEach(function(movie) {
        var section = '';
        section += '<div>';
        section += '<h3>'+movie.title+'</h3>';
        section += '<p>'+movie.summary+'</p>';
        section += '<p>'+movie.rating+'</p>';
        section += '<img src="'+movie.thumb+'">';
        section += '</div>';
        content.push(section);
    });
    var html = '<html>'+content.join('\n')+'</html>';
    fs.write(path, html, 'w');
}
And run it like so
phantomjs imdb.js
$Title = $($Movie.Name)
$searchTitle = $Title.Replace(' ','%20')
$moviesearch = Invoke-WebRequest "http://www.imdb.com/search/title?title=$searchTitle&title_type=feature"
$titleclassarray = $moviesearch.AllElements | where Class -eq 'loadlate' | select -First 1
$MoviePoster = $titleclassarray.loadlate
After playing around with @Hawk's BASE64 discovery above, I found that everything after the BASE64 code is display info. If you remove everything between the last @ and .jpg it will load the image in the highest res it has.
https://m.media-amazon.com/images/M/MV5BMjAwODg3OTAxMl5BMl5BanBnXkFtZTcwMjg2NjYyMw@@._V1_UX182_CR0,0,182,268_AL_.jpg
becomes
https://m.media-amazon.com/images/M/MV5BMjAwODg3OTAxMl5BMl5BanBnXkFtZTcwMjg2NjYyMw@@.jpg
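The trimming described above can be sketched as a small string function. The name stripDisplayInfo is made up for illustration, and it assumes the URL has the shape shown in the example (an @ somewhere before the final file extension); anything else is returned unchanged.

```javascript
// Hypothetical helper: drop everything between the last "@" and the file
// extension, e.g. "..@@._V1_UX182_CR0,0,182,268_AL_.jpg" -> "..@@.jpg".
function stripDisplayInfo(url) {
  var at = url.lastIndexOf('@');
  var dot = url.lastIndexOf('.');
  // Leave the URL alone if it does not look like the example above.
  if (at === -1 || dot < at) {
    return url;
  }
  return url.slice(0, at + 1) + url.slice(dot);
}

console.log(stripDisplayInfo(
  'https://m.media-amazon.com/images/M/MV5BMjAwODg3OTAxMl5BMl5BanBnXkFtZTcwMjg2NjYyMw@@._V1_UX182_CR0,0,182,268_AL_.jpg'
));
```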
Nowadays, all modern browsers have an "Inspect" section:
100% correct for Google Chrome only:
Try pasting it anywhere as a URL in any browser; you will get only the image.