Retrieving/storing All related Actors in Freebase

2020-07-30 03:12发布

问题:

I'm trying to do something that (should be) very simple. Maybe I'm missing something.

I want to get a full list of people a person has worked with. My end goal is to find who he works with most often (in whatever relationship that may be)

Ex: Robert De Niro has acted together with Joe Pesci 7 times, but directed him twice. I want a table with 9 entries as a result.

I can use the topic API, but that only returns a list of movies. I then have to perform 10+ API queries to get the cast of every movie. Takes forever and the code is a nightmare.

If I use MQL search, I can only search for movies that Robert De Niro has starred in, but not every movie he has directed, or written, or produced, or starred in. Basically I can only search by 1 role at a time.

Is there a better way? I need to end up with a list of:

Movies

Actors/Crew People

Roles linking Movies and People

What I do currently:

  • Search for Robert De Niro and obtain the Machine ID
  • Do a topic search for that MID, returning a list of movie MIDs he has worked on
  • Do a topic search for each movie MID, and record fields like directed_by, starring, produced_by, etc

As you can see it's a very expensive operation. It's also extremely difficult to avoid duplicates in this way (though I'm working on it)

edit: Here's my current MQL query (for some reason it only works if I specify two actor names, as well, but that's another issue.

$query = array(array(
                        'a:starring'=>array('actor'=>'Joe Pesci'),
                        'b:starring'=>array('actor'=>'Robert De Niro'),
                        'directed_by'=>null,
                        'produced_by'=>array(),
                        'written_by'=>array(),
                        'executive_produced_by'=>array(),
                            'name'=>null,
                            'mid'=>null,
                        'starring'=>array(array('actor'=>array('mid'=>null,
                                   'name'=>null))),
                        'type'=>'/film/film'
                        ));

The MQL:

    [{
    "a:starring":
        {"actor":"Joe Pesci"},
    "b:starring":
        {"actor":"Robert De Niro"},
    "directed_by":null,
    "produced_by":[],
    "written_by":[],
    "executive_produced_by":[],
    "name":null,
    "mid":null,
    "starring":
        [{"actor":
            {"mid":null,"name":null}}],
    "type":"\/film\/film"}]

回答1:

You can do this all in a single MQL query with different subqueries for the directed/wrote/acted in properties. Just make sure you make each subquery optional.

For example:

[{
  "a:starring": {
    "actor": "Joe Pesci"
  },
  "b:starring": {
    "actor": "Robert De Niro"
  },
  "directed_by": [{
    "id": null,
    "name": null,
    "optional": true
  }],
  "produced_by": [{
    "id": null,
    "name": null,
    "optional": true
  }],
  "written_by": [{
    "id": null,
    "name": null,
    "optional": true
  }],
  "executive_produced_by": [{
    "id": null,
    "name": null,
    "optional": true
  }],
  "name": null,
  "mid": null,
  "starring": [{
    "actor": {
      "mid": null,
      "name": null
    }
  }],
  "type": "/film/film"
}]


回答2:

I don't know MQL but standard SQL would be like this:

Select
p.Name + ' has worked with '+p2.name+' on the movie '+m.Name
from Person p
join Roles r on r.PersonId=p.Id
join Movie m on m.Id=r.MovieId
join Roles r2 on r.MovieId=m.Id and r2.Id <> r.Id
join Person p2 on p2.Id=r2.PersonId