Getting Wikipedia IDs in MQL

Posted 2019-06-03 07:12

Freebase WEX dumps include a freebase_wpid table whose wpid column corresponds to the page_id from the source MediaWiki database. This table provides a mapping between Wikipedia numeric article/redirect IDs and Freebase GUIDs (Global Unique IDs).

Using guids as foreign keys has been deprecated in favor of mids for lots of good reasons, but that doesn't change the fact that guids are still used at the system level, so I'm going to call mid an accessor from here on.

The mid accessor is flexible in MQL. One can query with "mid": null or with "mid": [], depending on whether one needs the current mid or every mid.

Finding a list of wpid values per mid is straightforward in MQL:

[{
  "mid": null,
  "key": [{"namespace":"/wikipedia/en_id", "value":null}]
}]

But if all is well in the universe, each current mid should have only one current wpid. Is there a way to write something like "wpid": null, as one can with the mid accessor?

1 Answer
甜甜的少女心
#2 · 2019-06-03 07:23

If you only want one wpid value per mid you could do something like this:

[{
  "mid": null,
  "key": {
    "namespace": "/wikipedia/en_id",
    "value":     null,
    "limit":     1
  }
}]
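
If the results of a query like this are post-processed in a script, the single-key shape is easy to flatten into a lookup table. Here is a minimal Python sketch, assuming the response rows mirror the query shape (a "mid" plus a "key" object carrying a "value"). The sample rows, the mids in them, and the helper name wpid_by_mid are made up for illustration.

```python
def wpid_by_mid(rows):
    """Map each mid to its single wpid, skipping rows that carry no key."""
    mapping = {}
    for row in rows:
        key = row.get("key")
        if key is not None:
            mapping[row["mid"]] = key["value"]
    return mapping


# Hypothetical sample rows, shaped like the query above.
sample = [
    {"mid": "/m/0aaaa", "key": {"namespace": "/wikipedia/en_id", "value": "111"}},
    {"mid": "/m/0bbbb", "key": {"namespace": "/wikipedia/en_id", "value": "222"}},
]

print(wpid_by_mid(sample))
```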


Bear in mind that it is entirely possible for a Freebase topic to have more than one wpid. This happens whenever we need to merge duplicate topics that we've imported from Wikipedia, or if we import them before they get merged in Wikipedia.
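
To see which topics are affected, one could ask for every key (a "key": [{...}] list clause rather than "limit": 1) and then look for mids that come back with several values. Below is a rough Python sketch under that assumption. The row shape is assumed to mirror such a query, and the sample data and the function name mids_with_multiple_wpids are hypothetical.

```python
from collections import defaultdict


def mids_with_multiple_wpids(rows):
    """Return {mid: sorted wpids} for every mid that has more than one wpid."""
    wpids = defaultdict(set)
    for row in rows:
        for key in row.get("key", []):
            wpids[row["mid"]].add(key["value"])
    return {mid: sorted(vals) for mid, vals in wpids.items() if len(vals) > 1}


# Hypothetical rows: the first topic was merged from two Wikipedia articles.
merged_sample = [
    {"mid": "/m/0aaaa", "key": [
        {"namespace": "/wikipedia/en_id", "value": "111"},
        {"namespace": "/wikipedia/en_id", "value": "333"},
    ]},
    {"mid": "/m/0bbbb", "key": [
        {"namespace": "/wikipedia/en_id", "value": "222"},
    ]},
]

print(mids_with_multiple_wpids(merged_sample))
```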

If you're looking for links to Wikipedia pages, you might also be interested in the /wikipedia/en_title namespace:

[{
  "mid": null,
  "key": {
    "namespace": "/wikipedia/en_title",
    "value":     null,
    "limit":     1
  }
}]
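
Turning such a key into an actual link takes one more step. The Python sketch below assumes the key values follow the MQL key-escaping convention, in which a character outside the plain set is written as $ followed by four uppercase hex digits, and that the unescaped key is the article title with underscores. Both helper names and the sample key are hypothetical.

```python
import re
from urllib.parse import quote


def unescape_key(key):
    """Decode $XXXX escapes (four uppercase hex digits) back to characters."""
    return re.sub(r"\$([0-9A-F]{4})", lambda m: chr(int(m.group(1), 16)), key)


def wikipedia_url(key):
    """Build an English Wikipedia URL from an en_title key (assumed format)."""
    return "https://en.wikipedia.org/wiki/" + quote(unescape_key(key))


# Hypothetical key: "$0028" and "$0029" stand for "(" and ")".
print(wikipedia_url("Foo_$0028bar$0029"))
```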

