Extract related articles in different languages us

Posted 2019-06-05 16:02

I'm trying to extract interlanguage-related articles from a Wikidata dump. After searching online, I found a tool named Wikidata Toolkit that helps with this type of data, but I could not find any information on how to find related articles in different languages. For example, the English article "Dresden" is related to the Italian article "Dresda"; that is, the second is the Italian-language version of the first. I tried using the toolkit but couldn't find a solution. Could someone provide an example of how to find these related articles?

1 answer
Luminary・发光体
#2 · 2019-06-05 16:13

You can use the Wikidata dump [1] to get a mapping of articles across Wikipedias in multiple languages.

For example, if you look at the Wikidata entry for Respiratory System [2], at the bottom of the page you will see all the articles that refer to the same topic in other languages (the entity's sitelinks).

That mapping is included in the Wikidata dump. Download the Wikidata dump, extract the mapping, and then get the corresponding article text from the Wikipedia dumps. You might run into some other issues along the way, such as resolving Wikipedia redirects.
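As a rough illustration of extracting that mapping, here is a minimal Python sketch. It does not use Wikidata Toolkit; it reads the JSON dump directly, assuming the usual layout in which the file is one large JSON array with one entity per line, and each entity carries a `sitelinks` object keyed by wiki id (`enwiki`, `itwiki`, ...). The function names and the local file path are made up for this example.

```python
import bz2
import json
from typing import Dict, Iterable, Iterator, Tuple


def iter_entities(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one entity dict per line of a Wikidata JSON dump."""
    for line in lines:
        # Each entity line ends with a comma; the first and last
        # lines of the file are just "[" and "]".
        line = line.strip().rstrip(",")
        if line in ("", "[", "]"):
            continue
        yield json.loads(line)


def sitelink_titles(entity: dict, wikis: Tuple[str, ...]) -> Dict[str, str]:
    """Return {wiki id: article title} for the requested wikis."""
    sitelinks = entity.get("sitelinks", {})
    return {w: sitelinks[w]["title"] for w in wikis if w in sitelinks}


def language_pairs(
    lines: Iterable[str], source: str = "enwiki", target: str = "itwiki"
) -> Iterator[Tuple[str, str, str]]:
    """Yield (entity id, source title, target title) for every entity
    that has an article in both wikis."""
    for entity in iter_entities(lines):
        titles = sitelink_titles(entity, (source, target))
        if source in titles and target in titles:
            yield entity["id"], titles[source], titles[target]


def pairs_from_dump(path: str) -> Iterator[Tuple[str, str, str]]:
    """Stream pairs from a bz2-compressed dump without loading it all.

    `path` is an assumed local file name, e.g. a dump downloaded from [1].
    """
    with bz2.open(path, "rt", encoding="utf-8") as dump:
        yield from language_pairs(dump)
```

Iterating over `pairs_from_dump("latest-all.json.bz2")` (the path being wherever you saved the dump) would then give rows pairing, for instance, the English title "Dresden" with the Italian title "Dresda". The titles are the current page titles as stored in Wikidata, so redirects on the Wikipedia side still have to be handled separately.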

[1] https://dumps.wikimedia.org/wikidatawiki/entities/
[2] https://www.wikidata.org/wiki/Q7891
