
Getting readable results from Wikidata

Posted 2020-01-31 02:31

Question:

Ok so I'm trying to get information from Wikidata about movies, take this movie for example: https://www.wikidata.org/wiki/Q24871

On the page the data is clearly displayed in a readable format. However, when you try to extract it via the API you get this: https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q24871

Here is a section from it:

"P272": [
                {
                    "id": "q24871$4721C959-0FCF-49D4-9265-E4FAC217CB6E",
                    "mainsnak": {
                        "snaktype": "value",
                        "property": "P272",
                        "datatype": "wikibase-item",
                        "datavalue": {
                            "value": {
                                "entity-type": "item",
                                "numeric-id": 775450
                            },
                            "type": "wikibase-entityid"
                        }
                    },
                    "type": "statement",
                    "rank": "normal"
                },
                {
                    "id": "q24871$31777445-1068-4C38-9B4B-96362577C442",
                    "mainsnak": {
                        "snaktype": "value",
                        "property": "P272",
                        "datatype": "wikibase-item",
                        "datavalue": {
                            "value": {
                                "entity-type": "item",
                                "numeric-id": 3041294
                            },
                            "type": "wikibase-entityid"
                        }
                    },
                    "type": "statement",
                    "rank": "normal"
                },
                {
                    "id": "q24871$08009F7A-8E54-48C3-92D9-75DEF4CF3E8D",
                    "mainsnak": {
                        "snaktype": "value",
                        "property": "P272",
                        "datatype": "wikibase-item",
                        "datavalue": {
                            "value": {
                                "entity-type": "item",
                                "numeric-id": 646968
                            },
                            "type": "wikibase-entityid"
                        }
                    },
                    "type": "statement",
                    "rank": "normal"
                },
                {
                    "id": "q24871$CA53B5EB-1041-4701-A36E-7C348FAC984E",
                    "mainsnak": {
                        "snaktype": "value",
                        "property": "P272",
                        "datatype": "wikibase-item",
                        "datavalue": {
                            "value": {
                                "entity-type": "item",
                                "numeric-id": 434841
                            },
                            "type": "wikibase-entityid"
                        }
                    },
                    "type": "statement",
                    "rank": "normal",
                    "references": [
                        {
                            "hash": "50f57a3dbac4708ce4ae4a827c0afac7fcdb4a5c",
                            "snaks": {
                                "P143": [
                                    {
                                        "snaktype": "value",
                                        "property": "P143",
                                        "datatype": "wikibase-item",
                                        "datavalue": {
                                            "value": {
                                                "entity-type": "item",
                                                "numeric-id": 11920
                                            },
                                            "type": "wikibase-entityid"
                                        }
                                    }
                                ]
                            },
                            "snaks-order": [
                                "P143"
                            ]
                        }
                    ]
                }
            ],

The problem is I'm not sure how to convert sections like that into readable text. I get that the API links an item to its property values using unique IDs, but I'm still stuck.

Is this actually possible at present or am I barking up the wrong tree?

Answer 1:

What you should be looking for is the numeric-id in each statement. Add a leading Q to recover your Wikidata ids, which should give ['Q775450', 'Q3041294', 'Q646968', 'Q434841', 'Q11920'] (the last one comes from the reference attached to the fourth statement).

[update: you can now directly access the Q id at mainsnak.datavalue.value.id, instead of having to build it from the numeric-id]
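A minimal Python sketch of that extraction step, assuming the statements have already been loaded from the JSON as a list of dicts. The function name extract_item_ids is made up for illustration, and the sample data is a trimmed copy of the P272 statements from the question. It prefers the newer id field and falls back to numeric-id. It only reads each statement's mainsnak, so Q11920, which sits inside a reference, is not returned.

```python
def extract_item_ids(statements):
    """Return the Q ids found in the main value of each statement."""
    ids = []
    for statement in statements:
        mainsnak = statement.get("mainsnak", {})
        # "novalue" and "somevalue" snaks carry no datavalue at all.
        if mainsnak.get("snaktype") != "value":
            continue
        datavalue = mainsnak.get("datavalue", {})
        if datavalue.get("type") != "wikibase-entityid":
            continue
        value = datavalue["value"]
        # This sketch only handles items, not properties or other entities.
        if value.get("entity-type") != "item":
            continue
        # Newer output carries the full id; older output only the number.
        ids.append(value.get("id") or "Q{}".format(value["numeric-id"]))
    return ids


def _statement(numeric_id):
    # Hypothetical helper: builds a trimmed statement like those in the question.
    return {
        "mainsnak": {
            "snaktype": "value",
            "property": "P272",
            "datavalue": {
                "value": {"entity-type": "item", "numeric-id": numeric_id},
                "type": "wikibase-entityid",
            },
        },
        "type": "statement",
        "rank": "normal",
    }


p272 = [_statement(n) for n in (775450, 3041294, 646968, 434841)]
print(extract_item_ids(p272))
# ['Q775450', 'Q3041294', 'Q646968', 'Q434841']
```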

This can be done with the simplifyClaims function of wikidata-sdk (a JS lib I developed).

Once you have those ids, you just need to request the entities' labels using the wbgetentities API: https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q775450|Q3041294|Q646968|Q434841|Q11920&format=json&props=labels

You can even get results for only some languages, using the languages parameter: https://www.wikidata.org/w/api.php?action=wbgetentities&ids=Q775450|Q3041294|Q646968|Q434841|Q11920&format=json&props=labels&languages=en|de|fr
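For illustration, the label lookup could be sketched in Python like this. The helper names build_labels_url and labels_by_id are hypothetical. The response shape assumed here (an "entities" object keyed by id, each with a "labels" object keyed by language code) is what wbgetentities returns with props=labels, but check it against a real response before relying on it. Note that urlencode writes the | separator as %7C, which the API accepts just the same.

```python
from urllib.parse import urlencode

API_URL = "https://www.wikidata.org/w/api.php"


def build_labels_url(ids, languages=None):
    """Build a wbgetentities URL that asks only for labels."""
    params = {
        "action": "wbgetentities",
        "ids": "|".join(ids),
        "format": "json",
        "props": "labels",
    }
    if languages:
        # Optional: restrict the labels to a few language codes.
        params["languages"] = "|".join(languages)
    return API_URL + "?" + urlencode(params)


def labels_by_id(response, language="en"):
    """Map each entity id to its label in one language (None if missing)."""
    labels = {}
    for entity_id, entity in response.get("entities", {}).items():
        label = entity.get("labels", {}).get(language)
        labels[entity_id] = label["value"] if label else None
    return labels


ids = ["Q775450", "Q3041294", "Q646968", "Q434841", "Q11920"]
print(build_labels_url(ids, languages=["en", "de", "fr"]))
```

Fetch the printed URL with any HTTP client, decode the JSON body, and pass the resulting dict to labels_by_id to get a plain id-to-label mapping.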



Answer 2:

Ok so I haven't found a solution using the "wbgetentities" system, but I have found that you can use the "parse" action to get the HTML structure.

https://www.wikidata.org/w/api.php?action=parse&page=Q24871

While it's still going to need some processing, it's much easier than the previous solution.
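As a rough idea of what that processing might look like, here is a Python sketch using only the standard library. All names (build_parse_url, html_from_parse_response, html_to_text_chunks) are made up for the example. It assumes you add format=json to the request and that the rendered HTML sits under parse.text["*"], which is the default JSON layout as far as I know. It only pulls out the visible text, so pairing property names with their values is still up to you.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode


def build_parse_url(page):
    """Build an action=parse URL that returns JSON instead of the HTML help view."""
    params = {"action": "parse", "page": page, "format": "json"}
    return "https://www.wikidata.org/w/api.php?" + urlencode(params)


def html_from_parse_response(response):
    # Assumption: default JSON format, where the HTML is under parse.text["*"].
    return response["parse"]["text"]["*"]


class TextExtractor(HTMLParser):
    """Collect visible text, ignoring <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)


def html_to_text_chunks(html):
    """Return the non-empty text fragments of an HTML string, in order."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks


print(build_parse_url("Q24871"))
```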



Answer 3:

I see an accepted answer, but I initially interpreted the question differently: as asking for the same output in JSON that one sees on the Wikidata item page.

SPARQL query with JSON output for above case: https://query.wikidata.org/sparql?query=SELECT%20%3FwdLabel%20%3Fps_Label%20%3FwdpqLabel%20%3Fpq_Label%20%7B%0A%20%20VALUES%20(%3Fcompany)%20%7B(wd%3AQ24871)%7D%0A%0A%20%20%3Fcompany%20%3Fp%20%3Fstatement%20.%0A%20%20%3Fstatement%20%3Fps%20%3Fps_%20.%0A%0A%20%20%3Fwd%20wikibase%3Aclaim%20%3Fp.%0A%20%20%3Fwd%20wikibase%3AstatementProperty%20%3Fps.%0A%0A%20%20OPTIONAL%20%7B%0A%20%20%3Fstatement%20%3Fpq%20%3Fpq_%20.%0A%20%20%3Fwdpq%20wikibase%3Aqualifier%20%3Fpq%20.%0A%20%20%7D%0A%0A%20%20SERVICE%20wikibase%3Alabel%20%7B%20bd%3AserviceParam%20wikibase%3Alanguage%20%22en%22%20%7D%0A%7D&format=json

I use the Wikidata Query Front End to get my query straight and to check the results, then use the </> Code button, which explains why you're seeing so much unnecessary whitespace in the URL above.
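For readability, here is the same query decoded from the URL above, wrapped in a small Python sketch that rebuilds the request URL and flattens the result. The function names build_sparql_url and rows_from_results are hypothetical. The flattening assumes the standard SPARQL JSON results layout (results.bindings, where each variable maps to an object with a "value" key).

```python
from urllib.parse import quote, urlencode

SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"

# The query from the URL above, percent-decoding undone.
QUERY = """SELECT ?wdLabel ?ps_Label ?wdpqLabel ?pq_Label {
  VALUES (?company) {(wd:Q24871)}

  ?company ?p ?statement .
  ?statement ?ps ?ps_ .

  ?wd wikibase:claim ?p.
  ?wd wikibase:statementProperty ?ps.

  OPTIONAL {
  ?statement ?pq ?pq_ .
  ?wdpq wikibase:qualifier ?pq .
  }

  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}"""


def build_sparql_url(query):
    """Percent-encode the query and ask the endpoint for JSON output."""
    # quote_via=quote encodes spaces as %20, as in the URL above.
    return SPARQL_ENDPOINT + "?" + urlencode(
        {"query": query, "format": "json"}, quote_via=quote
    )


def rows_from_results(results):
    """Turn SPARQL JSON bindings into plain {variable: value} dicts."""
    rows = []
    for binding in results["results"]["bindings"]:
        # Unbound OPTIONAL variables are simply absent from a binding.
        rows.append({name: cell["value"] for name, cell in binding.items()})
    return rows


print(build_sparql_url(QUERY))
```

Each row then carries the property label in wdLabel and the value label in ps_Label, plus wdpqLabel and pq_Label when the statement has a qualifier.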

See also:

  • wikidata get all properties with labels and values of an item
  • SPARQL query service - Interfacing