
Fiware Cygnus: no data have been persisted in CKAN

Posted 2019-07-16 19:41

Question:

I am trying to use Cygnus with CKAN, but no data is persisted in CKAN when the attribute is of type JSON. First, I send this information to Orion:

Accept: application/json
X-AUTH-TOKEN: <mytoken>
Fiware-Service: PapelClubDemo
Fiware-ServicePath: /events/leonliterario
{
    "contextElements": [
        {
            "type": "Events",
            "isPattern": "false",
            "id": "thisweek",
            "attributes": [
                {
                    "name": "schedule",
                    "type": "json",
                    "value": [{"title": "Presentación Viva Mi Gente","date": "2015-11-30","location": "Salón de actos del ICAL","url": "http://www.papel.club"}]
                }
            ]
        }
    ],
    "updateAction": "APPEND"
}

I have a subscription in Cygnus for this entity, and this is the information I receive in the Cygnus log:

01 Dec 2015 19:05:13,701 INFO [891993589@qtp-1988714671-0] (com.telefonica.iot.cygnus.handlers.OrionRestHandler.getEvents:232) - Received data ({"subscriptionId" : "565dd3497b72b7c7092d5a29", "originator" : "localhost", "contextResponses" : [ { "contextElement" : { "type" : "Events","isPattern" : "false", "id" : "thisweek004", "attributes" : [ { "name" : "schedule", "type" : "json", "value" : [ { "title" : "Presentación Viva MiGente", "date" : "2015-11-30", "location" : "Salón de actos del ICAL", "url" : "http://www.papel.club" }, { "title" : "Presentación Viva Mi Gente2","date" : "2015-11-30", "location" : "Salón de actos del ICAL", "url" : "http://www.papel.club" } ] } ] }, "statusCode" : { "code" : "200","reasonPhrase" : "OK" } } ]})
01 Dec 2015 19:05:13,702 INFO [891993589@qtp-1988714671-0] (com.telefonica.iot.cygnus.handlers.OrionRestHandler.getEvents:255) - Event put in the channel (id=2134043204, ttl=10)
01 Dec 2015 19:05:16,842 INFO [SinkRunner-PollingRunner-DefaultSinkProcessor] (com.telefonica.iot.cygnus.sinks.OrionCKANSink.persistOne:207) - [ckan-sink] Persisting data at OrionCKANSink (orgName=papelclubdemo, pkgName=papelclubdemo_events_leonliterario, resName=thisweek004_events,data=1448989513702,2015-12-01T17:05:13.702Z,thisweek004,Events,schedule,json,[{"title":"Presentación Viva MiGente","date":"2015-11-30","location":"Salón de actos del ICAL","url":"http://www.papel.club"},{"title":"Presentación Viva MiGente2","date":"2015-11-30","location":"Salón de actos del ICAL","url":"http://www.papel.club"}],[])
01 Dec 2015 19:05:19,479 ERROR [SinkRunner-PollingRunner-DefaultSinkProcessor] (com.telefonica.iot.cygnus.sinks.OrionSink.process:224) - Runtime error (Don't know how to treat response code 503)
01 Dec 2015 19:05:19,480 INFO [SinkRunner-PollingRunner-DefaultSinkProcessor] (com.telefonica.iot.cygnus.sinks.OrionSink.process:233) - Finishing transaction (1448984542-601-0000000018)

This is my Cygnus agent config:

# Flume handler that will parse the notifications, must not be changed
cygnusagent.sources.http-source.handler = com.telefonica.iot.cygnus.handlers.OrionRestHandler
# URL target
cygnusagent.sources.http-source.handler.notification_target = /notify
# Default service (service semantic depends on the persistence sink)
cygnusagent.sources.http-source.handler.default_service = def_serv
# Default service path (service path semantic depends on the persistence sink)
cygnusagent.sources.http-source.handler.default_service_path = def_servpath
# Number of channel re-injection retries before a Flume event is definitely discarded (-1 means infinite retries)
cygnusagent.sources.http-source.handler.events_ttl = 10
# Source interceptors, do not change
cygnusagent.sources.http-source.interceptors = ts gi
# TimestampInterceptor, do not change
cygnusagent.sources.http-source.interceptors.ts.type = timestamp
# GroupingInterceptor, do not change
cygnusagent.sources.http-source.interceptors.gi.type = com.telefonica.iot.cygnus.interceptors.GroupingInterceptor$Builder
# Grouping rules for the GroupingInterceptor, put the right absolute path to the file if necessary
# See the doc/design/interceptors document for more details
cygnusagent.sources.http-source.interceptors.gi.grouping_rules_conf_file = /usr/cygnus/conf/grouping_rules.conf

# ============================================
# OrionCKANSink configuration
# channel name from where to read notification events
cygnusagent.sinks.ckan-sink.channel = ckan-channel
# sink class, must not be changed
cygnusagent.sinks.ckan-sink.type = com.telefonica.iot.cygnus.sinks.OrionCKANSink
# the CKAN API key to use
#cygnusagent.sinks.ckan-sink.api_key = <mykey>
# the FQDN/IP address for the CKAN API endpoint
cygnusagent.sinks.ckan-sink.ckan_host = demo.ckan.org
# the port for the CKAN API endpoint
cygnusagent.sinks.ckan-sink.ckan_port = 80
# Orion URL used to compose the resource URL with the convenience operation URL to query it
cygnusagent.sinks.ckan-sink.orion_url = http://127.0.0.1:1026
# how the attributes are stored, either per row or per column (row, column)
cygnusagent.sinks.ckan-sink.attr_persistence = row
# enable SSL for secure Http transportation; 'true' or 'false'
cygnusagent.sinks.ckan-sink.ssl = false

When Cygnus persists data into demo.ckan.org, the organization, dataset and resource are created correctly, but the data are not loaded.

Answer 1:

That's because the row mode of OrionCKANSink automatically creates the resources and their associated datastores with every field typed as "text". Why? Because the types sent by Orion are not real types with semantics, just a description of what the user considers the attribute type to be. An Orion type can be "float", but it can equally be "float number with precision 4". The real data type therefore cannot be determined without spending a lot of time on heuristics that try to infer it. So the row mode has the advantage of creating the resources (and datastores) automatically, but the constraint is that all notified data must be strings.

If you need real JSON types in your CKAN datastore, then the recommended mode for Cygnus is the column mode.
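
As a rough sketch of what the column mode involves: to my understanding, Cygnus does not create the datastore for you in this mode, so the resource has to be provisioned beforehand with the column types you want, and `cygnusagent.sinks.ckan-sink.attr_persistence` has to be set to `column`. The Python snippet below only builds the body for CKAN's `datastore_create` action and sends nothing. The helper names, the `recvTime` and `<attribute>_md` column names, and the `<resource-id>` placeholder are assumptions for illustration; check them against the Cygnus documentation for your version.

```python
import json


def build_datastore_fields(json_attrs):
    """Return CKAN datastore field definitions for the given attribute names."""
    # Assumed layout: a reception timestamp column, then one value column
    # and one metadata column per attribute, all typed as "json".
    fields = [{"id": "recvTime", "type": "timestamp"}]
    for name in json_attrs:
        fields.append({"id": name, "type": "json"})
        fields.append({"id": name + "_md", "type": "json"})
    return fields


def build_datastore_create_body(resource_id, json_attrs):
    """Build the JSON body for a POST to /api/3/action/datastore_create."""
    return {
        "resource_id": resource_id,  # id of an already existing CKAN resource
        "fields": build_datastore_fields(json_attrs),
    }


# "<resource-id>" is a placeholder, not a real resource id.
body = build_datastore_create_body("<resource-id>", ["schedule"])
print(json.dumps(body, indent=2))
```

The resulting body would be posted to the CKAN instance with your API key in the `Authorization` header. With the `schedule` column typed as `json`, the datastore can hold the notified array as structured data instead of text.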

You can find a more detailed discussion of the different modes here