Merge multiple columns in bulkloader

I'm using App Engine's bulkloader to import a CSV file into my datastore. I have several columns that I want to merge into one. For example, they're all URLs, but not all of them are supplied, and there is an order of precedence, e.g.:

url_main
url_temp
url_test

I want to say: "OK, if url_main exists, use that; otherwise use url_test, and then url_temp."

Is it, therefore, possible to create a custom import transform that references columns and merges them into one based on conditions?

Tags: google-app-engine bulkloader

1 answer

劫难

#2 · 2019-08-26 22:38

Ok, so after reading https://developers.google.com/appengine/docs/python/tools/uploadingdata#Configuring_the_Bulk_Loader I learnt about import_transform and that this can use custom functions.

With that in mind, this pointed me the right way:

... a two-argument function with the keyword argument bulkload_state, which on return contains useful information about the entity: bulkload_state.current_entity, which is the current entity being processed; bulkload_state.current_dictionary, the current export dictionary ...

So, I created a function that takes two arguments: the first is the value of the column named by external_name, and the second is the bulkload_state, which lets me fetch the current row, like so:

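def check_url(value, bulkload_state):
    # The full row being imported, keyed by CSV column name.
    row = bulkload_state.current_dictionary
    # Columns in priority order: the first non-empty one wins.
    fields = ['Final URL', 'URL', 'Temporary URL']

    for field in fields:
        # row.get() also skips columns that are present but empty.
        if row.get(field):
            return row[field]

    return None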

All this does is grab the current row (bulkload_state.current_dictionary) and return the first of the URL fields it finds, in priority order; if none is present, it returns None.
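To check the priority logic without running a full import, something like the following sketch may help. It uses a version of check_url that also skips empty cells, since a CSV row usually carries every header as a key even when the cell is blank. FakeBulkloadState and the sample data are invented for illustration; the stand-in only mimics the current_dictionary attribute and is not part of the bulkloader API.

```python
import csv


def check_url(value, bulkload_state):
    """Return the first non-empty URL column, in priority order."""
    row = bulkload_state.current_dictionary
    for field in ('Final URL', 'URL', 'Temporary URL'):
        # .get() treats a missing column and an empty cell the same way.
        if row.get(field):
            return row[field]
    return None


class FakeBulkloadState(object):
    """Hypothetical stand-in that only exposes current_dictionary."""

    def __init__(self, row):
        self.current_dictionary = row


# Sample data made up for this sketch.
SAMPLE_CSV = (
    "Name,Final URL,URL,Temporary URL\n"
    "Acme,http://final.example,http://main.example,\n"
    "Bolt,,http://main.example,http://temp.example\n"
    "Cog,,,http://temp.example\n"
    "Dud,,,\n"
)

# DictReader accepts any iterable of lines, so no file is needed.
rows = list(csv.DictReader(SAMPLE_CSV.splitlines()))

# The first argument mirrors the external_name column ('URL'); it is ignored.
merged = [check_url(row['URL'], FakeBulkloadState(row)) for row in rows]

for row, url in zip(rows, merged):
    print('%s -> %s' % (row['Name'], url))
```

The last row has no URL in any column, so the merged value for it is None.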

In my bulkloader.yaml I call this function simply by setting:

- property: business_url
  external_name: URL
  import_transform: bulkloader_helper.check_url

Note: the external_name doesn't matter as long as it names a column that exists. I'm not actually using its value, since the function reads multiple columns from the row itself.

Simples!
