I am having some issues with App Engine's bulkloader. Below I have included my bulkloader.yaml, my hs_transformers.py, and the error log. Any idea what is generating this error? My post-import function in hs_transformers
works if I return a single entity (just the entity itself; a list with one entity in it also throws the error), but when I try to return a list of entities this error occurs. According to App Engine's documentation I should be able to return a list of entities.
.yaml file:
python_preamble:
- import: re
- import: base64
- import: google.appengine.ext.bulkload.transform
- import: google.appengine.ext.bulkload.bulkloader_wizard
- import: google.appengine.ext.db
- import: google.appengine.api.datastore
- import: hs_transformers
- import: datetime
transformers:
- kind: HBO
  connector: csv
  property_map:
    - property: __key__
      external_name: swfServerID
      import_transform: hs_transformers.string_null_converter
    - property: IP_address
      external_name: IP
      import_transform: hs_transformers.string_null_converter
    - property: name
      external_name: swfServer
      import_transform: hs_transformers.swf_server_converter
    - property: last_checkin_date
      external_name: clockStampOfLastCheckin
      import_transform: hs_transformers.clock_stamp_of_last_checkin_converter
    # - property: last_update
    #   external_name: clockStampOfLastUpdate
    #   import_transform: transform
    - property: form_factor
      external_name: formFactor
      import_transform: hs_transformers.string_null_converter
    - property: serial_number
      external_name: serialNumber
      import_transform: hs_transformers.string_null_converter
    - property: allow_reverse_SSH
      external_name: allowReverseSSH
      import_transform: hs_transformers.boolean_converter
    - property: insight_account
      external_name: FK_insightAccountID
      import_transform: hs_transformers.integer_converter
    - property: version
      external_name: ver
      import_transform: hs_transformers.string_null_converter
  post_import_function: hs_transformers.post_hbo
hs_transformers.py:
# Imports used by this function. The model module path is taken from the
# class names in the log output below (shared.datastore.Contact).
import logging
from google.appengine.ext import db
from shared.datastore import Contact, HBOContact

def post_hbo(input_dict, entity_instance, bulkload_state):
    return_entities = []
    # Allocate a fresh numeric ID for a new Contact entity.
    model_key = db.Key.from_path("Contact", 1)
    logging.error("MODEL KEY " + str(model_key))
    logging.error("MODEL KEY TYPE " + str(type(model_key)))
    keys = db.allocate_ids(model_key, 1)
    logging.error("KEYS " + str(keys))
    logging.error("KEYS TYPE " + str(type(keys)))
    id = keys[0]
    logging.error("ID " + str(id))
    logging.error("ID TYPE " + str(type(id)))
    contact_key = db.Key.from_path("Contact", id)
    logging.error("CONTACT KEY " + str(contact_key))
    logging.error("CONTACT KEY TYPE " + str(type(contact_key)))
    hbo_key = db.Key.from_path("HBO", input_dict["swfServerID"])
    logging.error("HBO KEY " + str(hbo_key))
    logging.error("HBO KEY TYPE " + str(type(hbo_key)))
    # Build the Contact and the HBO-to-Contact mapping entity.
    contact = Contact(key=contact_key)
    map = HBOContact()
    map.hbo = hbo_key
    map.contact = contact_key
    return_entities.append(contact)
    return_entities.append(map)
    logging.error("CONTACT KEY AGAIN? " + str(contact.key()))
    logging.error("CONTACT TYPE " + str(type(contact)))
    logging.error("MAP TYPE " + str(type(map)))
    logging.error("RETURN LIST " + str(return_entities))
    return return_entities
And lastly, the error log:
Microsoft Windows [Version 6.1.7600]
Copyright (c) 2009 Microsoft Corporation. All rights reserved.
C:\Users\Jack Frost>cd..
C:\Users>cd..
C:\>cd "Program Files (x86)"
C:\Program Files (x86)>cd "Google App Engine SDK"
C:\Program Files (x86)\Google App Engine SDK>python appcfg.py upload_data --url=http://bulkloader-testing.appspot.com/remote_api --config_file="C:\Users\Jack Frost\Eclipse Workspace\Headsprout\GAE 1.27.2012\src\utilities\bulkloader\bulkloader.yaml" --filename="C:\Users\Jack Frost\Eclipse Workspace\Headsprout\GAE 1.27.2012\src\utilities\bulkloader\csv_files\smallhbos.csv" --kind=HBO
Uploading data records.
[INFO ] Logging to bulkloader-log-20120131.160426
[INFO ] Throttling transfers:
[INFO ] Bandwidth: 250000 bytes/second
[INFO ] HTTP connections: 8/second
[INFO ] Entities inserted/fetched/modified: 20/second
[INFO ] Batch Size: 10
[INFO ] Opening database: bulkloader-progress-20120131.160426.sql3
[INFO ] Connecting to bulkloader-testing.appspot.com/remote_api
[INFO ] Starting import; maximum 10 entities per post
2012-01-31 16:04:27,135 ERROR hs_transformers.py:66 type object 'datetime.datetime' has no attribute 'datetime'
2012-01-31 16:04:27,137 ERROR hs_transformers.py:17 MODEL KEY ahRzfmJ1bGtsb2FkZXItdGVzdGluZ3INCxIHQ29udGFjdBgBDA
2012-01-31 16:04:27,138 ERROR hs_transformers.py:18 MODEL KEY TYPE <class 'google.appengine.api.datastore_types.Key'>
2012-01-31 16:04:27,461 ERROR hs_transformers.py:20 KEYS (16031L, 16031L)
2012-01-31 16:04:27,463 ERROR hs_transformers.py:21 KEYS TYPE <type 'tuple'>
2012-01-31 16:04:27,463 ERROR hs_transformers.py:23 ID 16031
2012-01-31 16:04:27,464 ERROR hs_transformers.py:24 ID TYPE <type 'long'>
2012-01-31 16:04:27,466 ERROR hs_transformers.py:27 CONTACT KEY ahRzfmJ1bGtsb2FkZXItdGVzdGluZ3IOCxIHQ29udGFjdBiffQw
2012-01-31 16:04:27,466 ERROR hs_transformers.py:28 CONTACT KEY TYPE <class 'google.appengine.api.datastore_types.Key'>
2012-01-31 16:04:27,467 ERROR hs_transformers.py:30 HBO KEY ahRzfmJ1bGtsb2FkZXItdGVzdGluZ3IKCxIDSEJPIgEzDA
2012-01-31 16:04:27,467 ERROR hs_transformers.py:31 HBO KEY TYPE <class 'google.appengine.api.datastore_types.Key'>
2012-01-31 16:04:27,469 ERROR hs_transformers.py:42 CONTACT KEY AGAIN? ahRzfmJ1bGtsb2FkZXItdGVzdGluZ3IOCxIHQ29udGFjdBiffQw
2012-01-31 16:04:27,469 ERROR hs_transformers.py:43 CONTACT TYPE <class 'shared.datastore.Contact'>
2012-01-31 16:04:27,470 ERROR hs_transformers.py:44 MAP TYPE <class 'shared.datastore.HBOContact'>
2012-01-31 16:04:27,470 ERROR hs_transformers.py:46 RETURN LIST [<shared.datastore.Contact object at 0x0000000003DBBB00>, <shared.datastore.HBOContact object at 0x0000000003DBBC18>]
[ERROR ] [WorkerThread-0] WorkerThread:
Traceback (most recent call last):
  File "C:\Program Files (x86)\Google App Engine SDK\google\appengine\tools\adaptive_thread_pool.py", line 176, in WorkOnItems
    status, instruction = item.PerformWork(self.__thread_pool)
  File "C:\Program Files (x86)\Google App Engine SDK\google\appengine\tools\bulkloader.py", line 764, in PerformWork
    transfer_time = self._TransferItem(thread_pool)
  File "C:\Program Files (x86)\Google App Engine SDK\google\appengine\tools\bulkloader.py", line 933, in _TransferItem
    self.content = self.request_manager.EncodeContent(self.rows)
  File "C:\Program Files (x86)\Google App Engine SDK\google\appengine\tools\bulkloader.py", line 1394, in EncodeContent
    entity = loader.create_entity(values, key_name=key, parent=parent)
  File "C:\Program Files (x86)\Google App Engine SDK\google\appengine\ext\bulkload\bulkloader_config.py", line 446, in create_entity
    self.__track_max_id(entity)
  File "C:\Program Files (x86)\Google App Engine SDK\google\appengine\ext\bulkload\bulkloader_config.py", line 420, in __track_max_id
    elif not entity.has_key():
AttributeError: 'list' object has no attribute 'has_key'
[INFO ] [WorkerThread-1] Backing off due to errors: 1.0 seconds
[INFO ] An error occurred. Shutting down...
[ERROR ] Error in WorkerThread-0: 'list' object has no attribute 'has_key'
[INFO ] 9 entities total, 0 previously transferred
[INFO ] 0 entities (2364 bytes) transferred in 1.5 seconds
[INFO ] Some entities not successfully transferred
I also thought I would paste what code.google.com has to say about the post_import_function:
post_import_function(input_dict, instance, bulkload_state_copy) functionName
Your function must return one of the following: None, which means to skip importing this record; a single entity (usually the instance argument that was passed in); or a list of multiple entities to be imported.
When I comment out all the code in my post_import_function and just write return None, I still get the same error, which contradicts the documentation at code.google.com:
http://code.google.com/appengine/docs/python/tools/uploadingdata.html
Glancing through the relevant code, this looks like a bug.
Your post-import function is run within dict_to_entity, which simply returns whatever your function returns. create_entity feeds whatever dict_to_entity returns into __track_max_id, which doesn't seem to properly account for a list or None. I'd suggest you file this as a bug in the App Engine issue tracker.
Note that you could fix this pretty easily in your local SDK: basically, change __track_max_id so that it handles a list or None as well as a single entity.
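Here is a minimal sketch of that idea, not the actual SDK source. It normalizes whatever the post-import function returned (None, one entity, or a list) into a list before touching has_key(). The names iter_entities and track_max_id, the increment_id callback, and the FakeKey and FakeEntity classes are hypothetical stand-ins so the snippet runs on its own; in the SDK the logic would live inside the loader's __track_max_id method and operate on real entities.

```python
def iter_entities(result):
    """Normalize a post-import result (None, entity, or list) into a list."""
    if result is None:
        return []
    if isinstance(result, (list, tuple)):
        return [entity for entity in result if entity is not None]
    return [result]


def track_max_id(result, increment_id):
    """Call increment_id(key) for every entity that carries a numeric ID.

    Hypothetical stand-in for the SDK's __track_max_id; increment_id is an
    assumed callback, not a documented SDK signature.
    """
    if not increment_id:
        return
    for entity in iter_entities(result):
        if not entity.has_key():
            continue  # No key assigned yet, nothing to track.
        key = entity.key()
        if key.id():
            increment_id(key)


# Hypothetical stubs that mimic only the methods used above.
class FakeKey(object):
    def __init__(self, numeric_id=None):
        self._id = numeric_id

    def id(self):
        return self._id


class FakeEntity(object):
    def __init__(self, key=None):
        self._key = key

    def has_key(self):
        return self._key is not None

    def key(self):
        return self._key


if __name__ == "__main__":
    seen = []
    # None, a single entity, and a list are all accepted without raising.
    track_max_id(None, seen.append)
    track_max_id(FakeEntity(FakeKey(16031)), seen.append)
    track_max_id([FakeEntity(FakeKey(7)), FakeEntity()], seen.append)
    print([key.id() for key in seen])
```

The point of the design is that the type check happens once, up front, so the per-entity logic never has to care which of the three documented return shapes it was given.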