When I run a Dataflow job, it takes my small package (via setup.py or requirements.txt) and uploads it to run on the Dataflow instances.
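For context, this is roughly how I launch the job; the project, bucket, and file paths below are placeholders rather than my real values:

    # Rough sketch of how the job gets launched; project/bucket/paths are
    # placeholders, not my real values.
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    options = PipelineOptions([
        '--runner=DataflowRunner',
        '--project=my-project',                # placeholder
        '--temp_location=gs://my-bucket/tmp',  # placeholder
        '--setup_file=./setup.py',             # my package is staged from here
        # '--requirements_file=./requirements.txt',  # or this, for plain pip deps
    ])

    with beam.Pipeline(options=options) as p:
        _ = (p
             | beam.Create([u'hello'])
             | beam.Map(lambda x: x))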
But what is actually running on those Dataflow instances? I got this stack trace recently:
File "/usr/lib/python2.7/httplib.py", line 1073, in _send_request
self.endheaders(body)
File "/usr/lib/python2.7/httplib.py", line 1035, in endheaders
self._send_output(message_body)
File "/usr/lib/python2.7/httplib.py", line 877, in _send_output
msg += message_body
TypeError: must be str, not unicode
[while running 'write to datastore/Convert to Mutation']
But in theory, if I'm doing str += unicode, doesn't that imply I might not be running this Python patch? Can you point me to the Docker image that these jobs run on, so I can know which version of Python I'm working with and make sure I'm not barking up the wrong tree here?
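In the meantime, the only way I can think of to check the interpreter version is to log it from inside the pipeline so it shows up in the worker logs (a quick sketch, not something the docs prescribe; step names and element values are arbitrary):

    # Sketch: log the worker's Python version from inside a pipeline step so
    # it appears in the Dataflow worker logs.
    import logging
    import sys

    import apache_beam as beam

    def log_python_version(element):
        # This runs on the Dataflow worker, so sys.version is the worker's Python.
        logging.info('worker python: %s', sys.version)
        return element

    # ... then somewhere in the pipeline:
    # p | beam.Create([None]) | 'log version' >> beam.Map(log_python_version)

That tells me the interpreter version, but not which image or patch level the worker is actually built from.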
The Cloud Console shows me the instance template, which seems to point to dataflow-dataflow-owned-resource-20170308-rc02, but I apparently don't have permission to view it. Is the source for it available online anywhere?