How to read a utf-8 encoded binary string in tenso

I am trying to convert an encoded byte string back into the original array in the tensorflow graph (using tensorflow operations) in order to make a prediction in a tensorflow model. The array to byte conversion is based on this answer and it is the suggested input to tensorflow model prediction on google cloud's ml-engine.

def array_request_example(input_array):
    input_array = input_array.astype(np.float32)
    byte_string = input_array.tostring()
    string_encoded_contents = base64.b64encode(byte_string)
    return string_encoded_contents.decode('utf-8')}

Tensorflow code

byte_string = tf.placeholder(dtype=tf.string)
audio_samples = tf.decode_raw(byte_string, tf.float32)

audio_array = np.array([1, 2, 3, 4])
bstring = array_request_example(audio_array)
fdict = {byte_string: bstring}
with tf.Session() as sess:
    [tf_samples] = sess.run([audio_samples], feed_dict=fdict)

I have tried using decode_raw and decode_base64 but neither return the original values.

I have tried setting the the out_type of decode raw to the different possible datatypes and tried altering what data type I am converting the original array to.

So, how would I read the byte array in tensorflow? Thanks :)

Extra Info

The aim behind this is to create the serving input function for a custom Estimator to make predictions using gcloud ml-engine local predict (for testing) and using the REST API for the model stored on the cloud.

The serving input function for the Estimator is

def serving_input_fn():
    feature_placeholders = {'b64': tf.placeholder(dtype=tf.string,
                                                  shape=[None],
                                                  name='source')}
    audio_samples = tf.decode_raw(feature_placeholders['b64'], tf.float32)
    # Dummy function to save space
    power_spectrogram = create_spectrogram_from_audio(audio_samples)
    inputs = {'spectrogram': power_spectrogram}
    return tf.estimator.export.ServingInputReceiver(inputs, feature_placeholders)

Json request

I use .decode('utf-8') because when attempting to json dump the base64 encoded byte strings I receive this error

raise TypeError(repr(o) + " is not JSON serializable")
TypeError: b'longbytestring'

Prediction Errors

When passing the json request {'audio_bytes': 'b64': bytestring} with gcloud local I get the error

PredictionError: Invalid inputs: Expected tensor name: b64, got tensor name: [u'audio_bytes']

So perhaps google cloud local predict does not automatically handle the audio bytes and base64 conversion? Or likely somethings wrong with my Estimator setup.

And the request {'instances': [{'audio_bytes': 'b64': bytestring}]} to REST API gives

{'error': 'Prediction failed: Error during model execution: AbortionError(code=StatusCode.INVALID_ARGUMENT, details="Input to DecodeRaw has length 793713 that is not a multiple of 4, the size of float\n\t [[Node: DecodeRaw = DecodeRaw[_output_shapes=[[?,?]], little_endian=true, out_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_source_0_0)]]")'}

which confuses me as I explicitly define the request to be a float and do the same in the serving input receiver.

Removing audio_bytes from the request and utf-8 encoding the byte strings allows me to get predictions, though in testing the decoding locally, I think the audio is being incorrectly converted from the byte string.

The answer that you referenced, is written assuming you are running the model on CloudML Engine's service. The service actually takes care of the JSON (including UTF-8) and base64 encoding.

To get your code working locally or in another environment, you'll need the following changes:

def array_request_example(input_array):
    input_array = input_array.astype(np.float32)
    return input_array.tostring()

byte_string = tf.placeholder(dtype=tf.string)
audio_samples = tf.decode_raw(byte_string, tf.float32)

audio_array = np.array([1, 2, 3, 4])
bstring = array_request_example(audio_array)
fdict = {byte_string: bstring}
with tf.Session() as sess:
    tf_samples = sess.run([audio_samples], feed_dict=fdict)

That said, based on your code, I suspect you are looking to send data as JSON; you can use gcloud local predict to simulate CloudML Engine's service. Or, if you prefer to write your own code, perhaps something like this:

def array_request_examples,(input_arrays):
  """input_arrays is a list (batch) of np_arrays)"""
  input_arrays = (a.astype(np.float32) for a in input_arrays)
  # Convert each image to byte strings
  bytes_strings = (a.tostring() for a in input_arrays)
  # Base64 encode the data
  encoded = (base64.b64encode(b) for b in bytes_strings)
  # Create a list of images suitable to send to the service as JSON:
  instances = [{'audio_bytes': {'b64': e}} for e in encoded]
  # Create a JSON request
  return json.dumps({'instances': instances})

def parse_request(request):
  # non-TF to simulate the CloudML Service which does not expect
  # this to be in the submitted graphs.
  instances = json.loads(request)['instances']
  return [base64.b64decode(i['audio_bytes']['b64']) for i in instances]

byte_strings = tf.placeholder(dtype=tf.string, shape=[None])
decode = lambda raw_byte_str: tf.decode_raw(raw_byte_str, tf.float32)
audio_samples = tf.map_fn(decode, byte_strings, dtype=tf.float32)

audio_array = np.array([1, 2, 3, 4])
request = array_request_examples([audio_array])
fdict = {byte_strings: parse_request(request)}
with tf.Session() as sess:
  tf_samples = sess.run([audio_samples], feed_dict=fdict)