I am new to TensorFlow and I have a problem processing some files. Here is an excerpt of the code.
xlabel_to_files_list_map['dog_bark'] # subset of data with two files
# result
['gs://some_bucket/some_dir/data/dog_bark/100652.mp3', 'gs://some_bucket/some_dir/dog_bark/100795.mp3']
Here is where I am simply trying to get these strings to be processed by a simple graph:
import tensorflow as tf

file_to_process = tf.placeholder(tf.string)
audio_binary_remote = tf.gfile.Open(file_to_process, 'rb').read()
waveform = tf.contrib.ffmpeg.decode_audio(audio_binary_remote, file_format='mp3', samples_per_second=44100, channel_count=2)
with tf.Session() as sess:
    result = sess.run(waveform, feed_dict={
        file_to_process: xlabel_to_files_list_map['dog_bark']
    })
    #print (result)
This results in
TypeError: Expected binary or unicode string, got <tf.Tensor 'Placeholder_9:0' shape=<unknown> dtype=string>
FWIW, this works
a_string = tf.placeholder(tf.string)
z = a_string
with tf.Session() as sess:
    result = sess.run(z, feed_dict={a_string: ['one', 'two', 'three']})
    print(result)
This results in
['one' 'two' 'three']
The simple example that works feeds a list of strings. The more complex example feeds the value from the map, which is also a list of strings. I don't understand why the first (decode_audio) snippet doesn't behave like this second one.
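For what it's worth, a quick type check along these lines (`example_map` here is just a stand-in for my real `xlabel_to_files_list_map`, which is built elsewhere in the notebook) suggests the two inputs are the same kind of object, a plain Python list of str:

# Stand-in for xlabel_to_files_list_map; the real map is built elsewhere in my notebook.
example_map = {
    'dog_bark': ['gs://some_bucket/some_dir/data/dog_bark/100652.mp3',
                 'gs://some_bucket/some_dir/dog_bark/100795.mp3'],
}

complex_input = example_map['dog_bark']   # the value fed in the failing decode_audio example
simple_input = ['one', 'two', 'three']    # the value fed in the working toy example

# Both are plain lists of str, so the data being fed looks the same in both cases.
print(type(complex_input), {type(x) for x in complex_input})
print(type(simple_input), {type(x) for x in simple_input})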
Another Approach
I also tried another way: build up a list of decode ops first and then run the whole list. This failed as well, but without raising an error; it simply gave empty results.
waveform_tensor_list = []
for a_file in dir_to_selected_files_list_map['gs://some_bucket/some_dir/dog_bark/']:
    #print (a_file)
    waveform = tf.contrib.ffmpeg.decode_audio(a_file, file_format='mp3', samples_per_second=44100, channel_count=2)
    waveform_tensor_list.append(waveform)
The output from this cell looks wrong to me immediately, even though it is in the proper form:
waveform_tensor_list
resulting in:
[<tf.Tensor 'DecodeAudioV2_7:0' shape=(?, 2) dtype=float32>,
<tf.Tensor 'DecodeAudioV2_8:0' shape=(?, 2) dtype=float32>,
stuff deleted,
<tf.Tensor 'DecodeAudioV2_22:0' shape=(?, 2) dtype=float32>,
<tf.Tensor 'DecodeAudioV2_23:0' shape=(?, 2) dtype=float32>]
Then, to evaluate the graph:
with tf.Session() as sess:
    result = sess.run(waveform_tensor_list)
    print (result)
Where the output from this cell is a list in which every element is an empty array:
[array([], shape=(0, 0), dtype=float32), array([], shape=(0, 0), dtype=float32), ..., array([], shape=(0, 0), dtype=float32)]
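In case it is relevant, a plain Python-level read like the sketch below (the hard-coded path is just an illustrative stand-in for one of the real bucket paths) is how I can check that a single file is readable at all, outside of any graph:

import tensorflow as tf

# Illustrative stand-in path; the real paths come from xlabel_to_files_list_map.
sample_path = 'gs://some_bucket/some_dir/dog_bark/100795.mp3'

# Plain Python-level read through tf.gfile (no graph or session involved here).
with tf.gfile.Open(sample_path, 'rb') as f:
    audio_bytes = f.read()

print(len(audio_bytes))  # number of bytes read from the bucket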