Keras - Autoencoder for Text Analysis

Posted 2020-03-06 19:48

Question:

I'm trying to build an autoencoder that takes text reviews and finds a lower-dimensional representation. I'm using Keras, and I want my loss function to compare the output of the autoencoder with the output of the embedding layer. Unfortunately, it gives me the error shown below. I'm fairly sure the problem is in my loss function, but I can't seem to resolve it.

Autoencoder

print X_train.shape
input_i = Input(shape=(200,))
embedding = Embedding(input_dim=weights.shape[0],output_dim=weights.shape[1],
                      weights=[weights])(input_i)
encoded_h1 = Dense(64, activation='tanh')(embedding)
encoded_h2 = Dense(32, activation='tanh')(encoded_h1)
encoded_h3 = Dense(16, activation='tanh')(encoded_h2)
encoded_h4 = Dense(8, activation='tanh')(encoded_h3)
encoded_h5 = Dense(4, activation='tanh')(encoded_h4)
latent = Dense(2, activation='tanh')(encoded_h5)
decoder_h1 = Dense(4, activation='tanh')(latent)
decoder_h2 = Dense(8, activation='tanh')(decoder_h1)
decoder_h3 = Dense(16, activation='tanh')(decoder_h2)
decoder_h4 = Dense(32, activation='tanh')(decoder_h3)
decoder_h5 = Dense(64, activation='tanh')(decoder_h4)

output = Dense(weights.shape[1], activation='tanh')(decoder_h5)

autoencoder = Model(input_i,output)
encoder = Model(input_i,latent)

print autoencoder.summary()

import keras.backend as K
import tensorflow as tf
def embedded_mse(x_true, e_pred):
    print output
    print embedding
    mse = K.mean(K.square(output - embedding))
    print mse

    return tf.Session().run(mse)
autoencoder.compile(optimizer='adadelta',
                    loss=embedded_mse)
autoencoder.fit(X_train,X_train,epochs=10,
                batch_size=256, validation_split=.1)

Output

(100000, 200)
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_47 (InputLayer)        (None, 200)               0         
_________________________________________________________________
embedding_31 (Embedding)     (None, 200, 100)          21833700  
_________________________________________________________________
dense_528 (Dense)            (None, 200, 64)           6464      
_________________________________________________________________
dense_529 (Dense)            (None, 200, 32)           2080      
_________________________________________________________________
dense_530 (Dense)            (None, 200, 16)           528       
_________________________________________________________________
dense_531 (Dense)            (None, 200, 8)            136       
_________________________________________________________________
dense_532 (Dense)            (None, 200, 4)            36        
_________________________________________________________________
dense_533 (Dense)            (None, 200, 2)            10        
_________________________________________________________________
dense_534 (Dense)            (None, 200, 4)            12        
_________________________________________________________________
dense_535 (Dense)            (None, 200, 8)            40        
_________________________________________________________________
dense_536 (Dense)            (None, 200, 16)           144       
_________________________________________________________________
dense_537 (Dense)            (None, 200, 32)           544       
_________________________________________________________________
dense_538 (Dense)            (None, 200, 64)           2112      
_________________________________________________________________
dense_539 (Dense)            (None, 200, 100)          6500      
=================================================================
Total params: 21,852,306
Trainable params: 21,852,306
Non-trainable params: 0
_________________________________________________________________
None
Tensor("dense_539/Tanh:0", shape=(?, 200, 100), dtype=float32)
Tensor("embedding_31/Gather:0", shape=(?, 200, 100), dtype=float32)
Tensor("loss_48/dense_539_loss/Mean:0", shape=(), dtype=float32)

Error

---------------------------------------------------------------------------
InvalidArgumentError                      Traceback (most recent call last)
<ipython-input-155-a18e0c32f59b> in <module>()
      1 autoencoder.compile(optimizer='adadelta',
----> 2                     loss=embedded_mse)
      3 autoencoder.fit(X_train,embedding,epochs=10,
      4                 batch_size=256, validation_split=.1)

/home/andrew/.local/lib/python2.7/site-packages/keras/engine/training.pyc in compile(self, optimizer, loss, metrics, loss_weights, sample_weight_mode, weighted_metrics, target_tensors, **kwargs)
    848                 with K.name_scope(self.output_names[i] + '_loss'):
    849                     output_loss = weighted_loss(y_true, y_pred,
--> 850                                                 sample_weight, mask)
    851                 if len(self.outputs) > 1:
    852                     self.metrics_tensors.append(output_loss)

/home/andrew/.local/lib/python2.7/site-packages/keras/engine/training.pyc in weighted(y_true, y_pred, weights, mask)
    448         """
    449         # score_array has ndim >= 2
--> 450         score_array = fn(y_true, y_pred)
    451         if mask is not None:
    452             # Cast the mask to floatX to avoid float64 upcasting in theano

<ipython-input-153-73211fc383a5> in embedded_mse(x_true, e_pred)
      7     print mse
      8 
----> 9     return tf.Session().run(mse)

/home/andrew/.local/lib/python2.7/site-packages/tensorflow/python/client/session.pyc in run(self, fetches, feed_dict, options, run_metadata)
    893     try:
    894       result = self._run(None, fetches, feed_dict, options_ptr,
--> 895                          run_metadata_ptr)
    896       if run_metadata:
    897         proto_data = tf_session.TF_GetBuffer(run_metadata_ptr)

/home/andrew/.local/lib/python2.7/site-packages/tensorflow/python/client/session.pyc in _run(self, handle, fetches, feed_dict, options, run_metadata)
   1122     if final_fetches or final_targets or (handle and feed_dict_tensor):
   1123       results = self._do_run(handle, final_targets, final_fetches,
-> 1124                              feed_dict_tensor, options, run_metadata)
   1125     else:
   1126       results = []

/home/andrew/.local/lib/python2.7/site-packages/tensorflow/python/client/session.pyc in _do_run(self, handle, target_list, fetch_list, feed_dict, options, run_metadata)
   1319     if handle is None:
   1320       return self._do_call(_run_fn, self._session, feeds, fetches, targets,
-> 1321                            options, run_metadata)
   1322     else:
   1323       return self._do_call(_prun_fn, self._session, handle, feeds, fetches)

/home/andrew/.local/lib/python2.7/site-packages/tensorflow/python/client/session.pyc in _do_call(self, fn, *args)
   1338         except KeyError:
   1339           pass
-> 1340       raise type(e)(node_def, op, message)
   1341 
   1342   def _extend_graph(self):

InvalidArgumentError: You must feed a value for placeholder tensor 'input_47' with dtype float and shape [?,200]
     [[Node: input_47 = Placeholder[dtype=DT_FLOAT, shape=[?,200], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

Caused by op u'input_47', defined at:
  File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
    "__main__", fname, loader, pkg_name)
  File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
    exec code in run_globals
  File "/home/andrew/.local/lib/python2.7/site-packages/ipykernel_launcher.py", line 16, in <module>
    app.launch_new_instance()
  File "/home/andrew/.local/lib/python2.7/site-packages/traitlets/config/application.py", line 658, in launch_instance
    app.start()
  File "/home/andrew/.local/lib/python2.7/site-packages/ipykernel/kernelapp.py", line 477, in start
    ioloop.IOLoop.instance().start()
  File "/home/andrew/.local/lib/python2.7/site-packages/zmq/eventloop/ioloop.py", line 177, in start
    super(ZMQIOLoop, self).start()
  File "/home/andrew/.local/lib/python2.7/site-packages/tornado/ioloop.py", line 888, in start
    handler_func(fd_obj, events)
  File "/home/andrew/.local/lib/python2.7/site-packages/tornado/stack_context.py", line 277, in null_wrapper
    return fn(*args, **kwargs)
  File "/home/andrew/.local/lib/python2.7/site-packages/zmq/eventloop/zmqstream.py", line 440, in _handle_events
    self._handle_recv()
  File "/home/andrew/.local/lib/python2.7/site-packages/zmq/eventloop/zmqstream.py", line 472, in _handle_recv
    self._run_callback(callback, msg)
  File "/home/andrew/.local/lib/python2.7/site-packages/zmq/eventloop/zmqstream.py", line 414, in _run_callback
    callback(*args, **kwargs)
  File "/home/andrew/.local/lib/python2.7/site-packages/tornado/stack_context.py", line 277, in null_wrapper
    return fn(*args, **kwargs)
  File "/home/andrew/.local/lib/python2.7/site-packages/ipykernel/kernelbase.py", line 283, in dispatcher
    return self.dispatch_shell(stream, msg)
  File "/home/andrew/.local/lib/python2.7/site-packages/ipykernel/kernelbase.py", line 235, in dispatch_shell
    handler(stream, idents, msg)
  File "/home/andrew/.local/lib/python2.7/site-packages/ipykernel/kernelbase.py", line 399, in execute_request
    user_expressions, allow_stdin)
  File "/home/andrew/.local/lib/python2.7/site-packages/ipykernel/ipkernel.py", line 196, in do_execute
    res = shell.run_cell(code, store_history=store_history, silent=silent)
  File "/home/andrew/.local/lib/python2.7/site-packages/ipykernel/zmqshell.py", line 533, in run_cell
    return super(ZMQInteractiveShell, self).run_cell(*args, **kwargs)
  File "/home/andrew/.local/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2718, in run_cell
    interactivity=interactivity, compiler=compiler, result=result)
  File "/home/andrew/.local/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2822, in run_ast_nodes
    if self.run_code(code, result):
  File "/home/andrew/.local/lib/python2.7/site-packages/IPython/core/interactiveshell.py", line 2882, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-152-7732fda181fc>", line 2, in <module>
    input_i = Input(shape=(200,))
  File "/home/andrew/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 1436, in Input
    input_tensor=tensor)
  File "/home/andrew/.local/lib/python2.7/site-packages/keras/legacy/interfaces.py", line 87, in wrapper
    return func(*args, **kwargs)
  File "/home/andrew/.local/lib/python2.7/site-packages/keras/engine/topology.py", line 1347, in __init__
    name=self.name)
  File "/home/andrew/.local/lib/python2.7/site-packages/keras/backend/tensorflow_backend.py", line 442, in placeholder
    x = tf.placeholder(dtype, shape=shape, name=name)
  File "/home/andrew/.local/lib/python2.7/site-packages/tensorflow/python/ops/array_ops.py", line 1548, in placeholder
    return gen_array_ops._placeholder(dtype=dtype, shape=shape, name=name)
  File "/home/andrew/.local/lib/python2.7/site-packages/tensorflow/python/ops/gen_array_ops.py", line 2094, in _placeholder
    name=name)
  File "/home/andrew/.local/lib/python2.7/site-packages/tensorflow/python/framework/op_def_library.py", line 767, in apply_op
    op_def=op_def)
  File "/home/andrew/.local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 2630, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/home/andrew/.local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1204, in __init__
    self._traceback = self._graph._extract_stack()  # pylint: disable=protected-access

InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'input_47' with dtype float and shape [?,200]
     [[Node: input_47 = Placeholder[dtype=DT_FLOAT, shape=[?,200], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]

Answer 1:

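Your question leaves a few things unclear (for example, what is weights, which appears in the arguments of the Embedding and final Dense layers?). Still, I think a simpler approach is to separate the embedding and autoencoding parts, which are independent: first build a simple embedding model, then use its outputs (obtained with predict) to feed your autoencoder. This way you don't have to define a custom loss. As for the error itself, it is raised by the tf.Session().run(mse) call inside the loss: Keras calls the loss function once, at compile time, while it is still building the graph, so nothing has been fed to the input placeholder yet. A Keras loss has to return a symbolic tensor, not an evaluated value (BTW, print statements in such functions are not a good idea either).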

Without knowing the details of your data, the following two models compile without errors:

Embedding model (a quick adaptation from the docs):

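from keras.models import Sequential
from keras.layers import Embedding

model = Sequential()
model.add(Embedding(1000, 64))  # (vocabulary size, embedding dimension)
model.compile('rmsprop', 'mse')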
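As a rough illustration of what this embedding step produces, the sketch below mimics it with plain NumPy. It assumes the embedding layer holds the question's pre-trained weights matrix, in which case the layer's output is just a row lookup in that matrix. The sizes and random data here are hypothetical stand-ins; judging by the summary in the question, the real matrix would be (218337, 100) and X_train (100000, 200).

```python
import numpy as np

# Hypothetical small sizes, chosen only for illustration.
vocab_size, embed_dim, seq_len, n_reviews = 50, 100, 200, 8

rng = np.random.RandomState(0)
# Stand-in for the pre-trained embedding matrix called `weights` in the question.
weights = rng.uniform(-1.0, 1.0, size=(vocab_size, embed_dim)).astype("float32")
# Stand-in for X_train: each review is a sequence of integer token ids.
X_train = rng.randint(0, vocab_size, size=(n_reviews, seq_len))

# An embedding lookup replaces every token id with the matching row of the matrix.
X_embedded = weights[X_train]
print(X_embedded.shape)  # (8, 200, 100)
```

The resulting array has the (samples, 200, 100) layout that the autoencoder input expects, one 100-dimensional vector per token.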

Autoencoder:

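from keras.models import Model
from keras.layers import Input, Dense

# One embedded review: 200 tokens x 100 embedding dimensions
input_i = Input(shape=(200,100))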
encoded_h1 = Dense(64, activation='tanh')(input_i)
encoded_h2 = Dense(32, activation='tanh')(encoded_h1)
encoded_h3 = Dense(16, activation='tanh')(encoded_h2)
encoded_h4 = Dense(8, activation='tanh')(encoded_h3)
encoded_h5 = Dense(4, activation='tanh')(encoded_h4)
latent = Dense(2, activation='tanh')(encoded_h5)
decoder_h1 = Dense(4, activation='tanh')(latent)
decoder_h2 = Dense(8, activation='tanh')(decoder_h1)
decoder_h3 = Dense(16, activation='tanh')(decoder_h2)
decoder_h4 = Dense(32, activation='tanh')(decoder_h3)
decoder_h5 = Dense(64, activation='tanh')(decoder_h4)

output = Dense(100, activation='tanh')(decoder_h5)

autoencoder = Model(input_i,output)

autoencoder.compile('adadelta','mse')
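One thing worth checking with this architecture: a Dense layer applied to a 3-D input acts on the last axis only, so the same weights are shared across all 200 token positions and each token vector is compressed on its own. The latent output therefore has shape (None, 200, 2), as in the question's summary, not a single 2-D code per review. The small sketch below (the helper name dense_params is made up for this example) recomputes the Dense parameter counts from the layer widths, which can be compared against autoencoder.summary().

```python
def dense_params(n_in, n_out):
    """Parameters of one Dense layer: an (n_in x n_out) kernel plus n_out biases."""
    return n_in * n_out + n_out

# Layer widths of the autoencoder: 100-d input, 2-d bottleneck, 100-d output.
widths = [100, 64, 32, 16, 8, 4, 2, 4, 8, 16, 32, 64, 100]

per_layer = [dense_params(a, b) for a, b in zip(widths, widths[1:])]
total = sum(per_layer)

print(per_layer)  # [6464, 2080, 528, 136, 36, 10, 12, 40, 144, 544, 2112, 6500]
print(total)      # 18606
```

These counts match the Dense rows in the summary printed in the question, and none of them depends on the sequence length of 200.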

After adapting the parameters of the above models to your case, this should work fine:

X_embedded = model.predict(X_train)
autoencoder.fit(X_embedded,X_embedded,epochs=10,
            batch_size=256, validation_split=.1)
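A practical caveat, offered as a sketch and not as part of the original answer: at the sizes shown in the question, the full embedded array would hold 100000 x 200 x 100 float32 values, roughly 8 GB. If that does not fit in memory, the embedded batches can be produced on the fly. The generator below is a hypothetical helper (embedded_batches is not a Keras function) and assumes the embedding is a plain lookup in a known weights matrix; the toy arrays are made up for the demonstration.

```python
import numpy as np

def embedded_batches(X, weights, batch_size=256):
    """Yield (inputs, targets) pairs forever; both are the same embedded batch."""
    n = len(X)
    while True:
        for start in range(0, n, batch_size):
            batch = weights[X[start:start + batch_size]]  # embedding lookup
            yield batch, batch  # an autoencoder reconstructs its own input

# Size of the full embedded array at the question's dimensions, in GB.
full_gb = 100000 * 200 * 100 * 4 / 1e9
print(full_gb)  # 8.0

# Toy data: a (vocab=4, dim=3) matrix and 3 "reviews" of 2 token ids each.
weights = np.arange(12, dtype="float32").reshape(4, 3)
X = np.array([[0, 1], [2, 3], [1, 1]])

gen = embedded_batches(X, weights, batch_size=2)
inputs, targets = next(gen)
print(inputs.shape)  # (2, 2, 3)

# Number of batches that make up one pass over the data.
steps_per_epoch = int(np.ceil(len(X) / 2.0))
print(steps_per_epoch)  # 2
```

In Keras 2.x a generator like this would typically be passed to autoencoder.fit_generator together with steps_per_epoch. Note that validation_split does not apply to generator input, so a validation set would have to be held out separately.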