I have a dataset of 65668 files.
I am using Keras for a CNN, and these are my layers:
embedding_layer = Embedding(len(word_index) + 1,
                            EMBEDDING_DIM,
                            weights=[embedding_matrix],
                            input_length=MAX_SEQUENCE_LENGTH,
                            trainable=True)
sequence_input = Input(shape=(MAX_SEQUENCE_LENGTH,), dtype='int32')
embedded_sequences = embedding_layer(sequence_input)
x = Conv1D(128, 5, activation='relu')(embedded_sequences)
x = MaxPooling1D(5)(x)
x = Conv1D(256, 5, activation='relu')(x)
x = MaxPooling1D(5)(x)
x = Flatten()(x)
x = Dense(128, activation='relu')(x)
preds = Dense(len(labels_index), activation='softmax')(x)
The embedding layer is initialized with pre-trained GloVe (glove.6B.100d) vectors.
Fitting the data:
model.fit(x_train, y_train, validation_data=(x_val, y_val),
          epochs=20, batch_size=128)
The MAX_SEQUENCE_LENGTH is 500.
I am training on the GPU, an Nvidia GeForce 940MX, and I get the following error as part of the stack trace:
Resource exhausted: OOM when allocating tensor with shape[15318793,100] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
I tried reducing the batch size to 16, even 8, and I still get the same error. What could the issue be?
The problem lies in your Embedding layer. It needs to allocate a matrix of size 15318793 * 100 * 4 bytes ≈ 5.7 GB, which is definitely more than the GeForce 940MX's memory. There are a few ways you could overcome this issue:
Decrease the vocabulary/corpus size: try taking, e.g., the 1M most frequent words instead of the full word set. This will drastically decrease the embedding matrix size.
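For instance, a minimal sketch using the standard Keras Tokenizer workflow (here texts stands for your raw corpus and embeddings_index for a word-to-GloVe-vector dict; both are assumed from your preprocessing, not shown in the question):

import numpy as np
from keras.preprocessing.text import Tokenizer

MAX_NB_WORDS = 1000000  # keep only the 1M most frequent words
tokenizer = Tokenizer(num_words=MAX_NB_WORDS)
tokenizer.fit_on_texts(texts)
sequences = tokenizer.texts_to_sequences(texts)  # rarer words are simply dropped

# the embedding matrix shrinks from 15318793 rows to at most MAX_NB_WORDS + 1
num_words = min(MAX_NB_WORDS, len(tokenizer.word_index)) + 1
embedding_matrix = np.zeros((num_words, EMBEDDING_DIM))
for word, i in tokenizer.word_index.items():
    if i >= num_words:
        continue
    vector = embeddings_index.get(word)
    if vector is not None:
        embedding_matrix[i] = vector  # words not in GloVe stay all-zeros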
Use a generator instead of the Embedding layer: rather than an Embedding layer, you could use a generator that transforms your index sequences into sequences of word vectors on the fly.
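A hedged sketch of what that could look like (the name embedded_batches and the batching logic are illustrative; it assumes x_train holds padded index sequences and embedding_matrix stays in CPU memory as a NumPy array):

import numpy as np

def embedded_batches(index_sequences, labels, batch_size=128):
    while True:  # Keras expects fit generators to loop indefinitely
        for i in range(0, len(index_sequences), batch_size):
            batch = index_sequences[i:i + batch_size]
            # NumPy fancy indexing maps (batch, MAX_SEQUENCE_LENGTH) indices
            # to (batch, MAX_SEQUENCE_LENGTH, EMBEDDING_DIM) word vectors
            yield embedding_matrix[batch], labels[i:i + batch_size]

# the model then starts from pre-embedded input, with no Embedding layer:
sequence_input = Input(shape=(MAX_SEQUENCE_LENGTH, EMBEDDING_DIM))
x = Conv1D(128, 5, activation='relu')(sequence_input)
# ... rest of the network unchanged ...
model.fit_generator(embedded_batches(x_train, y_train, 128),
                    steps_per_epoch=len(x_train) // 128,
                    epochs=20)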
Use a linear transformation of the Embedding instead of retraining your embedding: since you mentioned that setting trainable=False made your algorithm work, you can keep trainable=False and add:
Dense(new_embedding_size, activation='linear')(embedding)
to train a new embedding on top of the existing one.
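Put together, that option might look like this (a sketch; new_embedding_size is a value of your choosing, and on older Keras versions the Dense may need to be wrapped in TimeDistributed, as in the answer further below):

embedding_layer = Embedding(len(word_index) + 1,
                            EMBEDDING_DIM,
                            weights=[embedding_matrix],
                            input_length=MAX_SEQUENCE_LENGTH,
                            trainable=False)  # frozen pre-trained GloVe weights
sequence_input = Input(shape=(MAX_SEQUENCE_LENGTH,), dtype='int32')
embedding = embedding_layer(sequence_input)
# trainable linear projection on top of the frozen embeddings;
# Dense applied to a 3D tensor acts on the last axis
x = Dense(new_embedding_size, activation='linear')(embedding)
x = Conv1D(128, 5, activation='relu')(x)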
Change device: if you have plenty of RAM, you can try the following strategy:
with tf.device('/cpu:0'):
    embedding_layer = Embedding(len(word_index) + 1,
                                EMBEDDING_DIM,
                                weights=[embedding_matrix],
                                input_length=MAX_SEQUENCE_LENGTH,
                                trainable=True)
    sequence_input = Input(shape=(MAX_SEQUENCE_LENGTH,), dtype='int32')
    embedded_sequences = embedding_layer(sequence_input)
In this design, the computations of the Embedding layer are carried out on the CPU using main RAM. The downside is that the transfer between RAM and GPU memory might be really slow.
It is very unlikely that your dataset is large enough to cover all the words in the GloVe embeddings. If you make the embedding trainable, training will only update the fraction of embeddings your data actually touches: those will move to a slightly different space, while the untouched ones remain in the original GloVe space. Try setting trainable=False and fix the problem by performing a linear transformation instead, such as:
sequence_input = Input(shape=(MAX_SEQUENCE_LENGTH,), dtype='int32')
embedded_sequences = embedding_layer(sequence_input)
x = TimeDistributed(Dense(EMBEDDING_DIM))(embedded_sequences)
x = Conv1D(128, 5, activation='relu')(x)
as the other answer suggested.
This is important because if you use the model for inference in production, one of the untouched embeddings could make it behave quite erratically. The linear transformation shifts the whole embedding space uniformly, so it should ideally generalize acceptably to unseen data.