I've recently reviewed an interesting implementation for convolutional text classification. However, all the TensorFlow code I've reviewed uses random (not pre-trained) embedding vectors, like the following:
```python
with tf.device('/cpu:0'), tf.name_scope("embedding"):
    W = tf.Variable(
        tf.random_uniform([vocab_size, embedding_size], -1.0, 1.0),
        name="W")
    self.embedded_chars = tf.nn.embedding_lookup(W, self.input_x)
    self.embedded_chars_expanded = tf.expand_dims(self.embedded_chars, -1)
```
Does anybody know how to use the results of Word2vec or a GloVe pre-trained word embedding instead of a random one?
I use this method to load and share embeddings.
The answer of @mrry is not right, because it causes the embedding weights to be overwritten each time the network is run: if you are following a minibatch approach to train your network, you keep overwriting the weights of the embeddings. So, in my view, the right way to use pre-trained embeddings is:
I was also facing this embedding issue, so I wrote a detailed tutorial with a dataset. Here I would like to add what I tried; you can also try this method.

Here is a working, detailed IPython tutorial example if you want to understand it from scratch; take a look.
There are a few ways that you can use a pre-trained embedding in TensorFlow. Let's say that you have the embedding in a NumPy array called `embedding`, with `vocab_size` rows and `embedding_dim` columns, and you want to create a tensor `W` that can be used in a call to `tf.nn.embedding_lookup()`.

1. Simply create `W` as a `tf.constant()` that takes `embedding` as its value. This is the easiest approach, but it is not memory-efficient, because the value of a `tf.constant()` is stored multiple times in memory. Since `embedding` can be very large, you should only use this approach for toy examples.

2. Create `W` as a `tf.Variable` and initialize it from the NumPy array via a `tf.placeholder()`. This avoids storing a copy of `embedding` in the graph, but it does require enough memory to keep two copies of the matrix in memory at once (one for the NumPy array, and one for the `tf.Variable`). Note that I've assumed that you want to hold the embedding matrix constant during training, so `W` is created with `trainable=False`.

3. If the embedding was trained as part of another TensorFlow model, you can use a `tf.train.Saver` to load the value from the other model's checkpoint file. This means that the embedding matrix can bypass Python altogether. Create `W` as in option 2, then restore it from the other model's checkpoint file.