I found the tf.contrib.layers.embed_sequence() function in the latest TensorFlow examples, but it is not included in the main API documentation. I don't know why. Any explanation of how it works would be appreciated.
A main reason why tensorflow.contrib.layers.embed_sequence is useful is that it can reduce the number of parameters in your network while preserving depth: for example, it eliminates the need for each of the LSTM's gates to perform its own linear projection of the input features. A rough parameter count is sketched below.
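A minimal sketch of the saving, assuming a vocabulary of 10,000 words, an embedding size of 128, and an LSTM with 512 hidden units (all illustrative numbers, not from the original answer):

```python
# Input-projection parameters: one-hot input vs. embedded input.
V, E, H = 10000, 128, 512  # vocab size, embedding dim, LSTM hidden size (assumptions)

one_hot_params = 4 * V * H           # each of the LSTM's 4 gates projects the V-dim one-hot input
embedded_params = V * E + 4 * E * H  # one shared embedding, then 4 gate projections of the E-dim input

print(one_hot_params)   # 20480000
print(embedded_params)  # 1542144
```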
[("garbage piles in the city","Garbage"), ("city is clogged with vehicles","Traffic")]
I want to take the first element of each tuple, which is a sequence of words. The words need to be embedded as vectors, and as a first step they should be converted to indices. In this case, the vocabulary might look like the following (the exact index assignment is illustrative):
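```python
vocab = {'garbage': 1, 'piles': 2, 'in': 3, 'the': 4, 'city': 5,
         'is': 6, 'clogged': 7, 'with': 8, 'vehicles': 9}
```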
With that vocabulary, the encoded text will look like this:
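```python
encoded = [[1, 2, 3, 4, 5],   # "garbage piles in the city"
           [5, 6, 7, 8, 9]]   # "city is clogged with vehicles"
```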
You pass this encoded text as features to the function in batches, as in the sketch below. Every word, represented by its index (1 to 5 for sent1, 5 to 9 for sent2), becomes embedded into a vector of size EMBEDDING_SIZE. If the batch size is 2 (i.e. 2 sequences in one batch) and EMBEDDING_SIZE is 10, the output will be a tensor of shape (2, 5, 10).
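A minimal sketch of the call (the vocab_size of 10 counts the 9 vocabulary words plus the unused index 0, and EMBEDDING_SIZE is set to 10 to match the shapes above):

```python
import tensorflow as tf

EMBEDDING_SIZE = 10

# One batch of 2 encoded sequences, 5 word indices each.
features = tf.constant([[1, 2, 3, 4, 5],
                        [5, 6, 7, 8, 9]])

# Maps every index to a trainable EMBEDDING_SIZE-dim vector.
embedded = tf.contrib.layers.embed_sequence(
    ids=features, vocab_size=10, embed_dim=EMBEDDING_SIZE)

print(embedded.shape)  # (2, 5, 10)
```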
Sample output: for sent1, a 5 x 10 matrix holding one 10-dimensional embedding vector per word (the actual values depend on the random initialization); sent2 is encoded similarly into another 5 x 10 matrix. Hope this is clear.
From the TF documentation: embed_sequence "maps a sequence of symbols to a sequence of embeddings", and its "typical use case would be reusing embeddings between an encoder and decoder".
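That use case can be sketched with the function's scope and reuse arguments; here encoder_ids and decoder_ids are assumed placeholder tensors of word indices, not names from the original post:

```python
# Share one embedding matrix between encoder and decoder (a sketch).
enc_emb = tf.contrib.layers.embed_sequence(
    encoder_ids, vocab_size=10, embed_dim=10, scope='embed')
dec_emb = tf.contrib.layers.embed_sequence(
    decoder_ids, vocab_size=10, embed_dim=10, scope='embed', reuse=True)
```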