implement N-hot encoding in tf.slim

Posted 2019-07-26 00:26

How can I implement N-hot encoding based on the positions of the 1 bits in a tf.int64? The input is a tensor containing several tf.int64 values. The N-hot encoding is meant to replace one-hot encoding in tf.slim.

The one-hot encoding is currently implemented as follows:

import numpy

def dense_to_one_hot(labels_dense, num_classes):
  """Convert class labels from scalars to one-hot vectors."""
  num_labels = labels_dense.shape[0]
  # Offset of each row within the flattened (num_labels, num_classes) array.
  index_offset = numpy.arange(num_labels) * num_classes
  labels_one_hot = numpy.zeros((num_labels, num_classes))
  # Set the column given by each label to 1, one entry per row.
  labels_one_hot.flat[index_offset + labels_dense.ravel()] = 1
  return labels_one_hot
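For example, a small usage sketch of the helper above:

labels = numpy.array([0, 2, 1])
print(dense_to_one_hot(labels, num_classes=3))
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]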

N-hot encoding here means unpacking an integer into its binary digits: 19 = 00010011 in 8-bit binary, so the encoded result is [0,0,0,1,0,0,1,1].
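For reference, a minimal NumPy sketch of the target behaviour (the fixed 8-bit width and the helper name dense_to_n_hot are illustrative assumptions, not existing code):

import numpy

def dense_to_n_hot(labels_dense, num_bits=8):
    # Bit masks [128, 64, ..., 2, 1], most significant bit first.
    bits = 1 << numpy.arange(num_bits)[::-1]
    return ((labels_dense.reshape(-1, 1) & bits) != 0).astype(numpy.int64)

print(dense_to_n_hot(numpy.array([19])))
# [[0 0 0 1 0 0 1 1]]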

2 Answers

Answer 1 · 2019-07-26 01:16

This is one solution:

import numpy as np
import tensorflow as tf

def n_hot_encoding(a, n):
    a = tf.convert_to_tensor(a)
    # Bit masks [2^(n-1), ..., 4, 2, 1], most significant bit first.
    m = 1 << np.arange(n)[::-1]
    # Reshape the masks to (1, ..., 1, n) so they broadcast over `a`.
    shape = np.r_[np.ones(len(a.shape), dtype=int), -1]
    m = m.reshape(shape)
    # Non-zero wherever the corresponding bit of `a` is set.
    hits = tf.bitwise.bitwise_and(a[..., tf.newaxis], tf.cast(m, a.dtype))
    return tf.not_equal(hits, 0)


with tf.Graph().as_default(), tf.Session() as sess:
    n_hot = n_hot_encoding([19, 20, 21], 10)
    print(sess.run(tf.cast(n_hot, tf.int32)))

Output:

[[0 0 0 0 0 1 0 0 1 1]
 [0 0 0 0 0 1 0 1 0 0]
 [0 0 0 0 0 1 0 1 0 1]]

It assumes that n is a regular Python scalar (not a TensorFlow value) and that the number of dimensions of the array to convert is known (the size of each dimension can be dynamic, but a.shape should not be just None). The function can be adapted to a TensorFlow-only computation like this:

import tensorflow as tf

def n_hot_encoding(a, n):
    a = tf.convert_to_tensor(a)
    n = tf.convert_to_tensor(n)
    # Bit masks [2^(n-1), ..., 4, 2, 1], most significant bit first.
    m = tf.bitwise.left_shift(1, tf.range(n)[::-1])
    # Target shape (1, ..., 1, n) so the masks broadcast over `a`.
    shape = tf.concat([tf.ones([tf.rank(a)], dtype=tf.int64), [-1]], axis=0)
    m = tf.reshape(m, shape)
    # Non-zero wherever the corresponding bit of `a` is set.
    hits = tf.bitwise.bitwise_and(a[..., tf.newaxis], tf.cast(m, a.dtype))
    return tf.not_equal(hits, 0)

This should work with any input, but may do a bit of extra work on every graph run.
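For illustration, a hedged usage sketch of this TensorFlow-only version (the placeholder setup is an assumption; TF 1.x session style as above):

with tf.Graph().as_default(), tf.Session() as sess:
    a = tf.placeholder(tf.int64, shape=[None])  # values to encode
    n = tf.placeholder(tf.int32, shape=[])      # bit width chosen at run time
    n_hot = n_hot_encoding(a, n)
    print(sess.run(tf.cast(n_hot, tf.int32),
                   feed_dict={a: [19, 20, 21], n: 8}))
    # [[0 0 0 1 0 0 1 1]
    #  [0 0 0 1 0 1 0 0]
    #  [0 0 0 1 0 1 0 1]]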

Answer 2 · 聊天终结者 · 2019-07-26 01:18

Below is an alternative to @jdehesa's great answer. This version computes the bit length N itself (but it only works on single-valued tensors, or on tensors whose values all have the same bit length):

import tensorflow as tf

def logn(x, n):
    # Log base n via the change-of-base formula.
    numerator = tf.log(x)
    denominator = tf.log(tf.cast(n, dtype=numerator.dtype))
    return numerator / denominator

def count_bits(x):
    # Bit length of x: floor(log2(x)) + 1 (assumes x > 0).
    return tf.cast(logn(tf.cast(x, dtype=tf.float32), 2) + 1, dtype=x.dtype)

def n_hot_encode(x):
    """
    Unpack an integer into its variable-length bit representation
    :param x: Int tensor of shape ()
    :return:  Bool tensor of shape (N,) with N = bit length of x
    """
    N = count_bits(x)
    bins = tf.bitwise.left_shift(1, tf.range(N))[::-1]
    x_unpacked = tf.reshape(tf.bitwise.bitwise_and(x, bins), [-1])
    x_bits = tf.cast(x_unpacked, dtype=tf.bool)
    return x_bits

with tf.Session() as sess:
    result = sess.run(n_hot_encode(tf.constant(19)))
    print(result)
    # > [ True False False  True  True]
    result = sess.run(n_hot_encode(tf.constant(255)))
    print(result)
    # > [ True  True  True  True  True  True  True  True]
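If a fixed width is required (as in the question's 8-bit example), one option is to left-pad the result; the helper n_hot_encode_fixed below is a hypothetical sketch, not part of the answer above (it assumes x fits in `width` bits):

def n_hot_encode_fixed(x, width):
    # Hypothetical helper: left-pad the variable-length encoding with
    # False up to `width` bits, so batches of encodings can be stacked.
    bits = n_hot_encode(x)
    pad = width - tf.shape(bits)[0]
    return tf.pad(bits, [[pad, 0]])

with tf.Session() as sess:
    print(sess.run(n_hot_encode_fixed(tf.constant(19), 8)))
    # > [False False False  True False False  True  True]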

Previous Answer:

Using tf.one_hot():

labels_one_hot = tf.one_hot(labels_dense, num_classes)
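A minimal runnable sketch (TF 1.x session style, matching the rest of the thread):

import tensorflow as tf

labels_dense = tf.constant([0, 2, 1])
labels_one_hot = tf.one_hot(labels_dense, depth=3)
with tf.Session() as sess:
    print(sess.run(labels_one_hot))
    # [[1. 0. 0.]
    #  [0. 0. 1.]
    #  [0. 1. 0.]]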