I'm using a neural network to classify a set of objects into n classes. Each object can belong to multiple classes at the same time (multi-class, multi-label).
For example:
The input is ['a','b','c'] and the labels are [1,0,0,1,1,1], where 1 means the object is in that class and 0 means it is not, so this is multi-label classification.
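To make the label format concrete, this is how I build the multi-hot target vector for one object (a minimal numpy sketch; the class indices are made up to match the example above):

import numpy as np

num_labels = 6
# hypothetical: indices of the classes this object belongs to
class_indices = [0, 3, 4, 5]

target = np.zeros(num_labels, dtype=np.float32)
target[class_indices] = 1.0
print(target)  # [1. 0. 0. 1. 1. 1.]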
I searched and found that people suggest using sigmoid instead of softmax for multi-label problems.
Some references:
Sentiment and genre classification
Reddit thread on multi-label classification
Excellent blog explaining the theory of multi-label classification
Another good tutorial on multi-label classification
After reading those, I understand that I should go with sigmoid, but I still have a few points of confusion:
For single-label classification, this is what I was doing. After the model (hidden layers with an LSTM), I built a fully connected layer:
# weights
Wo = tf.get_variable('Wo',
                     shape=[hdim * 2, num_labels],
                     dtype=tf.float32,
                     initializer=tf.random_uniform_initializer(-0.01, 0.01))
# bias
bo = tf.get_variable('bo',
                     shape=[num_labels],
                     dtype=tf.float32,
                     initializer=tf.random_uniform_initializer(-0.01, 0.01))
# logits from the concatenated forward/backward LSTM states
logits = tf.matmul(tf.concat([fsf.c, fsb.c], axis=-1), Wo) + bo
# probability distribution over classes
probs = tf.nn.softmax(logits)
# prediction = class with the highest probability
preds = tf.argmax(probs, axis=-1)
# cross entropy (sparse version: labels are integer class indices)
ce = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=logits, labels=labels)
# loss
loss = tf.reduce_mean(ce)
# accuracy
accuracy = tf.reduce_mean(
    tf.cast(
        tf.equal(tf.cast(preds, tf.int32), labels),
        tf.float32
    )
)
# training
trainop = tf.train.AdamOptimizer().minimize(loss)
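The way I understand why this doesn't carry over to multi-label: softmax squeezes all classes into one probability distribution, and argmax then commits to exactly one class per example. A quick plain-numpy sketch of what I mean, with made-up logits:

import numpy as np

# made-up logits for one example over 6 classes
logits = np.array([2.0, -1.0, 0.5, 1.5, 0.0, 1.0])

# softmax forces a single distribution over all classes
probs = np.exp(logits) / np.sum(np.exp(logits))
print(probs.sum())       # ~1.0 -- the class probabilities compete with each other
print(np.argmax(probs))  # 0    -- exactly one winning class per example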
But in the multi-label case, how should I handle the labels and outputs? Here is what I tried:
# weights
Wo = tf.get_variable('Wo',
                     shape=[hdim * 2, num_labels],
                     dtype=tf.float32,
                     initializer=tf.random_uniform_initializer(-0.01, 0.01))
# bias
bo = tf.get_variable('bo',
                     shape=[num_labels],
                     dtype=tf.float32,
                     initializer=tf.random_uniform_initializer(-0.01, 0.01))
# logits from the concatenated forward/backward LSTM states
logits = tf.matmul(tf.concat([fsf.c, fsb.c], axis=-1), Wo) + bo
# independent per-class probabilities (they need not sum to 1)
probs = tf.nn.sigmoid(logits)
# threshold each class at 0.5 instead of taking an argmax
preds = tf.round(probs)
# element-wise cross entropy; this op needs float labels, one 0/1 per class
ce = tf.nn.sigmoid_cross_entropy_with_logits(logits=logits,
                                             labels=tf.cast(labels, tf.float32))
# loss
loss = tf.reduce_mean(ce)
# accuracy (fraction of individual label bits that match)
accuracy = tf.reduce_mean(
    tf.cast(
        tf.equal(tf.cast(preds, tf.int32), labels),
        tf.float32
    )
)
# training
trainop = tf.train.AdamOptimizer().minimize(loss)
Is this the right approach, or do I need corrections somewhere?
If I already apply tf.nn.sigmoid to the logits, do I still need to pass the raw logits to tf.nn.sigmoid_cross_entropy_with_logits, or not?
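To try to answer this myself, I compared the built-in op against a manual sigmoid plus binary cross entropy (a small sketch with made-up values, TF 1.x graph mode); if the two match, the op clearly expects raw logits, not sigmoid outputs:

import numpy as np
import tensorflow as tf

logits_val = np.array([[2.0, -1.0, 0.5]], dtype=np.float32)
labels_val = np.array([[1.0, 0.0, 1.0]], dtype=np.float32)

logits_t = tf.constant(logits_val)
labels_t = tf.constant(labels_val)

# built-in: takes raw logits and applies the sigmoid internally
ce_builtin = tf.nn.sigmoid_cross_entropy_with_logits(logits=logits_t, labels=labels_t)

# manual: apply sigmoid first, then binary cross entropy per class
p = tf.nn.sigmoid(logits_t)
ce_manual = -(labels_t * tf.log(p) + (1.0 - labels_t) * tf.log(1.0 - p))

with tf.Session() as sess:
    a, b = sess.run([ce_builtin, ce_manual])
    print(np.allclose(a, b))  # True: so the loss op wants logits, sigmoid is only for predictions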
Thank you.