Has anyone tried multitask deep learning with TensorFlow? That is, sharing the bottom layers while keeping the top layers separate. An example with a simple illustration would help a lot.
Question:
Answer 1:
There is a similar question here; the answer there uses Keras.
It's similar when using plain TensorFlow. The idea is this: we can define multiple outputs of a network, and thus multiple loss functions (objectives). We then tell the optimizer to minimize a combined loss function, usually a linear combination of the individual losses.
A concept diagram
This diagram is drawn according to this paper.
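To make the idea concrete without any framework, here is a minimal pure-Python sketch of a shared bottom layer feeding two separate heads, with the two cross-entropy losses merged into one weighted sum. The toy weights, the two tasks (A and B), and the names `combined_loss`, `weight_a` and `weight_b` are all made up for illustration; they are not from the question or the paper.

```python
import math

def softmax_cross_entropy(logits, true_class):
    """Cross-entropy of a softmax over `logits` against an integer class label."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[true_class]

def dense(x, weights, bias):
    """Fully connected layer: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def relu(v):
    return [max(0.0, z) for z in v]

# Shared bottom layer (hypothetical toy weights): 2 inputs -> 2 hidden units.
shared_w, shared_b = [[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]
# Two task-specific heads that both read the same hidden features.
head_a_w, head_a_b = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]               # task A: 2 classes
head_b_w, head_b_b = [[1.0, 1.0], [-1.0, 1.0], [0.0, 0.0]], [0.0] * 3   # task B: 3 classes

def combined_loss(x, label_a, label_b, weight_a=1.0, weight_b=1.0):
    """Return (weighted total, loss of head A, loss of head B) for one example."""
    hidden = relu(dense(x, shared_w, shared_b))  # computed once, used by both heads
    loss_a = softmax_cross_entropy(dense(hidden, head_a_w, head_a_b), label_a)
    loss_b = softmax_cross_entropy(dense(hidden, head_b_w, head_b_b), label_b)
    return weight_a * loss_a + weight_b * loss_b, loss_a, loss_b

total, loss_a, loss_b = combined_loss([2.0, 1.0], label_a=0, label_b=1)
```

Because both heads read the same `hidden` vector, minimizing `total` would send gradients from both tasks into the shared weights, while each head's own weights are only affected by its own loss. The weights in the sum are a tuning knob; equal weights are just the simplest choice.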
Let's say we are training a classifier that predicts the digits in an image, with a maximum of 5 digits per image. Here we define 6 output layers: digit1, digit2, digit3, digit4, digit5, and length. Each digit layer should output 0~9 if there is such a digit, or X (substitute it with a real number in practice) if there isn't any digit in its position. The same goes for length: it should output 0~5 if the image contains 0~5 digits, or X if it contains more than 5 digits.
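As a rough sketch of how the targets for these six heads might be built, the helper below turns a digit string into one length label and five digit labels. The concrete values chosen for X (class 10 for a digit head, class 6 for the length head), the name `encode_labels`, and the choice to keep the first five digits of an over-long number are my assumptions, not something the paper or the question prescribes.

```python
MAX_DIGITS = 5
BLANK = 10     # assumed class index standing in for "X" in a digit head
TOO_LONG = 6   # assumed class index standing in for "X" in the length head

def encode_labels(number_text):
    """Return (length_label, [digit1..digit5 labels]) for a string of digits."""
    if any(c not in "0123456789" for c in number_text):
        raise ValueError("expected only the characters 0-9")
    n = len(number_text)
    length_label = n if n <= MAX_DIGITS else TOO_LONG
    # Assumption: for over-long numbers, keep the first five digits as targets.
    digits = [int(c) for c in number_text[:MAX_DIGITS]]
    # Pad the unused positions with the blank class.
    digits += [BLANK] * (MAX_DIGITS - len(digits))
    return length_label, digits

length_label, digit_labels = encode_labels("42")
```

With this encoding each digit head is an 11-way softmax (0 to 9 plus the blank class) and the length head is a 7-way softmax (0 to 5 plus the "more than 5" class). The integer labels would still need to be one-hot encoded before being passed to a loss that expects one-hot targets.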
Now to train it, we just add up the cross-entropy losses of all the softmax outputs:
import tensorflow as tf  # written against the TensorFlow 1.x graph API

# Define loss and optimizer.
# length_logits / digitN_logits are the outputs of the six heads, true_* are the
# matching one-hot labels, and learning_rate is a float; all are defined elsewhere.
def head_loss(logits, labels):
    # Batch-mean softmax cross-entropy for one head, clipped so the log stays finite
    xent = tf.nn.softmax_cross_entropy_with_logits(labels=labels, logits=logits)
    return tf.log(tf.clip_by_value(tf.reduce_mean(xent), 1e-37, 1e+37))

lossLength = head_loss(length_logits, true_length)
lossDigit1 = head_loss(digit1_logits, true_digit1)
lossDigit2 = head_loss(digit2_logits, true_digit2)
lossDigit3 = head_loss(digit3_logits, true_digit3)
lossDigit4 = head_loss(digit4_logits, true_digit4)
lossDigit5 = head_loss(digit5_logits, true_digit5)

# Combined objective: the sum of the six per-head losses
cost = tf.add_n([lossLength, lossDigit1, lossDigit2, lossDigit3, lossDigit4, lossDigit5])
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)