Multitask deep learning with TensorFlow

Posted 2019-02-07 04:20

Question:

Has anyone tried doing multitask deep learning with TensorFlow? That is, sharing the bottom layers while not sharing the top layers. An example with a simple illustration would help a lot.

Answer 1:

There is a similar question here; the answer there used Keras.

The approach is much the same in plain TensorFlow. The idea is this: a network can define multiple outputs, and therefore multiple loss functions (objectives). We then tell the optimizer to minimize a combined loss, usually a linear combination of the individual losses.
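To make the idea concrete without depending on any framework, here is a minimal pure-Python sketch of one shared bottom layer feeding two task-specific heads, with the two cross-entropy losses combined linearly. The toy weights, the head names and the `weight_a`/`weight_b` parameters are all hypothetical and exist only for illustration; in TensorFlow the same structure would be built from tensors and trained by an optimizer.

```python
import math

def dense(x, weights, biases):
    """Fully connected layer: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]

def relu(v):
    return [max(0.0, a) for a in v]

def softmax_cross_entropy(logits, label):
    """Cross entropy between softmax(logits) and the integer class `label`."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]

# Shared bottom layer (hypothetical toy weights): 2 inputs -> 2 hidden units.
W_shared = [[1.0, 0.0], [0.0, 1.0]]
b_shared = [0.0, 0.0]

# Task-specific top layers: each head owns its weights.
W_head_a = [[1.0, -1.0], [-1.0, 1.0]]             # head A: 2 classes
b_head_a = [0.0, 0.0]
W_head_b = [[0.5, 0.5], [0.0, 1.0], [1.0, 0.0]]   # head B: 3 classes
b_head_b = [0.0, 0.0, 0.0]

def multitask_loss(x, label_a, label_b, weight_a=1.0, weight_b=1.0):
    # The shared features are computed once and consumed by both heads.
    hidden = relu(dense(x, W_shared, b_shared))
    loss_a = softmax_cross_entropy(dense(hidden, W_head_a, b_head_a), label_a)
    loss_b = softmax_cross_entropy(dense(hidden, W_head_b, b_head_b), label_b)
    # Linear combination of the per-task objectives.
    return weight_a * loss_a + weight_b * loss_b, loss_a, loss_b

total, loss_a, loss_b = multitask_loss([2.0, 1.0], label_a=0, label_b=2)
```

Because both losses depend on `W_shared`, minimizing the combined value updates the shared layer using signals from both tasks, while each head is only updated by its own loss.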

A concept diagram

This diagram is drawn according to this paper.

Let's say we are training a classifier that predicts the digits in an image, with at most 5 digits per image. Here we define 6 output layers: digit1, digit2, digit3, digit4, digit5, and length. Each digit layer should output 0~9 if there is a digit in its position, or X (substitute a real number for it in practice) if there is none. The same goes for length: it should output 0~5 if the image contains 0~5 digits, or X if it contains more than 5 digits.
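As a rough illustration of that labelling scheme, the sketch below turns a digit string into integer targets for the six output layers. The concrete values chosen for X (`BLANK = 10` for the digit layers, `TOO_LONG = 6` for the length layer) are assumptions, as is the decision to keep only the first five digits when the number is longer; the answer does not specify either.

```python
MAX_DIGITS = 5
BLANK = 10     # assumed class index standing in for "X" on the digit layers
TOO_LONG = 6   # assumed class index standing in for "X" on the length layer

def encode_labels(number):
    """Map a digit string to integer targets for the six output layers."""
    # Assumption: for numbers longer than MAX_DIGITS, keep the first five digits.
    digits = [int(ch) for ch in number[:MAX_DIGITS]]
    # Pad the unused positions with the "no digit here" class.
    digits += [BLANK] * (MAX_DIGITS - len(digits))
    length = len(number) if len(number) <= MAX_DIGITS else TOO_LONG

    targets = {"length": length}
    for position, digit in enumerate(digits, start=1):
        targets["digit%d" % position] = digit
    return targets

labels = encode_labels("123")
```

Under these assumptions each digit layer is an 11-way softmax (0~9 plus the blank class) and the length layer is a 7-way softmax (0~5 plus "more than 5").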

Now, to train it, we just add up the cross-entropy losses of all the softmax outputs:

# Define loss and optimizer (TensorFlow 1.x API)
import tensorflow as tf

# softmax_cross_entropy_with_logits must be called with keyword arguments since TensorFlow 1.0.
# Each mean loss is clipped before the log so that tf.log never receives zero.
lossLength = tf.log(tf.clip_by_value(tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=true_length, logits=length_logits)), 1e-37, 1e+37))
lossDigit1 = tf.log(tf.clip_by_value(tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=true_digit1, logits=digit1_logits)), 1e-37, 1e+37))
lossDigit2 = tf.log(tf.clip_by_value(tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=true_digit2, logits=digit2_logits)), 1e-37, 1e+37))
lossDigit3 = tf.log(tf.clip_by_value(tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=true_digit3, logits=digit3_logits)), 1e-37, 1e+37))
lossDigit4 = tf.log(tf.clip_by_value(tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=true_digit4, logits=digit4_logits)), 1e-37, 1e+37))
lossDigit5 = tf.log(tf.clip_by_value(tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=true_digit5, logits=digit5_logits)), 1e-37, 1e+37))

# Combine the six per-task losses into a single objective
cost = tf.add_n([lossLength, lossDigit1, lossDigit2, lossDigit3, lossDigit4, lossDigit5])

optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
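One detail of the listing is worth spelling out: each per-task term is the logarithm of a clipped mean cross entropy, so the summed cost is a sum of logs, not a plain sum of losses. Writing L_k for the clipped mean cross entropy of output layer k (length, digit1, ..., digit5) and θ for the network parameters, and assuming every L_k lies strictly inside the clipping range, this works out as:

```latex
\text{cost} \;=\; \sum_{k} \log L_k \;=\; \log \prod_{k} L_k,
\qquad
\frac{\partial\,\text{cost}}{\partial \theta}
  \;=\; \sum_{k} \frac{1}{L_k}\,\frac{\partial L_k}{\partial \theta}
```

In other words, each task's gradient is scaled by the inverse of its current loss. A plain linear combination would instead use \sum_k w_k L_k with fixed weights w_k; which of the two works better is an empirical question this answer does not settle.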