I'm currently working on reworking Professor Andrew Ng's "Machine Learning" course assignments from Coursera, and I got stuck in the Logistic Regression portion.
import numpy as np
import tensorflow as tf

filename = 'data/ex2data1.txt'
data = np.loadtxt(filename, delimiter=",", unpack=True)
# Data matrices
xtr = np.transpose(np.array(data[:-1]))
ytr = np.transpose(np.array(data[-1:]))
# Initial weights
W = tf.Variable(tf.zeros([2,1], dtype = tf.float64))
# Bias
b = tf.Variable(tf.zeros([1], dtype = tf.float64))
# Cost function
y_ = tf.nn.sigmoid(tf.matmul(xtr,W) + b)
cost = -tf.reduce_mean(ytr*tf.log(y_) + (1-ytr)*tf.log(1-y_))
optimize = tf.train.GradientDescentOptimizer(0.01).minimize(cost)
corr = tf.equal(tf.argmax(ytr,1), tf.argmax(y_,1))
acc = tf.reduce_mean(tf.cast(corr, tf.float64))
init = tf.initialize_all_variables()
with tf.Session() as sess:
    sess.run(init)
    print(sess.run(cost))
    for _ in range(3):
        sess.run(optimize)
        print(sess.run(cost))
This produces the output:
0.69314718056
nan
nan
nan
The first value of the cost function is correct, but the subsequent ones are supposed to be:
3.0133
1.5207
0.7336
and instead I get a bunch of NaNs. I've tried lower learning rates, all to no avail. What am I doing wrong? And is it possible to reproduce this assignment in TensorFlow?
PS: Other Python solutions seem to use scipy.optimize, but I have no idea how I would use that with TensorFlow values, and I would like to use only TensorFlow if at all possible.
EDIT: I've also tried initializing the bias with tf.ones instead of tf.zeros, but that didn't work either.
Your logarithm isn't sanitizing its input. The sigmoid can saturate to exactly 0 or 1, in which case tf.log returns -inf, and the NaN quickly propagates through the rest of the floating-point arithmetic.
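You can see the failure mode directly in plain NumPy (illustrative values, not from your data):

```python
import numpy as np

# Labels, and predictions where the sigmoid has saturated to exactly 0 and 1
y = np.array([1.0, 0.0])
p = np.array([1.0, 0.0])

with np.errstate(divide="ignore", invalid="ignore"):
    # np.log(0) is -inf, and 0 * -inf is nan, so the mean becomes nan
    loss = -np.mean(y * np.log(p) + (1 - y) * np.log(1 - p))

print(loss)  # nan
```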
What I did in Java code that makes heavy use of logs in a similar domain was to clamp the log's argument to a small positive minimum before taking the log. That shouldn't be difficult to translate to tf:
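Here is a sketch of that guard in Python/NumPy so it is easy to verify; the 1e-10 epsilon is an illustrative choice, not a value from the original code:

```python
import numpy as np

def safe_cross_entropy(y_true, y_pred, eps=1e-10):
    """Binary cross-entropy with predictions clamped into [eps, 1 - eps],
    so neither log argument can ever be exactly 0."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    return -np.mean(y_true * np.log(y_pred)
                    + (1 - y_true) * np.log(1 - y_pred))

# Finite even when the raw predictions have saturated to exactly 0 and 1
print(safe_cross_entropy(np.array([1.0, 0.0]), np.array([1.0, 0.0])))
```

In the TensorFlow graph above, the same clamp would be `tf.clip_by_value(y_, eps, 1 - eps)` applied before the two tf.log calls in the cost expression.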