Problems with my RBF Network in Tensorflow?

Posted 2020-06-04 15:41

I'm working on an RBF network in TensorFlow, but at line 112 I get this error: ValueError: Cannot feed value of shape (40, 13) for Tensor 'Placeholder:0', which has shape '(?, 12)'

Here's my code below. I created my own activation function for my RBF network by following this tutorial. Also, if there is anything else you notice that needs fixing, please point it out, because I am very new to TensorFlow and any feedback would be helpful.

import tensorflow as tf
import numpy as np
import math
from sklearn import datasets
from sklearn.model_selection import train_test_split
from tensorflow.python.framework import ops
ops.reset_default_graph()

RANDOM_SEED = 42
tf.set_random_seed(RANDOM_SEED)

boston = datasets.load_boston()

data = boston["data"]
target = boston["target"]

N_INSTANCES = data.shape[0]
N_INPUT = data.shape[1] - 1
N_CLASSES = 3
TEST_SIZE = 0.1
TRAIN_SIZE = int(N_INSTANCES * (1 - TEST_SIZE))
batch_size = 40
training_epochs = 400
learning_rate = 0.001
display_step = 20
hidden_size = 200

target_ = np.zeros((N_INSTANCES, N_CLASSES))

data_train, data_test, target_train, target_test = train_test_split(data, target_, test_size=0.1, random_state=100)

x_data = tf.placeholder(shape=[None, N_INPUT], dtype=tf.float32)
y_target = tf.placeholder(shape=[None, N_CLASSES], dtype=tf.float32)

# creates activation function
def gaussian_function(input_layer):
    initial = math.exp(-2*math.pow(input_layer, 2))
    return initial

np_gaussian_function = np.vectorize(gaussian_function)

def d_gaussian_function(input_layer):
    initial = -4 * input_layer * math.exp(-2*math.pow(input_layer, 2))
    return initial

np_d_gaussian_function = np.vectorize(d_gaussian_function)

np_d_gaussian_function_32 = lambda input_layer: np_d_gaussian_function(input_layer).astype(np.float32)

def tf_d_gaussian_function(input_layer, name=None):
    with ops.name_scope(name, "d_gaussian_function", [input_layer]) as name:
        y = tf.py_func(np_d_gaussian_function_32, [input_layer],[tf.float32], name=name, stateful=False)
    return y[0]

def py_func(func, inp, Tout, stateful=True, name=None, grad=None):
    rnd_name = 'PyFunGrad' + str(np.random.randint(0, 1E+8))

    tf.RegisterGradient(rnd_name)(grad)
    g = tf.get_default_graph()
    with g.gradient_override_map({"PyFunc": rnd_name}):
        return tf.py_func(func, inp, Tout, stateful=stateful, name=name)

def gaussian_function_grad(op, grad):
    input_variable = op.inputs[0]
    n_gr = tf_d_gaussian_function(input_variable)
    return grad * n_gr

np_gaussian_function_32 = lambda input_layer: np_gaussian_function(input_layer).astype(np.float32)

def tf_gaussian_function(input_layer, name=None):
    with ops.name_scope(name, "gaussian_function", [input_layer]) as name:
        y = py_func(np_gaussian_function_32, [input_layer], [tf.float32], name=name, grad=gaussian_function_grad)
    return y[0]
# end of defining activation function

def rbf_network(input_layer, weights):
    layer1 = tf.matmul(tf_gaussian_function(input_layer), weights['h1'])
    layer2 = tf.matmul(tf_gaussian_function(layer1), weights['h2'])
    output = tf.matmul(tf_gaussian_function(layer2), weights['output'])
    return output

weights = {
    'h1': tf.Variable(tf.random_normal([N_INPUT, hidden_size], stddev=0.1)),
    'h2': tf.Variable(tf.random_normal([hidden_size, hidden_size], stddev=0.1)),
    'output': tf.Variable(tf.random_normal([hidden_size, N_CLASSES], stddev=0.1))
}

pred = rbf_network(x_data, weights)

cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred, labels=y_target))
my_opt = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y_target, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

init = tf.global_variables_initializer()
sess = tf.InteractiveSession()
sess.run(init)

# Training loop
for epoch in range(training_epochs):
    avg_cost = 0.
    total_batch = int(data_train.shape[0] / batch_size)
    for i in range(total_batch):
        randidx = np.random.randint(int(TRAIN_SIZE), size=batch_size)
        batch_xs = data_train[randidx, :]
        batch_ys = target_train[randidx, :]

        sess.run(my_opt, feed_dict={x_data: batch_xs, y_target: batch_ys})
        avg_cost += sess.run(cost, feed_dict={x_data: batch_xs, y_target: batch_ys})/total_batch

        if epoch % display_step == 0:
            print("Epoch: %03d/%03d cost: %.9f" % (epoch, training_epochs, avg_cost))
            train_accuracy = sess.run(accuracy, feed_dict={x_data: batch_xs, y_target: batch_ys})
            print("Training accuracy: %.3f" % train_accuracy)

test_acc = sess.run(accuracy, feed_dict={x_data: data_test, y_target: target_test})
print("Test accuracy: %.3f" % (test_acc))

sess.close()

3 Answers
一纸荒年 Trace。 · 2020-06-04 16:19

While this implementation will do the job, I don't think it's an optimal RBF implementation. You are using a fixed set of 200 centroids (hidden units) in your RBF, which means the centroids are not optimally placed and the widths of your Gaussian basis functions are not optimally sized. Typically the centroids should be learned in an unsupervised pre-stage, using K-Means or any other clustering algorithm.

So your 1st training stage would involve finding the centroids/centers of the RBFs, and the 2nd stage would be the actual classification/regression using the RBF network; a rough sketch of the first stage is below.
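For example, here is a minimal sketch of that unsupervised first stage, assuming scikit-learn's KMeans and reusing data_train from the question; n_centers and the sigma heuristic are my own choices, not from the original code:

import numpy as np
import tensorflow as tf
from sklearn.cluster import KMeans

n_centers = 20  # assumed number of RBF centers; tune for your data
kmeans = KMeans(n_clusters=n_centers, random_state=42).fit(data_train)
centers = kmeans.cluster_centers_.astype(np.float32)  # shape (n_centers, n_features)

# common heuristic: set the Gaussian width from the average distance between centers
sigma = np.mean(np.linalg.norm(centers[:, None] - centers[None, :], axis=-1))

def rbf_layer(x):
    # x: (batch, n_features) tensor; returns (batch, n_centers) RBF activations
    diff = tf.expand_dims(x, 1) - centers             # (batch, n_centers, n_features)
    dist_sq = tf.reduce_sum(tf.square(diff), axis=2)  # squared Euclidean distances
    return tf.exp(-dist_sq / (2.0 * sigma ** 2))

The second stage would then feed rbf_layer(x_data) into a trainable output layer and train only those weights.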

够拽才男人 · 2020-06-04 16:24

As has been said, you should have N_INPUT = data.shape[1].

Actually, data.shape[0] is the number of samples in your data set, and data.shape[1] tells you how many features the network should consider.

The number of features is, by definition, the size of the input layer, regardless of how many samples you feed (via feed_dict) to your network.
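With the Boston data that means keeping all 13 columns. A minimal sketch of the fix, using the constants from the question:

N_INPUT = data.shape[1]  # 13 features, so the placeholder becomes (?, 13)
x_data = tf.placeholder(shape=[None, N_INPUT], dtype=tf.float32)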

Also, the Boston dataset is a regression problem, while softmax_cross_entropy is a cost function for classification problems. You can try tf.square to evaluate the squared error between what you are predicting and what you want:

cost = tf.reduce_mean(tf.square(pred - y_target))

You will see that your network is learning, even though the accuracy is not very high.

Edit:

Your code is actually learning well, but you used the wrong tool to measure it.

Mainly, your error stems from the fact that you are dealing with a regression problem, not a classification problem.

In a classification problem you can evaluate the accuracy of your ongoing learning process using:

correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y_target, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

This consists in checking whether the predicted class is the same as the expected class, for each input in x_test.

In a regression problem, doing so is meaningless, since you are looking for a real number, i.e. an infinity of possibilities from the classification point of view.

In a regression problem you can instead estimate the error (mean squared error or similar) between predicted values and expected values. We can use what I suggested below:

cost = tf.reduce_mean(tf.square(pred - y_target))
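If you still want a single progress number to print in place of accuracy, a couple of regression-style metrics could be tracked as well (this is my own sketch, not part of the original code; the names rmse and mae are mine):

rmse = tf.sqrt(tf.reduce_mean(tf.square(pred - y_target)))  # root mean squared error
mae = tf.reduce_mean(tf.abs(pred - y_target))               # mean absolute error
# evaluated the same way as the cost, e.g.
# sess.run(rmse, feed_dict={x_data: batch_xs, y_target: batch_ys})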

I modified your code accordingly; here it is:

import matplotlib.pyplot as plt  # needed for the error plot below

pred = rbf_network(x_data, weights)

cost = tf.reduce_mean(tf.square(pred - y_target))
my_opt = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

#correct_prediction = tf.equal(tf.argmax(pred, 1), tf.argmax(y_target, 1))
#accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

init = tf.global_variables_initializer()
sess = tf.InteractiveSession()
sess.run(init)

plt.figure("Error evolution")
plt.xlabel("N_epoch")
plt.ylabel("Error evolution")
tol = 5e-4
epoch, err=0, 1
# Training loop
while epoch <= training_epochs and err >= tol:
    avg_cost = 0.
    total_batch = int(data_train.shape[0] / batch_size)
    for i in range(total_batch):
        randidx = np.random.randint(int(TRAIN_SIZE), size=batch_size)
        batch_xs = data_train[randidx, :]
        batch_ys = target_train[randidx, :]

        sess.run(my_opt, feed_dict={x_data: batch_xs, y_target: batch_ys})
        avg_cost += sess.run(cost, feed_dict={x_data: batch_xs, y_target: batch_ys})/total_batch
    plt.plot(epoch, avg_cost, marker='o', linestyle="none", c='k')
    plt.pause(0.05)
    err = avg_cost
    if epoch % 10 == 0:
        print("Epoch: {}/{} err = {}".format(epoch, training_epochs, avg_cost))

    epoch +=1

print ("End of learning process")
print ("Final epoch = {}/{} ".format(epoch, training_epochs))
print ("Final error = {}".format(err) )
sess.close()

The output is

Epoch: 0/400 err = 0.107879924503
Epoch: 10/400 err = 0.00520248359747
Epoch: 20/400 err = 0.000651647908274
End of learning process

Final epoch = 26/400 
Final error = 0.000474644409471

We plot the evolution of the training error across the epochs (see the "Error evolution" figure produced by the code above).

对你真心纯属浪费 · 2020-06-04 16:35

I'm also new to TensorFlow and this is my first answer on Stack Overflow. I tried your code and got the same error.

You can see in the error message ValueError: Cannot feed value of shape (40, 13) for Tensor 'Placeholder:0', which has shape '(?, 12)' that there is a mismatch in the shape of the first placeholder:

x_data = tf.placeholder(shape=[None, N_INPUT], dtype=tf.float32)

so I'm not sure why N_INPUT has a -1 in this line:

N_INPUT = data.shape[1] - 1

I tried removing it and the code runs, though it looks like the network isn't learning.
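For reference, a quick sanity check (my own addition, not in the original answer) shows where the 13 vs. 12 mismatch comes from:

print(data.shape)          # (506, 13): 506 samples, 13 features
print(x_data.get_shape())  # (?, 12), because N_INPUT = data.shape[1] - 1
# with N_INPUT = data.shape[1] the placeholder becomes (?, 13) and the feed matches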

查看更多