
I am trying to create a CNN model in TensorFlow, and there is one problem I cannot understand:

My model is a standard 3D model, built in a function named convolutional_neural_network.

The configuration is:

    self.prediction = self.convolutional_neural_network(self.x, img_sz, n_slices)
    self.cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
        labels=self.y_, logits=self.prediction))
    optimizer = tf.train.AdamOptimizer(self.learning_rate).minimize(self.cost)
    correct_prediction = tf.equal(tf.argmax(self.prediction, 1), tf.argmax(self.y_, 1))
    self.accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

where self.y_ is the placeholder the labels are fed into.

At some point I try to execute this code:

            feed_dict = {self.x: batch_img, self.y_: batch_label,
                         self.keep_prob: keep_rate, self.learning_rate: learning_rate}

            validation_cost = self.cost.eval(feed_dict=feed_dict)
            print("Validation cost: " + str(validation_cost))

            validation_cost = self.cost.eval(feed_dict=feed_dict)
            print("Validation cost: " + str(validation_cost))

where self.keep_prob is the dropout keep probability of the model.

It should give the same result twice, but it does not; it outputs:

Validation cost: 1.05199,

Validation cost: 0.115607

Can anyone give me some idea of what is going on?

Thanks

PS: I have already removed the "optimizer" line, but it still outputs different results.

Many thanks

ADDED:

My model is as follows:

def convolutional_neural_network(self, x, img_sz, n_slices):
    # Flattened size after the two conv/pool layers below. Note the division
    # by 4.0: with Python 2 integer division, math.ceil(n_slices / 4) is a
    # no-op floor, which can make this size wrong.
    flat_size = int(math.ceil(n_slices / 4.0) * math.ceil(img_sz / 4.0) *
                    math.ceil(img_sz / 4.0) * 48)

    weights = {
        'W_conv1': self.weight_variable([6, 8, 8, 1, 32]),
        'W_conv2': self.weight_variable([2, 5, 5, 32, 48]),
        'W_fc': self.weight_variable([flat_size, 512]),
        'W_fc2': self.weight_variable([512, 256]),
        'out': self.weight_variable([256, 2])
    }

    biases = {
        'b_conv1': self.bias_variable([32]),
        'b_conv2': self.bias_variable([48]),
        'b_fc': self.bias_variable([512]),
        'b_fc2': self.bias_variable([256]),
        'out': self.bias_variable([2])
    }

    self.x_im = tf.reshape(x, shape=[-1, n_slices, img_sz, img_sz, 1])

    conv1 = tf.tanh(self.conv3d(self.x_im, weights['W_conv1']) + biases['b_conv1'])
    conv1 = self.maxpool3d(conv1)

    conv2 = tf.tanh(self.conv3d(conv1, weights['W_conv2']) + biases['b_conv2'])
    conv2 = self.maxpool3d(conv2)

    # The original post flattened an undefined `conv4` with /8 and 80 channels,
    # presumably left over from pruned conv3/conv4 layers; with the two
    # conv/pool layers shown, the flatten has to match conv2 (/4, 48 channels).
    fc = tf.reshape(conv2, [-1, flat_size])
    fc = tf.tanh(tf.matmul(fc, weights['W_fc']) + biases['b_fc'])
    fc = tf.tanh(tf.matmul(fc, weights['W_fc2']) + biases['b_fc2'])
    fc = tf.nn.dropout(fc, self.keep_prob)

    output = tf.matmul(fc, weights['out']) + biases['out']
    return output

where x is a tf.placeholder.
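The placeholder definitions themselves are not shown in the post; a minimal sketch of what they presumably look like (the class name, shapes, and dtypes here are assumptions, not the original code):

import tensorflow as tf  # TF 1.x API, as used throughout the post

class Model(object):  # hypothetical wrapper class
    def __init__(self, n_slices, img_sz):
        # Assumed placeholder definitions; the model later reshapes x to
        # [-1, n_slices, img_sz, img_sz, 1], so any compatible shape works.
        self.x = tf.placeholder(tf.float32,
                                [None, n_slices, img_sz, img_sz])  # input volumes
        self.y_ = tf.placeholder(tf.float32, [None, 2])   # one-hot labels, 2 classes
        self.keep_prob = tf.placeholder(tf.float32)       # dropout keep probability
        self.learning_rate = tf.placeholder(tf.float32)   # fed at every training step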

Also note that batch_img and batch_label are numpy arrays.

and:

def weight_variable(self, shape):
    initial = tf.truncated_normal(shape, stddev=0.1)
    return tf.Variable(initial)

def bias_variable(self, shape):
    initial = tf.constant(0.1, shape=shape)
    return tf.Variable(initial)

and the training procedure:

def train_neural_network(self, data_img, labels,
                         batch_size, img_sz, n_slices, last_batch,
                         keep_rate, model_path):

    self.prediction = self.convolutional_neural_network(self.x, img_sz, n_slices)
    self.cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
        labels=self.y_, logits=self.prediction))
    optimizer = tf.train.AdamOptimizer(self.learning_rate).minimize(self.cost)
    correct_prediction = tf.equal(tf.argmax(self.prediction, 1), tf.argmax(self.y_, 1))
    self.accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

    hm_epochs = 1000
    saver = tf.train.Saver(tf.trainable_variables())
    n_epoch = 0
    learning_rate = 1e-4
    model_path_train = 'model_train_3/my_model.ckpt'

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        if model_path:
            saver.restore(sess, model_path_train)
        while n_epoch < hm_epochs:
            if len(data_img) > last_batch + batch_size:
                with tf.device('/cpu:0'):
                    #batch_img, batch_label, last_batch = self.get_image(
                    #    data_img, labels, last_batch, batch_size, img_sz, n_slices
                    #)
                    batch_img, batch_label, last_batch = self.wrapper_image(
                        data_img, labels, last_batch, batch_size)

                print("Batch label images: " + str(batch_label))
                batch_label = self.dense_to_one_hot(
                    np.array(batch_label, dtype=np.int), 2).astype(np.float32)

                ####### at the end of EACH EPOCH ###
                n_epoch += 1
                print("n_epoch: " + str(n_epoch))

                _, c = sess.run(
                    [optimizer, self.cost], feed_dict={
                        self.x: batch_img, self.y_: batch_label,
                        self.keep_prob: keep_rate,
                        self.learning_rate: learning_rate
                    }
                )

                c = self.cost.eval(feed_dict={
                    self.x: batch_img, self.y_: batch_label,
                    self.keep_prob: keep_rate, self.learning_rate: learning_rate})

                print("train cost: " + str(c))

                c = self.cost.eval(feed_dict={
                    self.x: batch_img, self.y_: batch_label,
                    self.keep_prob: keep_rate, self.learning_rate: learning_rate})

                print("train cost: " + str(c))
                if model_path:
                    saver.save(sess, model_path_train)

2 Answers


Most of the time this is caused by variables not being saved and restored correctly, or, if you use batch normalization, by the data. The consequence is that the unrestored variables are initialized with random weights on every run, giving you different results each time.

As a quick test, insert a tf.set_random_seed(1) right after all the imports and see whether that makes the values match across runs. If it does, my theory is more likely to be valid. Then try printing the mean and standard deviation of each layer's activations, and you will see where the two runs start to diverge.
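A minimal sketch of both checks; it assumes the layer tensors (conv1, conv2, fc) are reachable from the session code (the original class would need a small change to expose them) and that feed_dict is the same one used above:

import tensorflow as tf

tf.set_random_seed(1)  # right after the imports, before any variables are created

# ... build the graph exactly as before ...

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Print mean and std of each layer's activations for one batch;
    # run the script twice and compare to see where the runs diverge.
    for name, t in [('conv1', conv1), ('conv2', conv2), ('fc', fc)]:
        m = tf.reduce_mean(t)
        s = tf.sqrt(tf.reduce_mean(tf.square(t - m)))
        m_v, s_v = sess.run([m, s], feed_dict=feed_dict)
        print("%s: mean=%.6f std=%.6f" % (name, m_v, s_v))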

Answered 2018-04-14T00:48:27.057

Fixed. I just realized it was the dropout keep probability: it was set to 0.5, so the model activates a different subnetwork of the model on every evaluation of the cost. If I set keep_prob = 1.0, the output is the same across two consecutive runs. Thanks to Peter for the help.
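In other words, keep dropout active for the training step but feed keep_prob = 1.0 whenever the cost is only being evaluated; a sketch of that pattern against the training loop above:

# Training step: dropout active (keep_rate = 0.5 in the question).
_, c = sess.run([optimizer, self.cost],
                feed_dict={self.x: batch_img, self.y_: batch_label,
                           self.keep_prob: keep_rate,
                           self.learning_rate: learning_rate})

# Pure evaluation: disable dropout so consecutive evals are deterministic.
# (self.cost does not depend on the learning_rate placeholder, only the
# optimizer does, so it can be left out of the eval feed.)
eval_feed = {self.x: batch_img, self.y_: batch_label, self.keep_prob: 1.0}
print("train cost: " + str(self.cost.eval(feed_dict=eval_feed)))
print("train cost: " + str(self.cost.eval(feed_dict=eval_feed)))  # same value now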

Answered 2018-04-18T12:19:25.357