python - TensorFlow issue with feed_dict not being noticed when using batching

Question

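I've been trying to do the mnist tutorial with png files, and have gotten most things to the point where they make sense.

The gist of the code is here, but I'll walk through what it does and where the problem occurs.

I have a function that generates file names that I can give to slice_input_producer.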

import glob

def gen_file_names_and_labels(rootDir):
    """Goes through the directory structure and extracts the file name and label of each image."""
    file_names = []
    labels = []
    for file_name in glob.glob(rootDir+'/*/*'):
        file_type_removed = file_name.split('.')[0]
        split_by_dir = file_type_removed.split('/')
        file_names.append(file_name)
        labels.append(int(split_by_dir[2])) #getting the folder it's in, turning into an int, and using as label
    return file_names, labels

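This behaves as expected.

In the main body I run this function for training and testing, turn the results into tensors, and pass those tensors to slice_input_producer.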
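For reference, the same label extraction can be sketched with os.path instead of splitting on '.' and '/' with a fixed index, so it no longer depends on how deep rootDir is or on dots elsewhere in the path. This is only a sketch: the helper name label_from_path and the throwaway demo directory are hypothetical and not part of the original post.

```python
import glob
import os
import tempfile


def label_from_path(file_name):
    """Return the name of the directory that contains file_name, as an int."""
    return int(os.path.basename(os.path.dirname(file_name)))


def gen_file_names_and_labels(root_dir):
    """Collect (file_names, labels) from a root_dir/<label>/<file> layout."""
    file_names = sorted(glob.glob(os.path.join(root_dir, '*', '*')))
    labels = [label_from_path(name) for name in file_names]
    return file_names, labels


# Build a tiny throwaway tree (two label folders, one empty file each)
# purely to exercise the functions; real data would be the mnist_png folders.
demo_root = tempfile.mkdtemp()
for digit in (3, 7):
    os.makedirs(os.path.join(demo_root, str(digit)))
    open(os.path.join(demo_root, str(digit), 'img.png'), 'w').close()

demo_files, demo_labels = gen_file_names_and_labels(demo_root)
print(demo_labels)  # [3, 7]
```

The returned lists have the same shape as in the original function, so they could be passed to tf.convert_to_tensor in the same way.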

sess = tf.InteractiveSession()

#THERE IS A PIPELINE FOR BOTH TESTING AND TRAINING. THEY COME IN PAIRS
image_list_train,   label_list_train    = gen_file_names_and_labels('mnist_png/training')
image_list_test,    label_list_test     = gen_file_names_and_labels('mnist_png/testing')

images_train    = tf.convert_to_tensor(image_list_train,dtype=tf.string)    
images_test     = tf.convert_to_tensor(image_list_test,dtype=tf.string)    

#remember that these aren't the actual images, just file_names
labels_train    = tf.convert_to_tensor(label_list_train,dtype=tf.int32)
labels_test     = tf.convert_to_tensor(label_list_test,dtype=tf.int32)

input_queue_train   = tf.train.slice_input_producer([images_train   ,labels_train]  , shuffle=True)
input_queue_test    = tf.train.slice_input_producer([images_test    ,labels_test]   , shuffle=True)

This part also works fine.

This is where things get strange.

asdf = tf.placeholder(tf.int32)
input_queue = tf.cond( asdf>0, lambda: input_queue_train, lambda: input_queue_test)
# input_queue = input_queue_test
image, label = read_images_from_disk(input_queue)
image_reshaped = tf.reshape( image, [28,28,1])
image_batch, label_batch = tf.train.batch([image_reshaped,label],batch_size=50)

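The variable asdf was renamed in anger, since it is the bearer of bad news. The plan here is to use different queues for training and testing. I intended to feed a single int through feed_dict, which would act as an ad hoc boolean for switching between the two.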

coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(coord=coord)
sess.run(tf.initialize_all_variables())
print(label_batch.eval(feed_dict={asdf:0,keep_prob:1.0}))
for i in range(500):
    # batch = mnist.train.next_batch(50)

    if i%20 ==0:
        train_accuracy = accuracy.eval(feed_dict={keep_prob:1.0,asdf:0})
        print("step %d, training accuracy %g"%(i, train_accuracy))

    train_step.run(feed_dict={keep_prob:0.9,asdf:0})

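However, when I run it I get the error "You must feed a value for placeholder tensor 'Placeholder' with dtype int32", which is strange because I am feeding it.

Using print(foo.eval(feed_dict={asdf:0,keep_prob:1.0})) I was able to notice something interesting. The switch seems to work fine when I evaluate the individual variables declared as "image, label" that come out of "read_images_from_disk(input_queue)".

However, if I try to evaluate the batching that comes right after, I get the error above.

What am I doing wrong with batching to make this happen? Is there a better way to switch between the testing and training sets? What is the meaning of life, the universe and everything? I'm counting on you, StackOverflow. You're my only hope.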

score 1 · Accepted Answer

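In answer to your question, "Is there a better way to switch between the testing and training sets?", yes there is. tf.cond() evaluates both functions at every step (see here), and therefore accesses both queues unnecessarily. This SO discussion and the associated links provide a couple of better alternatives:

Use tf.placeholder_with_default() for your test data
Use make_template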
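
As a rough illustration of the first option, here is a minimal sketch of tf.placeholder_with_default(): the tensor falls back to a default value (standing in for the training batch) unless something is fed explicitly (standing in for test data). It assumes a TensorFlow install that exposes tf.compat.v1; the toy constants and the names train_values, data and doubled are made up for illustration and are not taken from the question's pipeline.

```python
import tensorflow as tf

# Assumption: graph-mode API via tf.compat.v1 (TensorFlow 1.13+ or 2.x).
tf1 = tf.compat.v1
tf1.disable_eager_execution()

# Stand-in for the training batch that would normally come from a queue.
train_values = tf.constant([1, 2, 3], dtype=tf.int32)

# Uses train_values unless a value is supplied through feed_dict.
data = tf1.placeholder_with_default(train_values, shape=[None])

# Stand-in for the model: any op built on top of `data`.
doubled = data * 2

with tf1.Session() as sess:
    # Nothing fed: the default (training) values flow through.
    train_out = sess.run(doubled)
    # Value fed: it overrides the default, e.g. with test data.
    test_out = sess.run(doubled, feed_dict={data: [10, 20]})

print(train_out.tolist())  # [2, 4, 6]
print(test_out.tolist())   # [20, 40]
```

With this pattern no switching placeholder has to reach the batching queue at all: training runs need no feed, and test data is fed directly in place of the batch.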
