I'm trying to replicate the structure used in the TensorBoard MNIST example from the recent 2017 Dev Summit (code found here). In it, feed_dict is used to alternate between the training and validation sets; however, they use the very opaque mnist.train.next_batch, which makes it difficult to do your own iteration.
Admittedly, this may also be because I'm struggling to understand the queueing implementation in TensorFlow, and explicit examples seem to be in short supply, especially for TF > v1.0.
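For reference, the pattern I'm trying to mimic looks roughly like this (a sketch from memory of the mnist_with_summaries example; mnist, x, y_, and keep_prob are names from that example, not from my code below):

def feed_dict(train):
    if train:
        # The opaque helper: hands back the next 100 images/labels.
        xs, ys = mnist.train.next_batch(100)
        k = 0.9  # dropout keep probability during training
    else:
        xs, ys = mnist.test.images, mnist.test.labels
        k = 1.0  # no dropout during evaluation
    return {x: xs, y_: ys, keep_prob: k}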
I've made my own attempt at an image-classification CNN based on various examples I've stumbled across. Initially I had it working with just the training data by storing the data in a pre-loaded variable (it's a small dataset). I figured it would be easier to get the train/valid swapping working by feeding the data from filenames, so I tried to change it over to that.
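The pre-loaded-variable version looked roughly like this (a rough sketch, not my exact code; the shapes and the all_images array are stand-ins for my in-memory data):

import numpy as np
import tensorflow as tf

all_images = np.zeros((885, 216, 216, 1), dtype=np.float32)  # stand-in for my real data

# Non-trainable Variable holding the whole (small) dataset, initialized
# from a placeholder so the array isn't baked into the graph definition.
images_init = tf.placeholder(tf.float32, shape=[885, 216, 216, 1])
images_var = tf.Variable(images_init, trainable=False, collections=[])

with tf.Session() as sess:
    # Initialize once, feeding the actual array.
    sess.run(images_var.initializer, feed_dict={images_init: all_images})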
Between changing the format and trying to implement the feed_dict train/valid structure, I'm getting the following -
Error: "You must feed a value for placeholder tensor 'input/Placeholder_2' with dtype string".
Any tips on how to get this working, or further explanation of how slice_input_producer/train.batch/QueueRunner actually fit together, would be a great help, as I've found the TensorFlow tutorials lacking when it comes to explaining the basic workflow between them.
I have a feeling I've put train.batch in entirely the wrong place and that it probably should be inside the feed_dict def, but I have no idea otherwise. Thanks!
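For comparison, here is a minimal, self-contained sketch of the pipeline as I currently understand it, using constant lists instead of placeholders and a plain Python int for the batch size (the filenames and labels are made up, just so the sketch stands on its own):

import tensorflow as tf

# Made-up inputs for illustration only.
filenames = ['/file0.jpg', '/file1.jpg']
labels = [[0, 1, 0], [1, 0, 0]]

# slice_input_producer adds a queue (and a QueueRunner) that emits one
# (filename, label) pair per dequeue.
image_path, label = tf.train.slice_input_producer(
    [tf.constant(filenames), tf.constant(labels)])

image = tf.image.decode_jpeg(tf.read_file(image_path), channels=1)
image = tf.cast(image, tf.float32)
image.set_shape([216, 216, 1])

# train.batch adds a second queue (and QueueRunner) that groups single
# examples into fixed-size batches.
images, label_batch = tf.train.batch([image, label], batch_size=5)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # start_queue_runners launches the threads that keep both queues fed.
    coord = tf.train.Coordinator()
    threads = tf.train.start_queue_runners(sess=sess, coord=coord)
    batch_x, batch_y = sess.run([images, label_batch])
    coord.request_stop()
    coord.join(threads)

And here's my actual code: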
import tensorflow as tf
import numpy as np
from tensorflow.python.framework import dtypes
# Input - 216x216x1 images; ~900 training images, ~350 validation
# Want to do batches of 5 for training, 20 for validation
learn_rate = .0001
drop_keep = 0.9
train_batch = 5
test_batch = 20
epochs = 1
iterations = int((885/train_batch) * epochs)
#
#
# A BUNCH OF (graph-building) HELPER DEFINITIONS EXCLUDED FOR BREVITY
#
#
#x_init will be fed a list of .jpg filenames (ex: [/file0.jpg, /file1.jpg, ...])
#y_init will be fed an array of one-hot classes (ex: [[0,1,0], [1,0,0], ...])
sess = tf.InteractiveSession()
with tf.name_scope('input'):
    batch_size = tf.placeholder(tf.int32)
    keep_prob = tf.placeholder(tf.float32)
    x_init = tf.placeholder(dtype=tf.string, shape=(None))
    y_init = tf.placeholder(dtype=np.int32, shape=(None, 3))  # 3 classes
    image, label = tf.train.slice_input_producer([x_init, y_init])
    file = tf.read_file(image)
    image = tf.image.decode_jpeg(file, channels=1)
    image = tf.cast(image, tf.float32)
    image.set_shape([216, 216, 1])
    label = tf.cast(label, tf.int32)
    images, labels = tf.train.batch([image, label], batch_size=batch_size)
conv1 = conv_layer(images, [5,5,1], 40, 'conv1')
#
#
# skip the rest of graph defining/functions (merged,train_step)
# very similar to what is found in the MNIST example.
#
#
tf.summary.scalar('accuracy', accuracy)
merged = tf.summary.merge_all()
train_writer = tf.summary.FileWriter(OUTPUT_LOC + '/train',sess.graph)
test_writer = tf.summary.FileWriter(OUTPUT_LOC + '/test')
sess.run(tf.global_variables_initializer())
#xTrain, yTrain, xTest, yTest are the train/valid images/labels lists
def feed_dict(train=True):
    if train:
        batch = train_batch
        keep = drop_keep
        xval = xTrain
        yval = yTrain
    else:
        batch = test_batch
        keep = 1
        xval = xTest
        yval = yTest
    return {x_init: xval, y_init: yval, batch_size: batch, keep_prob: keep}
#Everything works up to this point; running the line that creates "threads" below triggers the error.
coord = tf.train.Coordinator()
threads = tf.train.start_queue_runners(sess=sess,coord=coord)
#Don't know what works here or what doesn't.
for i in range(iterations):
    if i % 10 == 0:  # validation pass
        summary, acc = sess.run([merged, accuracy], feed_dict=feed_dict(False))
        test_writer.add_summary(summary, i)
        print('Accuracy at step %s: %s' % (i, acc))
    else:
        if i % 100 == 99:  # training step, recording full trace metadata
            run_options = tf.RunOptions(trace_level=tf.RunOptions.FULL_TRACE)
            run_metadata = tf.RunMetadata()
            summary, _ = sess.run([merged, train_step], feed_dict=feed_dict(True), options=run_options, run_metadata=run_metadata)
            train_writer.add_run_metadata(run_metadata, 'step%03d' % i)
            train_writer.add_summary(summary, i)
            print('Adding run metadata for', i)
        else:  # Record a summary
            summary, _ = sess.run([merged, train_step], feed_dict=feed_dict(True))
            train_writer.add_summary(summary, i)
coord.request_stop()
coord.join(threads)
train_writer.close()
test_writer.close()
sess.close()