
I am trying to build a CNN+LSTM+CTC model for word recognition.
I start with an image of a word, extract features from it with a CNN, and build a sequence of feature vectors that I pass as sequential data to an RNN.

This is how I convert the features into sequential data:
[[a1,b1,c1],[a2,b2,c2],[a3,b3,c3]] -> [[a1,a2,a3],[b1,b2,b3],[c1,c2,c3]]
where a, b, c are features extracted by the CNN.
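
Just to make that conversion concrete (this is only an illustration, not part of the model code), the mapping above is simply a transpose of the feature array:

import numpy as np

# Each row holds the CNN features at one position; transposing swaps the
# two axes so that each row becomes one feature's sequence.
cols = np.array([["a1", "b1", "c1"],
                 ["a2", "b2", "c2"],
                 ["a3", "b3", "c3"]])
seq = cols.T  # [["a1","a2","a3"], ["b1","b2","b3"], ["c1","c2","c3"]]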
Currently I can pass a constant batch size to the model as common.BATCH_SIZE, but what I want is to be able to pass a variable batch_size to the model.
How can this be done?

inputs = tf.placeholder(tf.float32, [common.BATCH_SIZE, common.OUTPUT_SHAPE[1], common.OUTPUT_SHAPE[0], 1])
# Here we use sparse_placeholder that will generate a
# SparseTensor required by ctc_loss op.
targets = tf.sparse_placeholder(tf.int32)

# 1d array of size [batch_size]
seq_len = tf.placeholder(tf.int32, [common.BATCH_SIZE])

model = tf.layers.conv2d(inputs, 64, (3, 3), strides=(1, 1), padding='same', name='c1')
model = tf.layers.max_pooling2d(model, (3, 3), strides=(2, 2), padding='same', name='m1')
model = tf.layers.conv2d(model, 128, (3, 3), strides=(1, 1), padding='same', name='c2')
model = tf.layers.max_pooling2d(model, (3, 3), strides=(2, 2), padding='same', name='m2')
model = tf.transpose(model, [3, 0, 1, 2])
shape = model.get_shape().as_list()
model = tf.reshape(model, [shape[0], -1, shape[2] * shape[3]])

cell = tf.nn.rnn_cell.LSTMCell(common.num_hidden, state_is_tuple=True)
cell = tf.nn.rnn_cell.DropoutWrapper(cell, input_keep_prob=0.5, output_keep_prob=0.5)
stack = tf.nn.rnn_cell.MultiRNNCell([cell] * common.num_layers, state_is_tuple=True)

outputs, _ = tf.nn.dynamic_rnn(cell, model, seq_len, dtype=tf.float32, time_major=True)



Update:

batch_size = tf.placeholder(tf.int32, None, name='batch_size')

inputs = tf.placeholder(tf.float32, [batch_size, common.OUTPUT_SHAPE[1], common.OUTPUT_SHAPE[0], 1])
# Here we use sparse_placeholder that will generate a
# SparseTensor required by ctc_loss op.
targets = tf.sparse_placeholder(tf.int32)

# 1d array of size [batch_size]
seq_len = tf.placeholder(tf.int32, [batch_size])

model = tf.layers.conv2d(inputs, 64, (3, 3), strides=(1, 1), padding='same', name='c1')
model = tf.layers.max_pooling2d(model, (3, 3), strides=(2, 2), padding='same', name='m1')
model = tf.layers.conv2d(model, 128, (3, 3), strides=(1, 1), padding='same', name='c2')
model = tf.layers.max_pooling2d(model, (3, 3), strides=(2, 2), padding='same', name='m2')
model = tf.transpose(model, [3, 0, 1, 2])
shape = model.get_shape().as_list()
model = tf.reshape(model, [shape[0], -1, shape[2] * shape[3]])

cell = tf.nn.rnn_cell.LSTMCell(common.num_hidden, state_is_tuple=True)
cell = tf.nn.rnn_cell.DropoutWrapper(cell, input_keep_prob=0.5, output_keep_prob=0.5)
stack = tf.nn.rnn_cell.MultiRNNCell([cell] * common.num_layers, state_is_tuple=True)

outputs, _ = tf.nn.dynamic_rnn(cell, model, seq_len, dtype=tf.float32, time_major=True)


I am getting an error like the following:

Traceback (most recent call last):
  File "lstm_and_ctc_ocr_train.py", line 203, in <module>
    train()
  File "lstm_and_ctc_ocr_train.py", line 77, in train
    logits, inputs, targets, seq_len, batch_size = model.get_train_model()
  File "/home/himanshu/learning-tf/tf/code/tensorflow_lstm_ctc_ocr/model.py", line 20, in get_train_model
    inputs = tf.placeholder(tf.float32, [batch_size, common.OUTPUT_SHAPE[1], common.OUTPUT_SHAPE[0], 1])
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/array_ops.py", line 1530, in placeholder
    return gen_array_ops._placeholder(dtype=dtype, shape=shape, name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/ops/gen_array_ops.py", line 1954, in _placeholder
    name=name)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 705, in apply_op
    attr_value.shape.CopyFrom(_MakeShape(value, key))
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/op_def_library.py", line 198, in _MakeShape
    return tensor_shape.as_shape(v).as_proto()
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/tensor_shape.py", line 798, in as_shape
    return TensorShape(shape)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/tensor_shape.py", line 434, in __init__
    self._dims = [as_dimension(d) for d in dims_iter]
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/tensor_shape.py", line 376, in as_dimension
    return Dimension(value)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/tensor_shape.py", line 32, in __init__
    self._value = int(value)
TypeError: int() argument must be a string or a number, not 'Tensor'

1 Answer


You should be able to pass batch_size to the dynamic RNN as a placeholder. In my experience, the only headache you may run into with that is if you don't specify its shape in advance, so you should pass [] to make things work, like this:

batchsize = tf.placeholder(tf.int32, [], name='batchsize')

Then feed its value during sess.run() in the usual way. This worked well for me for training with a large batch size and then generating with a batch size of 1.

But strictly speaking, you don't even need to specify the batch size for dynamic_rnn itself, do you? You need it if you use MultiRNNCell to get the zero state, but I don't see you doing that in your code...
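
For reference, here is a minimal sketch of the pattern I mean (the sizes and names are made up for illustration, not taken from your common module): leave the batch dimension of the input placeholders as None, and use the scalar batchsize placeholder only where the graph genuinely needs a batch size, such as the zero state of a MultiRNNCell:

import tensorflow as tf

# Hypothetical sizes, for illustration only.
HEIGHT, WIDTH, NUM_HIDDEN, NUM_LAYERS = 32, 100, 128, 2

batchsize = tf.placeholder(tf.int32, [], name='batchsize')

# Leave the batch dimension as None; never put a tensor into a placeholder shape.
inputs = tf.placeholder(tf.float32, [None, HEIGHT, WIDTH, 1], name='inputs')
seq_len = tf.placeholder(tf.int32, [None], name='seq_len')

cell = tf.nn.rnn_cell.LSTMCell(NUM_HIDDEN, state_is_tuple=True)
stack = tf.nn.rnn_cell.MultiRNNCell([cell] * NUM_LAYERS, state_is_tuple=True)

# The scalar placeholder is only needed where a concrete batch size is
# required, e.g. to build the initial state of the stacked cell.
initial_state = stack.zero_state(batchsize, tf.float32)

At session time you would feed it alongside the data, e.g. feed_dict={inputs: x, seq_len: lengths, batchsize: len(x)}.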

*** Update:

As discussed in the comments, your problem doesn't seem to have anything to do with dynamic_rnn; it is more about the fact that you are using one placeholder to specify the shape of other placeholders (inputs and seq_len). Here is code that reproduces the same error:

import tensorflow as tf

a = tf.placeholder(tf.int32, None, name='a')
b = tf.placeholder(tf.int32, [a, 5], name='b')
c = b * 5

with tf.Session() as sess:
    C = sess.run(c, feed_dict={a:1, b:[[1,2,3,4,5]]})

And here is the error:

TypeError: int() argument must be a string, a bytes-like object or a number, not 'Tensor'

I would sort this out before bringing dynamic_rnn into the picture.
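
In case it helps, here is one way to fix the toy example above (a sketch, not the only option): use None for the unknown dimension instead of the placeholder a, and recover the runtime batch size with tf.shape if you ever need it as a tensor:

import tensorflow as tf

# Use None for the dimension that varies; don't feed a tensor into a shape.
b = tf.placeholder(tf.int32, [None, 5], name='b')
c = b * 5
batch = tf.shape(b)[0]  # the actual batch size at run time, as a tensor

with tf.Session() as sess:
    C, n = sess.run([c, batch], feed_dict={b: [[1, 2, 3, 4, 5]]})
    # C == [[5, 10, 15, 20, 25]], n == 1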

Answered 2017-07-10T13:29:02.157