I want to generate text with TensorFlow and have been modifying the LSTM tutorial (https://www.tensorflow.org/versions/master/tutorials/recurrent/index.html#recurrent-neural-networks) code to do this, but my initial solution seems to produce nonsense, and it does not improve even after training for a long time. I don't see why. The idea is to start with a zero matrix and then generate the words one at a time.
Here is the code; I added the two functions below to https://tensorflow.googlesource.com/tensorflow/+/master/tensorflow/models/rnn/ptb/ptb_word_lm.py
The generator looks like this:
def generate_text(session, m, eval_op):
    state = m.initial_state.eval()
    x = np.zeros((m.batch_size, m.num_steps), dtype=np.int32)
    output = str()
    for i in xrange(m.batch_size):
        for step in xrange(m.num_steps):
            try:
                # Run the batch.
                # targets have to be set, but m is the validation model,
                # so it should not train the neural network
                cost, state, _, probabilities = session.run(
                    [m.cost, m.final_state, eval_op, m.probabilities],
                    {m.input_data: x, m.targets: x, m.initial_state: state})
                # Sample a word id and add it to the matrix and to the output
                word_id = sample(probabilities[0, :])
                output = output + " " + reader.word_from_id(word_id)
                x[i][step] = word_id
            except ValueError as e:
                print("ValueError")
    print(output)
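For reference, I call this after the training loop in main() of ptb_word_lm.py, roughly like this (mvalid is the validation model built with reuse=True in the tutorial, and tf.no_op() is passed as eval_op so nothing gets trained):

# after training, inside the same session as in main()
generate_text(session, mvalid, tf.no_op())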
I have added a variable "probabilities" to the PTBModel; it is simply a softmax over the logits.
self._probabilities = tf.nn.softmax(logits)
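This line sits in PTBModel.__init__, right after logits is computed as in the tutorial. Since generate_text reads m.probabilities, the tensor is also exposed through a property, like the existing cost and final_state (a sketch of the class layout):

@property
def probabilities(self):
    return self._probabilities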
And the sampling function:
def sample(a, temperature=1.0):
    # helper function to sample an index from a probability array;
    # scale the log-probabilities by the temperature and renormalize:
    # lower temperatures sharpen the distribution, higher ones flatten it
    a = np.log(a) / temperature
    a = np.exp(a) / np.sum(np.exp(a))
    # note: np.random.multinomial raises ValueError when the (float32)
    # probabilities sum to slightly more than 1, which is what the
    # try/except in generate_text catches
    return np.argmax(np.random.multinomial(1, a, 1))
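To sanity-check the sampler in isolation, a toy run like this (the probabilities are made up) shows the temperature doing what I expect:

import numpy as np

probs = np.array([0.6, 0.3, 0.1])
for t in (0.2, 1.0, 2.0):
    draws = [sample(probs, temperature=t) for _ in range(1000)]
    # low temperature -> almost always index 0; high temperature -> flatter counts
    print(t, np.bincount(draws, minlength=len(probs)))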