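According to the TensorFlow documentation for tf.nn.raw_rnn, you can give cell_output your own shape at the initial time step (time = 0) in order to emit your own output.
Currently this only works if we set the initial (time = 0) cell output to None and make every later emitted output equal to the previous cell output, but that is not what we want. We want to emit our own output, not the internal state of the RNN cell.
Here is the relevant source code: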
    import numpy as np
    import tensorflow as tf

    # The functions below are methods of the model class (hence `self`).
    def get_loop_fn(self, encoder_final_state, eos_token, batch_size, WW, bb):
        def loop_fn(time, previous_output, previous_state, previous_loop_state):
            # inputs: time, previous_cell_output, previous_cell_state, previous_loop_state
            # outputs: elements_finished, input, cell_state, output, loop_state
            if previous_state is None:  # time == 0
                assert previous_output is None
                return self.loop_fn_initial(encoder_final_state=encoder_final_state, eos_token=eos_token,
                                            batch_size=batch_size)
            else:
                return self.loop_fn_transition(previous_cell_output=previous_output,
                                               previous_cell_state=previous_state,
                                               batch_size=batch_size, WW=WW, bb=bb)

        return loop_fn

    def loop_fn_initial(self, encoder_final_state, eos_token, batch_size):
        # https://www.tensorflow.org/api_docs/python/tf/nn/raw_rnn
        # here we have a static RNN case, so the full length should be taken into account,
        # so I guess the finished flags are always all False
        # From documentation: a boolean Tensor of shape [batch_size]
        initial_elements_finished = self.all_elems_non_finished(batch_size=batch_size)
        initial_input = eos_token
        initial_cell_state = encoder_final_state
        # initial_cell_output = None
        # give it the shape that we want:
        initial_cell_output = tf.constant(np.zeros(shape=(batch_size, self.TARGET_FEATURE_LEN)), dtype=tf.float32)
        initial_loop_state = None  # we don't need to pass any additional information
        return (initial_elements_finished,
                initial_input,
                initial_cell_state,
                initial_cell_output,
                initial_loop_state)

    def loop_fn_transition(self, previous_cell_output, previous_cell_state, batch_size, WW, bb):
        """note that the matrix W is going to be shared among outputs"""
        print("previous cell output")
        print(previous_cell_output)
        print("")
        finished = self.all_elems_non_finished(batch_size=batch_size)  # (time >= decoder_lengths)
        # this operation produces a boolean tensor of [batch_size] defining if the corresponding sequence has ended
        # this is always false in our case, so the next two lines are commented out
        # finished = tf.reduce_all(elements_finished)  # -> boolean scalar
        # input = tf.cond(finished, lambda: pad_step_embedded, get_next_input)
        next_input = tf.add(tf.matmul(previous_cell_output, WW), bb)
        next_cell_state = previous_cell_state
        emit_output = next_input,  # previous_cell_output
        next_loop_state = None  # we don't need to pass any additional information
        return (finished,
                next_input,
                next_cell_state,
                emit_output,
                next_loop_state)
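This is the error message we get when trying to build the graph:
ValueError: The two structures don't have the same nested structure. First structure: , second structure: (,).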
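One lead we have not confirmed: the line `emit_output = next_input,` ends with a comma, which in Python builds a one-element tuple, while the output returned at time == 0 is a bare tensor. That would match the `(,)` in the second structure of the error. Below is a minimal sketch in plain Python, with no TensorFlow involved. `same_structure` is a hypothetical helper written only for illustration (it is not a TensorFlow function), and the string stands in for a tensor.

```python
def same_structure(a, b):
    """Hypothetical helper: True if a and b have the same tuple/list nesting.

    Leaf values are ignored; only the nesting is compared.
    """
    a_is_seq = isinstance(a, (tuple, list))
    b_is_seq = isinstance(b, (tuple, list))
    if a_is_seq != b_is_seq:
        return False  # one side is a leaf, the other a sequence
    if not a_is_seq:
        return True  # both are leaves
    if type(a) is not type(b) or len(a) != len(b):
        return False
    return all(same_structure(x, y) for x, y in zip(a, b))


initial_emit = "tensor"       # stand-in for the tensor returned at time == 0
with_comma = initial_emit,    # trailing comma: a one-element tuple
without_comma = initial_emit  # no trailing comma: the bare value

print(type(with_comma).__name__)                    # tuple
print(same_structure(initial_emit, with_comma))     # False
print(same_structure(initial_emit, without_comma))  # True
```

If this is indeed the cause, dropping the trailing comma in loop_fn_transition should make both steps emit the same structure, though we have not verified that against raw_rnn.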
Any ideas are welcome.