I am trying to train CartPole-v0 using Q-learning. When I try to update the replay buffer with an experience, I get the following error:
ValueError: Cannot feed value of shape (128,) for Tensor 'Placeholder_1:0', which has shape '(?, 2)'
The relevant code snippet is:
    def update_replay_buffer(replay_buffer, state, action, reward, next_state, done, action_dim):
        # append to buffer
        experience = (state, action, reward, next_state, done)
        replay_buffer.append(experience)
        # Ensure replay_buffer doesn't grow larger than REPLAY_SIZE
        if len(replay_buffer) > REPLAY_SIZE:
            replay_buffer.pop(0)
        return None
The placeholder being fed is:
    action_in = tf.placeholder("float", [None, action_dim])
Can someone clarify how action_dim should be used to resolve this error?
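
One possible reading of the error, not confirmed against the full training code: the fed value has shape (128,), which looks like a batch of 128 integer action indices, while the placeholder expects one row of length action_dim per sample. If that is the case, one-hot encoding the actions before feeding them would produce the expected (?, 2) shape. Below is a minimal sketch under that assumption; `one_hot_actions` and `action_batch` are hypothetical names that do not appear in the code above, and the random batch only stands in for actions sampled from the replay buffer.

```python
import numpy as np

def one_hot_actions(actions, action_dim):
    """Convert integer action indices of shape (batch,) into
    one-hot rows of shape (batch, action_dim)."""
    actions = np.asarray(actions, dtype=np.int64)
    one_hot = np.zeros((actions.shape[0], action_dim), dtype=np.float32)
    # Set a single 1.0 per row, in the column given by the action index
    one_hot[np.arange(actions.shape[0]), actions] = 1.0
    return one_hot

# Hypothetical batch of 128 CartPole actions (each is 0 or 1)
action_batch = np.random.randint(0, 2, size=128)
encoded = one_hot_actions(action_batch, action_dim=2)
print(encoded.shape)  # (128, 2)
```

The same encoding could instead be applied to each action when it is stored in update_replay_buffer, which would be one way to make use of the action_dim argument there.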