When I try to train an agent with a batch_size greater than 1, it raises an exception. What am I doing wrong?
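from keras.layers import Input, Embedding, Bidirectional, LSTM, Dense
from keras.models import Model
from keras.optimizers import Adam
from rl.agents.dqn import DQNAgent
from rl.policy import BoltzmannQPolicy
# env, PrioritizedMemory and RecoProcessor are not shown here.

lr = 1e-3
window_length = 1
emb_size = 10
look_back = 6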
# "Expert" (regular dqn) model architecture
inp = Input(shape=(look_back,))
emb = Embedding(input_dim=env.action_space.n + 1, output_dim=emb_size)(inp)
rnn = Bidirectional(LSTM(5))(emb)
out = Dense(env.action_space.n, activation='softmax')(rnn)
expert_model = Model(inputs=inp, outputs=out)
expert_model.compile(loss='categorical_crossentropy', optimizer=Adam(lr))
expert_model.summary()  # summary() prints by itself and returns None
# memory
memory = PrioritizedMemory(limit=1000000, window_length=window_length)
# policy
policy = BoltzmannQPolicy()
# agent
dqn = DQNAgent(model=expert_model, nb_actions=env.action_space.n, policy=policy, memory=memory,
               enable_double_dqn=False, enable_dueling_network=False, gamma=.9,
               batch_size=100,  # <-- batch_size > 1 is set here
               target_model_update=1e-2, processor=RecoProcessor())
I printed some values directly from inside the keras-rl code, and it gave me this output:
State[array([0., 0., 0., 0., 0., 0.])]
Batch: [[[0. 0. 0. 0. 0. 0.]]]
But it is followed by this exception:
ValueError: Error when checking input: expected input_1 to have 2 dimensions, but got array with shape (1, 1, 6)
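To make the shapes concrete, here is a small NumPy sketch that reproduces the printed values outside of keras-rl. It rests on my assumption that each state is a list of the last window_length observations and that the batch is built by stacking such states; `squeeze_window` is a hypothetical helper of mine, not a keras-rl function.

```python
import numpy as np

window_length = 1
look_back = 6

# One observation, shaped like the ones printed above.
observation = np.zeros(look_back)

# Assumption: a state is a list of the last `window_length` observations.
state = [observation] * window_length

# Stacking a list of states yields (batch, window_length, look_back).
batch = np.array([state])
print(batch.shape)  # (1, 1, 6)


def squeeze_window(batch):
    """Hypothetical helper: drop the window axis when window_length == 1."""
    return np.squeeze(batch, axis=1)


flat = squeeze_window(batch)
print(flat.shape)  # (1, 6)
```

If this reading is right, the model declared with Input(shape=(look_back,)) expects the second shape while the agent passes the first. A reshape like this might belong in the processor's process_state_batch method, but I have not confirmed that.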
I can post the code of my processor class, which I suspect is the key, but first I want to make sure nothing is wrong in the code above.