I want to define my own custom regularizer, and I am training with GradientTape. I'm using the code below, but no matter how large I make the regularization factor, the results never change. Does anyone know how I can get my custom regularizer to work?
My model:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

inputs = layers.Input(shape=(state_dim,))
hidden1 = layers.Dense(units=40, activation=keras.layers.LeakyReLU(alpha=0.5),
                       kernel_regularizer=sparse_reg,
                       kernel_initializer=keras.initializers.HeUniform(seed=seed),
                       bias_initializer=keras.initializers.Zeros())(inputs)
hidden2 = layers.Dense(units=15, activation=keras.layers.LeakyReLU(alpha=0.5),
                       kernel_initializer=keras.initializers.HeUniform(seed=seed),
                       bias_initializer=keras.initializers.Zeros())(hidden1)
q_values = layers.Dense(units=action_dim,
                        activation="linear",
                        kernel_initializer=keras.initializers.HeUniform(seed=seed))(hidden2)
deep_q_network = keras.Model(inputs=inputs, outputs=q_values)
My custom regularizer:
import numpy as np

def sparse_reg(weight_matrix):
    cumWeightPerInput = np.sum(np.abs(weight_matrix), axis=1)
    penalty = tf.reduce_sum(np.sqrt(cumWeightPerInput))
    return 0.01 * penalty
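To make clear what the penalty is supposed to compute, here is the same arithmetic on a tiny hypothetical weight matrix (pure NumPy, values made up for illustration): the L1 norm of each input's row of outgoing weights, then the square root of each row sum, summed and scaled by 0.01, i.e. a group-lasso-style penalty that encourages whole input rows to go to zero.

```python
import numpy as np

# Hypothetical toy kernel: 2 inputs x 2 units, values chosen for illustration.
W = np.array([[3.0, -1.0],
              [0.0,  4.0]])

# L1 norm of each row (one row = all outgoing weights of one input).
cum_weight_per_input = np.sum(np.abs(W), axis=1)   # -> [4.0, 4.0]

# Square root per row, summed, scaled: the value sparse_reg should return.
penalty = 0.01 * np.sum(np.sqrt(cum_weight_per_input))
print(penalty)  # 0.04
```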
My training step:
with tf.GradientTape() as tape:
    currentQvalues = mainNetwork(S, training=True)
    loss_value = self.lossFunction(targetQvalues, currentQvalues)
    loss_regularization = tf.math.add_n(mainNetwork.losses)
    loss_value = loss_value + loss_regularization
grads = tape.gradient(loss_value, mainNetwork.trainable_variables)
opt.apply_gradients(zip(grads, mainNetwork.trainable_variables))
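For context, my understanding of the step above is that the total loss is just the base Q-learning loss plus the sum of the per-layer penalties collected in mainNetwork.losses; a toy NumPy sketch with made-up numbers:

```python
import numpy as np

# Stand-in values, purely illustrative.
base_loss = 0.5                 # plays the role of lossFunction(target, current)
layer_penalties = [0.04, 0.02]  # plays the role of mainNetwork.losses

# Mirrors tf.math.add_n(mainNetwork.losses) followed by the addition.
total_loss = base_loss + np.sum(layer_penalties)
print(total_loss)  # 0.56
```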