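Has anyone tried to implement the well-known Levenberg-Marquardt algorithm in TensorFlow? I ran into several problems with the parameter update while trying to implement it. The following snippet shows my implementation of the update function: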
import tensorflow as tf

def func_var_update(cost, parameters):
    # Compute gradients or Jacobians for cost with respect to parameters
    dloss_dw = tf.gradients(cost, parameters)[0]
    # Return dimension of gradient vector
    dim, _ = dloss_dw.get_shape()
    # Compute Hessian matrix using results of gradients
    hess = []
    for i in range(dim):
        # Compute gradient or Jacobian matrix for loss function
        dfx_i = tf.slice(dloss_dw, begin=[i, 0], size=[1, 1])
        ddfx_i = tf.gradients(dfx_i, parameters)[0]
        # Get the actual tensors at the end of tf.gradients
        hess.append(ddfx_i)
    hess = tf.squeeze(hess)
    dfw_new = tf.diag(dloss_dw)
    # Update factor consisting of the Hessian, product of identity matrix and Jacobian vector
    JtJ = tf.linalg.inv(tf.ones((parameters.shape[0], parameters.shape[0])) + hess)
    # Product of gradient and damping parameter
    pdt_JtJ = tf.matmul(JtJ, dloss_dw)
    # Performing update here
    new_params = tf.assign(parameters, parameters - pdt_JtJ)
    return new_params
Calling it like this:
def mainfunc():
    with tf.Session() as sess:
        .....
        vec_up = sess.run(func_var_update(cost, parameters), feed_dict=....)
results in the following error:
InvalidArgumentError (see above for traceback): Input is not invertible.
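One possible cause, sketched here in NumPy rather than TensorFlow: the update adds a matrix of ones (`tf.ones`) to the Hessian, whereas Levenberg-Marquardt damping adds a scaled identity matrix, `lam * I`. A matrix of all ones has rank 1, so adding it does not guarantee invertibility. By contrast, `J^T J + lam * I` is positive definite for any `lam > 0`. The Jacobian `J` and the damping value `lam` below are made-up illustrative values, not taken from the post.

```python
import numpy as np

# Hypothetical problem with 1 residual and 2 parameters (assumed values).
J = np.array([[1.0, 1.0]])
H = J.T @ J                       # Gauss-Newton approximation J^T J, rank 1

n = H.shape[0]
with_ones = H + np.ones((n, n))   # analogue of tf.ones((n, n)) + hess
lam = 1e-2                        # assumed damping factor
with_eye = H + lam * np.eye(n)    # Levenberg-Marquardt damping term

rank_ones = np.linalg.matrix_rank(with_ones)  # rank-deficient, cannot be inverted
rank_eye = np.linalg.matrix_rank(with_eye)    # full rank, safe to solve
print(rank_ones, rank_eye)
```

In TensorFlow the identity matrix is available as `tf.eye`. Solving the damped system instead of forming an explicit inverse is usually the more robust choice.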
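However, when I print them at run time, the dimensions of the Jacobian/gradient and of the Hessian look fine. Another problem is that I cannot track the parameters after each update and then adapt them to my needs before feeding them back to the optimizer. I would like to keep some parameters fixed and compute the Hessian and Jacobian only for the others while the optimization runs. Any help would be appreciated.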
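To make the second goal concrete, here is a minimal NumPy sketch (not TensorFlow) of a Levenberg-Marquardt loop that holds some parameters fixed and records the parameters after every accepted update. The exponential model, the function names `residuals`, `jacobian`, and `lm_fit`, the `free` mask, and the factor-of-10 damping schedule are all assumptions chosen for illustration.

```python
import numpy as np

def residuals(p, x, y):
    # Hypothetical model: y ~ p[0] * exp(p[1] * x) + p[2]
    return p[0] * np.exp(p[1] * x) + p[2] - y

def jacobian(p, x):
    # Analytic Jacobian of the residuals, one column per parameter
    e = np.exp(p[1] * x)
    return np.stack([e, p[0] * x * e, np.ones_like(x)], axis=1)

def lm_fit(p0, x, y, free, lam=1e-2, n_iter=100):
    p = np.asarray(p0, dtype=float).copy()
    free = np.asarray(free, dtype=bool)
    history = [p.copy()]                        # parameters after each accepted update
    for _ in range(n_iter):
        r = residuals(p, x, y)
        J = jacobian(p, x)[:, free]             # drop columns of fixed parameters
        A = J.T @ J + lam * np.eye(J.shape[1])  # damped normal equations
        step = np.linalg.solve(A, J.T @ r)
        trial = p.copy()
        trial[free] -= step                     # fixed entries are never touched
        if np.sum(residuals(trial, x, y) ** 2) < np.sum(r ** 2):
            p, lam = trial, lam / 10.0          # accept the step, reduce damping
            history.append(p.copy())
        else:
            lam *= 10.0                         # reject the step, increase damping
    return p, history

x = np.linspace(0.0, 1.0, 20)
true_p = np.array([2.0, -1.5, 0.5])
y = true_p[0] * np.exp(true_p[1] * x) + true_p[2]  # noise-free synthetic data

# Keep p[2] fixed at 0.5 and fit only p[0] and p[1].
p_fit, history = lm_fit([1.0, -1.0, 0.5], x, y, free=[True, True, False])
print(p_fit)
```

Because the mask removes the columns of the fixed parameters before the normal equations are built, the damped matrix is only as large as the number of free parameters. The `history` list is the hook for inspecting or adjusting the parameters between updates.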