python - Using GradientTape for the Jacobian of an LSTM model

Question

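I am building a sequence prediction model with an LSTM. My data has 4 input variables and 1 output variable to be predicted. It is time-series data with a total length of 38265 time steps, stored in a DataFrame of size 38265 * 5.

I want to use the previous 20 time steps of the 4 input variables to predict my output variable. I am using the code below for this purpose.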

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense

model = Sequential()
model.add(LSTM(units=120, activation='relu', return_sequences=False,
               input_shape=(train_in.shape[1], 5)))
model.add(Dense(100, activation='relu'))
model.add(Dense(50, activation='relu'))
model.add(Dense(1))

I want to use tf.GradientTape to compute the Jacobian of the output variable of the LSTM model function. Can anyone help me with this?

score 1 · Accepted Answer

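The Jacobian of the output with respect to the LSTM input can be obtained as follows:

Using tf.GradientTape(), we can compute the Jacobian that results from the gradient flow.
However, to obtain the Jacobian, the input needs to be a tf.EagerTensor, which is usually what we have when we want the Jacobian of the output (after executing y = model(x)). The following snippet shares the idea: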

# Get the Jacobian for each persistent gradient evaluation
import tensorflow as tf
model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(2,activation='relu'))
model.add(tf.keras.layers.Dense(2,activation='relu'))
x = tf.constant([[5., 6., 3.]])

with tf.GradientTape(persistent=True,watch_accessed_variables=True) as tape:
  # Forward pass
  tape.watch(x)
  y = model(x)
  loss = tf.reduce_mean(y**2)
print('Gradients\n')
jacobian_wrt_loss=tape.jacobian(loss,x)
print(f'{jacobian_wrt_loss}\n')
jacobian_wrt_y=tape.jacobian(y,x)
print(f'{jacobian_wrt_y}\n')
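The same pattern carries over to a recurrent model shaped like the one in the question. The following is only a minimal sketch under stated assumptions: the network is a small, untrained stand-in (8 LSTM units instead of 120), the input is random placeholder data for one window of 20 time steps and 5 features, and `experimental_use_pfor=False` is a cautious choice rather than a requirement. Because the output is a single scalar, the Jacobian should agree with `tape.gradient`.

```python
import numpy as np
import tensorflow as tf

tf.random.set_seed(0)

# Hypothetical stand-in for the question's network (smaller and untrained).
model = tf.keras.Sequential([
    tf.keras.layers.LSTM(8, activation='relu'),
    tf.keras.layers.Dense(1),
])

# One window: batch of 1, 20 time steps, 5 features (random placeholder data).
x = tf.random.normal((1, 20, 5))

with tf.GradientTape(persistent=True) as tape:
    tape.watch(x)   # x is a plain tensor, not a variable, so it must be watched
    y = model(x)    # shape (1, 1)

# Skipping the vectorized (pfor) path is slower but may be more forgiving
# with recurrent layers; a persistent tape is needed for this in eager mode.
jac = tape.jacobian(y, x, experimental_use_pfor=False)  # shape = y.shape + x.shape
grad = tape.gradient(y, x)                              # shape (1, 20, 5)
del tape

# Drop the singleton axes: sensitivity[t, f] = d y / d x[0, t, f]
sensitivity = tf.reshape(jac, (20, 5))
print(jac.shape, sensitivity.shape)
```

Reshaping to (20, 5) gives one sensitivity value per time step and feature, which is often the easiest form to inspect or plot.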

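For intermediate outputs, as in this case, there are already many examples that use Keras. However, when we take the outputs from model.layers.output, the type we get is a KerasTensor rather than an EagerTensor, and to build the Jacobian we need an EagerTensor. (I arrived at this after several failed attempts at wrapping with @tf.function, given that eager execution is already the default in TF > 2.0.)
Alternatively, an auxiliary model can be created with just the required layers (in this case, only the Input and LSTM layers). The output of this model will be a tf.EagerTensor, which is what the Jacobian computation needs. This is shown in the following snippet: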

# General syntax for getting Jacobians for each layer output
import numpy as np
import tensorflow as tf
tf.executing_eagerly()  # should be True; eager execution is the default in TF 2.x
x=tf.constant([[15., 60., 32.]])
# x_inp = tf.keras.layers.Input(tensor=x)  # not used below
model=tf.keras.Sequential()
model.add(tf.keras.layers.Dense(2,activation='relu',name='dense_1'))
model.add(tf.keras.layers.Dense(2,activation='relu',name='dense_2'))

aux_model=tf.keras.Sequential()
# Reuse the first Dense layer of `model` so that both models share the same weights
aux_model.add(model.layers[0])
#model.compile(loss='sparse_categorical_crossentropy',optimizer='adam',metrics=['accuracy'])

with tf.GradientTape(persistent=True,watch_accessed_variables=True) as tape:
  # Forward pass
  tape.watch(x)
  x_y = model(x)
  act_y=aux_model(x)
  print(x_y,type(x_y))
  # layer.output gives symbolic Keras tensors rather than EagerTensors, so it is not used here:
  # ops=[layer.output for layer in model.layers]
  # inps=[layer.input for layer in model.layers]

print('Jacobian of Full FFNN\n')
jacobian=tape.jacobian(x_y,x)
print(f'{jacobian[0]}\n')

print('Jacobian of FFNN with Just first Dense\n')
jacobian=tape.jacobian(act_y,x)
print(f'{jacobian[0]}\n')

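Here I used a simple FFNN made of 2 Dense layers, but I want to evaluate the output of the first Dense layer. So I created an auxiliary model with only 1 Dense layer and computed the Jacobian of its output.

Details can be found here.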
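One way to make sure the auxiliary model really evaluates the same first layer as the full model is to build both from the same layer calls with the Keras functional API. This is a sketch, not the answer's own code: the name `probe` is a hypothetical helper that returns the intermediate and final outputs together, and the last lines compare the tape's Jacobian with the closed form for a ReLU Dense layer.

```python
import numpy as np
import tensorflow as tf

x = tf.constant([[15., 60., 32.]])

inputs = tf.keras.Input(shape=(3,))
hidden = tf.keras.layers.Dense(2, activation='relu', name='dense_1')(inputs)
outputs = tf.keras.layers.Dense(2, activation='relu', name='dense_2')(hidden)

model = tf.keras.Model(inputs, outputs)
# Hypothetical helper model: same layers and weights, two outputs.
probe = tf.keras.Model(inputs, [hidden, outputs])

with tf.GradientTape(persistent=True) as tape:
    tape.watch(x)
    first_out, full_out = probe(x)   # calling on an eager tensor returns eager tensors

jac_first = tape.jacobian(first_out, x)   # shape (1, 2, 1, 3)
jac_full = tape.jacobian(full_out, x)     # shape (1, 2, 1, 3)
del tape

# Closed form for a ReLU Dense layer:
# d out[j] / d x[i] = W[i, j] if unit j is active, else 0.
W = model.get_layer('dense_1').get_weights()[0]    # NumPy array, shape (3, 2)
expected = W.T * (first_out.numpy().T > 0)         # shape (2, 3)
print(np.allclose(jac_first[0, :, 0, :].numpy(), expected, atol=1e-5))
```

Because `probe` and `model` are built from the same layer objects, the intermediate Jacobian describes the trained network rather than a freshly initialized copy.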

score 1 · Accepted Answer

With the help of @Abhilash Majumder, I got this working. I am posting it here in case it helps someone in the future.

import numpy as np
import pandas as pd
import tensorflow as tf
tf.compat.v1.enable_eager_execution() # this enables eager execution, which is required

tf.executing_eagerly() # check whether eager execution is enabled; should give "True"

data = pd.read_excel("FileName or Location ")
# My data is in the form of a DataFrame with 127549 rows and 5 columns (127549*5)

a = data[:20]  #shape is (20,5)
b = data[50:70] # shape is (20,5)
A = [a,b]  # making a list
A = np.array(A) # convert into array size (2,20,5) 

At = tf.convert_to_tensor(A, np.float32) #convert into tensor
At.shape # TensorShape([Dimension(2), Dimension(20), Dimension(5)])

model = tf.keras.models.load_model('EKF-LSTM-1.h5') # Load the trained model
# I have a trained model which is shown in the question above. 
# Output of this model is a single value

with tf.GradientTape(persistent=True,watch_accessed_variables=True) as tape:
    tape.watch(At)
    y1 = model(At) # defining your output as a function of input variables

print(y1,type(y1))

#output
tf.Tensor([[0.04251503],[0.04634088]], shape=(2, 1), dtype=float32) <class 'tensorflow.python.framework.ops.EagerTensor'>

jacobian=tape.jacobian(y1,At) #jacobian of output w.r.t both inputs
jacobian.shape

Output

TensorShape([Dimension(2), Dimension(1), Dimension(2), Dimension(20), Dimension(5)])

Here I computed the Jacobian with respect to 2 inputs, each of size (20,5). If you want the Jacobian with respect to only one input of size (20,5), use this:

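# At[0] is a new tensor that y1 was not computed from, so slice the full Jacobian instead
jacobian=tape.jacobian(y1,At)[0:1,:,0:1] #jacobian of 1st output w.r.t only 1st input in 'At'
jacobian.shape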

Output

TensorShape([Dimension(1), Dimension(1), Dimension(1), Dimension(20), Dimension(5)])
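The shapes reported above can be checked on a small stand-in model. This sketch assumes an untrained 8-unit LSTM and random data in place of the trained `EKF-LSTM-1.h5` model and the real DataFrame windows. It also shows how to read the five axes: block `[i, :, j]` holds the derivative of output `i` with respect to input window `j`. A slice of `At` taken after the forward pass is a new tensor that `y1` was not computed from, so the sketch slices the full Jacobian rather than the input.

```python
import numpy as np
import tensorflow as tf

tf.random.set_seed(0)

# Hypothetical stand-in for the trained model (untrained, smaller).
model = tf.keras.Sequential([
    tf.keras.layers.LSTM(8, activation='relu'),
    tf.keras.layers.Dense(1),
])

At = tf.random.normal((2, 20, 5))   # placeholder for two real (20, 5) windows

with tf.GradientTape(persistent=True) as tape:
    tape.watch(At)
    y1 = model(At)                  # shape (2, 1)

# Cautious, non-vectorized path; needs the persistent tape in eager mode.
jacobian = tape.jacobian(y1, At, experimental_use_pfor=False)  # (2, 1, 2, 20, 5)
del tape

# Block [i, :, j] holds d y1[i] / d At[j].
own = jacobian[0:1, :, 0:1]   # first output w.r.t. first window: (1, 1, 1, 20, 5)
cross = jacobian[0, :, 1]     # first output w.r.t. second window: (1, 20, 5)
print(jacobian.shape, own.shape, float(tf.reduce_max(tf.abs(cross))))
```

The cross blocks are zero here because the samples in a batch do not interact in this model, so half of the full Jacobian carries no information.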

score 0 · Accepted Answer

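For those who want to compute the Jacobian over a batch of inputs and outputs that are independent of each other, that is, where output[j] does not depend on input[i] for i != j, consider the batch_jacobian method.

This reduces the number of dimensions of the computed Jacobian tensor by one, and can be the difference between running out of memory and not.

See: batch_jacobian in the TensorFlow GradientTape documentation.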
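As a rough illustration of the difference, the sketch below computes both the full Jacobian and the batch Jacobian for the same small model and compares them. The model and data are hypothetical placeholders (an untrained 8-unit LSTM and random windows), and `experimental_use_pfor=False` is again just a cautious choice.

```python
import numpy as np
import tensorflow as tf

tf.random.set_seed(0)

# Hypothetical stand-in model; samples in a batch do not interact.
model = tf.keras.Sequential([
    tf.keras.layers.LSTM(8, activation='relu'),
    tf.keras.layers.Dense(1),
])

x = tf.random.normal((2, 20, 5))    # batch of 2 windows (placeholder data)

with tf.GradientTape(persistent=True) as tape:
    tape.watch(x)
    y = model(x)                    # shape (2, 1)

full = tape.jacobian(y, x, experimental_use_pfor=False)              # (2, 1, 2, 20, 5)
per_sample = tape.batch_jacobian(y, x, experimental_use_pfor=False)  # (2, 1, 20, 5)
del tape

# per_sample[b] should match the "diagonal" block full[b, :, b].
diag = tf.stack([full[b, :, b] for b in range(2)])
print(full.shape, per_sample.shape)
print(int(tf.size(full)), int(tf.size(per_sample)))   # element counts
```

With a batch of B samples the full Jacobian stores B times as many elements as the batch Jacobian, which is where the memory saving comes from.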
