python - How do I access tensor values (e.g. metrics) that are updated inside a tf.function?

Question

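I have been working on a model whose training loop is wrapped in tf.function (running eagerly causes an OOM error), and training appears to run fine. However, I cannot access the tensor values returned by my custom training function (below):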

def train_step(inputs, target):
    with tf.GradientTape() as tape:
        predictions = model(inputs, training=True)
        curr_loss = lovasz_softmax_flat(predictions, target)

    gradients = tape.gradient(curr_loss, model.trainable_variables)
    opt.apply_gradients(zip(gradients, model.trainable_variables))
    
    # Need to access this value
    return curr_loss

A simplified version of my "umbrella" training loop looks like this:

@tf.function
def train_loop():
    for epoch in range(EPOCHS):
        for tr_file in train_files:

            tr_inputs = preprocess(tr_file)
            
            tr_loss = train_step(tr_inputs, target)
            print(tr_loss.numpy())

When I try to print the loss value, I end up with the following error:

AttributeError: 'Tensor' object has no attribute 'numpy'

I also tried using tf.print() as follows:

tf.print("Loss: ", tr_loss, output_stream=sys.stdout)

But nothing seems to show up in the terminal. Any suggestions?

score 0 · Accepted Answer

You cannot convert a tensor to a NumPy array in graph mode. Instead, create a tf.metrics object outside the function and update it inside the function.

mean_loss_values = tf.metrics.Mean()

def train_step(inputs, target):
    with tf.GradientTape() as tape:
        predictions = model(inputs, training=True)
        curr_loss = lovasz_softmax_flat(predictions, target)

    gradients = tape.gradient(curr_loss, model.trainable_variables)
    opt.apply_gradients(zip(gradients, model.trainable_variables))

    # look below
    mean_loss_values(curr_loss)
    # or mean_loss_values.update_state(curr_loss)
    
    # Need to access this value
    return curr_loss

Then, in your code:

mean_loss_values.result()
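
For reference, here is a minimal end-to-end sketch of that pattern. It is not the asker's actual setup: the tiny Dense model, the mean-squared-error loss (standing in for lovasz_softmax_flat), the random data, and the names mean_loss and epoch_losses are all assumptions made for illustration. It also assumes a TensorFlow 2.x release where metrics expose reset_state() (older releases name it reset_states()). Only the per-batch step is wrapped in tf.function here; the outer loop is plain Python, so both the metric result and the returned loss are ordinary eager tensors by the time they are read.

```python
import tensorflow as tf

# Hypothetical stand-ins for the question's model and optimizer.
model = tf.keras.Sequential([tf.keras.Input(shape=(4,)), tf.keras.layers.Dense(1)])
opt = tf.keras.optimizers.SGD(learning_rate=0.01)

# The metric is created once, outside the tf.function.
mean_loss = tf.keras.metrics.Mean()


@tf.function
def train_step(inputs, target):
    with tf.GradientTape() as tape:
        predictions = model(inputs, training=True)
        # Assumed placeholder loss (mean squared error), not lovasz_softmax_flat.
        loss = tf.reduce_mean(tf.square(predictions - target))
    gradients = tape.gradient(loss, model.trainable_variables)
    opt.apply_gradients(zip(gradients, model.trainable_variables))
    # Updating the metric is a graph op, so it works inside tf.function.
    mean_loss.update_state(loss)
    return loss


# Random toy data, for illustration only.
inputs = tf.random.normal((8, 4))
target = tf.random.normal((8, 1))

epoch_losses = []
for epoch in range(3):  # plain Python loop, executed eagerly
    mean_loss.reset_state()  # start a fresh average for each epoch
    for _ in range(5):
        step_loss = train_step(inputs, target)
    # Outside the tf.function, result() can be converted to a Python float.
    epoch_losses.append(float(mean_loss.result()))
    print(f"epoch {epoch}: mean loss {epoch_losses[-1]:.4f}, "
          f"last step loss {float(step_loss):.4f}")
```

If the outer loop itself has to stay inside a tf.function, the metric can still be read with mean_loss.result() after that function returns, since the metric's state lives in variables that persist between calls.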
