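I am trying to reproduce in TensorFlow the network architecture proposed in this publication. As a complete beginner, I have been using this tutorial as a base, with tensorflow==2.3.2.
To train this network, they use a loss that involves the outputs of two branches of the network at the same time, which led me to look into custom loss functions in Keras. I understand that you can define your own, as long as the function has the following signature: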
def custom_loss(y_true, y_pred):
I also understand that you can pass additional arguments like this:
def loss_function(margin=0.3):
    def custom_loss(y_true, y_pred):
        # And now you can use margin
        ...
    return custom_loss
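To make that closure pattern concrete, here is a minimal sketch of what I mean. The hinge-style formula is only an assumption chosen for illustration; it is not the loss from the publication.

```python
import tensorflow as tf

def loss_function(margin=0.3):
    """Return a Keras-compatible loss that closes over `margin`."""
    def custom_loss(y_true, y_pred):
        # Illustrative hinge-style penalty (an assumption, not the paper's loss):
        # it is zero wherever y_true * y_pred is at least `margin`.
        y_true = tf.cast(y_true, y_pred.dtype)
        return tf.reduce_mean(tf.maximum(0.0, margin - y_true * y_pred), axis=-1)
    return custom_loss

# The outer call fixes the margin; the inner function is what Keras would receive.
loss_fn = loss_function(margin=0.5)
example = loss_fn(tf.constant([[1.0, -1.0]]), tf.constant([[0.2, -0.9]]))
print(example.numpy())
```

The returned inner function still has the `(y_true, y_pred)` signature, so it can be passed as `loss=loss_function(margin=0.5)`.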
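Then you just pass them in when compiling the model. When a model has multiple outputs, the most common approach seems to be the one proposed here, where you supply several loss functions, one per output. However, I could not find a way to give several outputs to a single loss function, which is exactly what I need.
To explain further, here is a minimal working example showing what I have tried; you can run it yourself in this Colab.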
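For reference, the per-output approach I keep finding looks roughly like the sketch below. The toy model, its layer sizes, and the choice of `"mse"` and `"mae"` are placeholders I made up to keep it short.

```python
import numpy as np
import tensorflow as tf

# Toy two-output model; the sizes are arbitrary placeholders.
inputs = tf.keras.Input(shape=(16,))
out_a = tf.keras.layers.Dense(8)(inputs)
out_b = tf.keras.layers.Dense(3)(inputs)
model = tf.keras.Model(inputs, [out_a, out_b])

# One loss per output, matched by position; each loss only sees its own output.
model.compile(
    optimizer="adam",
    loss=["mse", "mae"],
    loss_weights=[1.0, 0.5],
)

x = np.random.rand(4, 16).astype("float32")
targets = [np.zeros((4, 8), "float32"), np.zeros((4, 3), "float32")]
history = model.fit(x, targets, epochs=1, verbose=0)
print(history.history["loss"])
```

Here the total loss is a weighted sum of independent per-output losses, which is not the same as a single loss computed from both outputs together.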
import os
import tensorflow as tf
from tensorflow.keras import backend as K  # use the tf.keras backend rather than standalone keras
from tensorflow.keras import datasets, layers, models, applications, losses
from tensorflow.keras.preprocessing import image_dataset_from_directory
_URL = 'https://storage.googleapis.com/mledu-datasets/cats_and_dogs_filtered.zip'
path_to_zip = tf.keras.utils.get_file('cats_and_dogs.zip', origin=_URL, extract=True)
PATH = os.path.join(os.path.dirname(path_to_zip), 'cats_and_dogs_filtered')
train_dir = os.path.join(PATH, 'train')
validation_dir = os.path.join(PATH, 'validation')
BATCH_SIZE = 32
IMG_SIZE = (160, 160)
IMG_SHAPE = IMG_SIZE + (3,)
train_dataset = image_dataset_from_directory(train_dir,
                                             shuffle=True,
                                             batch_size=BATCH_SIZE,
                                             image_size=IMG_SIZE)
validation_dataset = image_dataset_from_directory(validation_dir,
                                                  shuffle=True,
                                                  batch_size=BATCH_SIZE,
                                                  image_size=IMG_SIZE)
data_augmentation = tf.keras.Sequential([
    layers.experimental.preprocessing.RandomFlip('horizontal'),
    layers.experimental.preprocessing.RandomRotation(0.2),
])
preprocess_input = applications.resnet50.preprocess_input
base_model = applications.ResNet50(input_shape=IMG_SHAPE,
                                   include_top=False,
                                   weights='imagenet')
base_model.trainable = True
conv = layers.Conv2D(filters=128, kernel_size=(1,1))
global_pooling = layers.GlobalAveragePooling2D()
horizontal_pooling = layers.AveragePooling2D(pool_size=(1, 5))
reshape = layers.Reshape((-1, 128))
def custom_loss(y_true, y_pred):
    print(y_pred.shape)
    # Do some stuffs involving both outputs
    # Returning something trivial here for correct behavior
    return K.mean(y_pred)
inputs = tf.keras.Input(shape=IMG_SHAPE)
x = data_augmentation(inputs)
x = preprocess_input(x)
x = base_model(x, training=True)
first_branch = global_pooling(x)
second_branch = conv(x)
second_branch = horizontal_pooling(second_branch)
second_branch = reshape(second_branch)
model = tf.keras.Model(inputs, [first_branch, second_branch])
base_learning_rate = 0.0001
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=base_learning_rate),
              loss=custom_loss,
              metrics=['accuracy'])
model.summary()
initial_epochs = 10
history = model.fit(train_dataset,
                    epochs=initial_epochs,
                    validation_data=validation_dataset)
With this, I expected the y_pred passed to the loss function to be a list containing both outputs. However, when I run it, this is what I get on stdout:
Epoch 1/10
(None, 2048)
(None, 5, 128)
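What I understand from this is that the loss function is called once per output, one after the other, rather than once with all outputs, which means I cannot define a loss that uses both outputs at the same time. Is there any way to achieve this?
Please let me know if anything is unclear or if you need more details.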
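In case it helps, the same behaviour can be reproduced without downloading the dataset. This is a stripped-down sketch: the small Dense model, the dummy zero targets, and the `seen_shapes` list are stand-ins I introduced for illustration, not part of my real setup.

```python
import numpy as np
import tensorflow as tf

seen_shapes = []  # filled in while Keras traces the loss function

def custom_loss(y_true, y_pred):
    # Record the non-batch dimensions of whatever Keras passes as y_pred.
    seen_shapes.append(tuple(y_pred.shape)[1:])
    return tf.reduce_mean(y_pred)

# Two branches with different output shapes, as in the full example.
inputs = tf.keras.Input(shape=(16,))
first_branch = tf.keras.layers.Dense(8)(inputs)
second_branch = tf.keras.layers.Reshape((5, 4))(tf.keras.layers.Dense(20)(inputs))
model = tf.keras.Model(inputs, [first_branch, second_branch])
model.compile(optimizer="adam", loss=custom_loss)

x = np.random.rand(4, 16).astype("float32")
dummy_targets = [np.zeros((4, 8), "float32"), np.zeros((4, 5, 4), "float32")]
model.fit(x, dummy_targets, epochs=1, verbose=0)
print(seen_shapes)
```

Each recorded entry is the shape of a single branch, never a pair of both, which matches what I see with the full ResNet50 model.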