I have defined my mirrored distribution strategy as follows:
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Model
from tensorflow.keras.preprocessing.image import ImageDataGenerator
# PretrainedModel and preprocess_input come from a keras.applications
# model (imported elsewhere, not shown here)

# create an ImageDataGenerator with augmentation for training
gen_train = ImageDataGenerator(
    rotation_range=20,
    width_shift_range=0.1,
    height_shift_range=0.1,
    shear_range=0.1,
    zoom_range=0.2,
    horizontal_flip=True,
    preprocessing_function=preprocess_input
)
# validation data gets preprocessing only, no augmentation
gen_test = ImageDataGenerator(
    preprocessing_function=preprocess_input
)
batch_size = 512
# create generators
train_generator = gen_train.flow_from_directory(
    train_path,
    shuffle=True,
    target_size=IMAGE_SIZE,
    batch_size=batch_size,
)
valid_generator = gen_test.flow_from_directory(
    valid_path,
    target_size=IMAGE_SIZE,
    batch_size=batch_size,
)
strategy = tf.distribute.MirroredStrategy()
print(f'Number of devices: {strategy.num_replicas_in_sync}')
with strategy.scope():
    ptm = PretrainedModel(
        input_shape=IMAGE_SIZE + [3],
        weights='imagenet',
        include_top=False)
    # freeze the pretrained model weights
    ptm.trainable = False
    # map the data into feature vectors;
    # the Keras image data generator returns classes one-hot encoded
    K = len(folders)  # number of classes
    x = Flatten()(ptm.output)
    x = Dense(K, activation='softmax')(x)
    # create the model object
    model = Model(inputs=ptm.input, outputs=x)
    # view the structure of the model
    model.summary()
    model.compile(
        loss='categorical_crossentropy',
        optimizer='adam',
        metrics=['accuracy']
    )
# fit the model
r = model.fit(
    train_generator,
    validation_data=valid_generator,
    epochs=10,
    steps_per_epoch=int(np.ceil(len(image_files) / batch_size)),
    validation_steps=int(np.ceil(len(valid_image_files) / batch_size)),
)
What I observe is that the runtime barely changes whether or not I use the strategy. If I have 4 GPUs, should I multiply the batch size by 4?
Also, the CPU is not fully utilized (the GPUs are). Is there a distribution strategy that makes full use of the CPU?
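For context, the common convention with MirroredStrategy is to keep a fixed per-replica batch size and scale the *global* batch size by the number of replicas, so each GPU sees the same per-device load as in single-GPU training. A minimal sketch of that convention (the per-replica size of 128 is an illustrative assumption, not a value from the code above):

```python
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()

# Keep a fixed per-replica batch size and scale the global batch
# size by the replica count; model.fit() splits each global batch
# evenly across the replicas.
per_replica_batch_size = 128  # illustrative value
global_batch_size = per_replica_batch_size * strategy.num_replicas_in_sync
print(f'{strategy.num_replicas_in_sync} replica(s), '
      f'global batch size: {global_batch_size}')
```

With 4 GPUs this would give a global batch of 512, matching the batch size above. Note also that MirroredStrategy only distributes the model computation, not the input pipeline, so it would not by itself raise CPU utilization; feeding the model through a tf.data.Dataset with parallel map calls and `.prefetch(tf.data.AUTOTUNE)` is the usual way to keep the CPU-side data loading busy.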