
I have defined my mirrored distribution strategy as follows:

# imports used in this snippet; PretrainedModel, preprocess_input, IMAGE_SIZE,
# train_path, valid_path, folders, image_files and valid_image_files are
# defined earlier in my script
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Model
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# create an instance of ImageDataGenerator with augmentation for training
gen_train = ImageDataGenerator(
  rotation_range=20,
  width_shift_range=0.1,
  height_shift_range=0.1,
  shear_range=0.1,
  zoom_range=0.2,
  horizontal_flip=True,
  preprocessing_function=preprocess_input
)

# preprocessing only, no augmentation, for validation/test data
gen_test = ImageDataGenerator(
  preprocessing_function=preprocess_input
)

batch_size = 512

# create generators
train_generator = gen_train.flow_from_directory(
  train_path,
  shuffle=True,
  target_size=IMAGE_SIZE,
  batch_size=batch_size,
)
valid_generator = gen_test.flow_from_directory(
  valid_path,
  target_size=IMAGE_SIZE,
  batch_size=batch_size,
)

strategy = tf.distribute.MirroredStrategy()
print(f'Number of devices: {strategy.num_replicas_in_sync}')
with strategy.scope():
    ptm = PretrainedModel(
        input_shape=IMAGE_SIZE + [3],
        weights='imagenet',
        include_top=False)

    # freeze pretrained model weights
    ptm.trainable = False

    # map the data into feature vectors and add a classification head;
    # the Keras image data generator returns classes one-hot encoded
    K = len(folders)  # number of classes
    x = Flatten()(ptm.output)
    x = Dense(K, activation='softmax')(x)

    # create a model object
    model = Model(inputs=ptm.input, outputs=x)

    # view the structure of the model
    model.summary()

    model.compile(
      loss='categorical_crossentropy',
      optimizer='adam',
      metrics=['accuracy']
    )

# fit the model
r = model.fit(
  train_generator,
  validation_data=valid_generator,
  epochs=10,
  steps_per_epoch=int(np.ceil(len(image_files) / batch_size)),
  validation_steps=int(np.ceil(len(valid_image_files) / batch_size)),
)

What I observe is:

  1. The runtime doesn't change much whether I use the strategy or not. If I have 4 GPUs, should I increase the batch size by a factor of 4?

  2. The CPU is not fully used (the GPUs are). Is there a distribution strategy that makes full use of the CPU as well?


1 Answer

  1. The batch size is the global batch size, so you have to multiply the per-GPU batch size by the number of GPUs (see the first sketch after this list).

  2. For better resource utilization, you should use tf.data and read the performance guide: https://www.tensorflow.org/guide/data_performance. You should also read the distributed training guide: https://www.tensorflow.org/guide/distributed_training. (See the second sketch after this list.)
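
For point 1, a minimal sketch of the relationship between per-replica and global batch size (the per-replica batch size of 128 is purely an illustrative assumption):

import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()

# per-replica batch size: an illustrative assumption, tune it to your GPUs
per_replica_batch_size = 128

# the batch_size you pass to the generators / model.fit is the GLOBAL batch
# size; MirroredStrategy splits each batch evenly across the replicas
global_batch_size = per_replica_batch_size * strategy.num_replicas_in_sync
print(f'global batch size: {global_batch_size}')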
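
For point 2, a rough sketch of wrapping the question's generator in tf.data along the lines of the performance guide; train_generator, IMAGE_SIZE, K, image_files, batch_size and model reuse the names from the question, and the prefetch setting is illustrative rather than tuned:

import numpy as np
import tensorflow as tf

AUTOTUNE = tf.data.experimental.AUTOTUNE

# wrap the Keras generator in a tf.data pipeline; prefetch lets the CPU
# prepare the next batches while the GPUs train on the current one
train_ds = tf.data.Dataset.from_generator(
    lambda: train_generator,
    output_types=(tf.float32, tf.float32),
    output_shapes=([None, *IMAGE_SIZE, 3], [None, K]),
).prefetch(AUTOTUNE)

# the wrapped generator loops forever, so steps_per_epoch is still required
model.fit(
    train_ds,
    steps_per_epoch=int(np.ceil(len(image_files) / batch_size)),
    epochs=10,
)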

Answered 2020-03-08T11:10:43.020