image - Effect of image width and height on transfer learning model accuracy

Question

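I have nearly 1000 images, each 1280x720 pixels, of people performing certain gestures, split into 4 classes. The idea is to use transfer learning.

Below is the code using Inception, with a target image size of 640,360.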

from keras.applications.inception_v3 import InceptionV3, preprocess_input
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D
from keras.preprocessing.image import ImageDataGenerator
from keras.optimizers import SGD
import os
path = 'E:/build/set_1/training'
# Get count of number of files in this folder and all subfolders
def get_num_files(path):
  if not os.path.exists(path):
    return 0
  return sum([len(files) for r, d, files in os.walk(path)])

# Get count of number of subfolders directly below the folder in path
def get_num_subfolders(path):
  if not os.path.exists(path):
    return 0
  # Only the first os.walk() entry lists the direct children of path
  return len(next(os.walk(path))[1])
print(get_num_files(path))
print(get_num_subfolders(path))
def create_img_generator():
  return  ImageDataGenerator(
      preprocessing_function=preprocess_input,
      rotation_range=30,
      width_shift_range=0.2,
      height_shift_range=0.2,
      shear_range=0.2,
      zoom_range=0.2,
      horizontal_flip=True
  )
Image_width, Image_height = 640,360
Training_Epochs = 7
Batch_Size = 32
Number_FC_Neurons = 1024

train_dir = 'Desktop/Dataset/training'
validate_dir = 'Desktop/Dataset/validation'
num_train_samples = get_num_files(train_dir) 
num_classes = get_num_subfolders(train_dir)
num_validate_samples = get_num_files(validate_dir)
num_epoch = Training_Epochs
batch_size = Batch_Size
train_image_gen = create_img_generator()
test_image_gen = create_img_generator()

#   Connect each image generator to the folder that contains the source images it alters.
#   Note: Keras expects target_size as (height, width), so the tuple below
#   resizes images to 640 high by 360 wide.
#   Training image generator
train_generator = train_image_gen.flow_from_directory(
  train_dir,
  target_size=(Image_width, Image_height),
  batch_size=batch_size,
  seed=42    # set seed for reproducibility
)
#   Validation image generator
validation_generator = test_image_gen.flow_from_directory(
  validate_dir,
  target_size=(Image_width, Image_height),
  batch_size=batch_size,
  seed=42    # set seed for reproducibility
)
InceptionV3_base_model = InceptionV3(weights='imagenet', include_top=False) #include_top=False excludes final FC layer
print('Inception v3 base model without last FC loaded')
#print(InceptionV3_base_model.summary())     # display the Inception V3 model hierarchy

# Define the layers in the new classification prediction 
x = InceptionV3_base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(Number_FC_Neurons, activation='relu')(x)        # new FC layer, random init
predictions = Dense(num_classes, activation='softmax')(x)  # new softmax layer

# Define trainable model which links input from the Inception V3 base model to the new classification prediction layers
model = Model(inputs=InceptionV3_base_model.input, outputs=predictions)

# Print the model structure (summary() prints by itself and returns None)
model.summary()
print ('\nPerforming Transfer Learning')
#   Freeze all layers in the Inception V3 base model
for layer in InceptionV3_base_model.layers:
  layer.trainable = False
#   Define model compile for basic Transfer Learning
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Fit the transfer learning model to the data from the generators.
# By using generators we can keep requesting sample images; the generators pull images from
# the training or validation folders and alter them slightly
history_transfer_learning = model.fit_generator(
  train_generator,
  epochs=num_epoch,
  steps_per_epoch = num_train_samples // batch_size,
  validation_data=validation_generator,
  validation_steps = num_validate_samples // batch_size)

# Save transfer learning model
model.save('inceptionv3-original-image-transfer-learning.model')

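After 7 epochs the accuracy is 84%.

With a target image size of 200,113, the accuracy after 7 epochs is 86%.

How does image size affect accuracy, and what image size should I use to make this model more accurate?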

score 0 · Accepted Answer

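Whatever framework you use, these ImageNet models were trained at fairly small input sizes (224x224 up to 299x299).

It is true that for object detection and image segmentation you can, in principle, benefit from higher resolution, because smaller objects are detected better. There are also specific architectures that address this through smarter feature reuse, but that is beside the point of this question.

What is probably happening is that, because the network was trained on smaller images and you have a classification problem, increasing the resolution does not actually improve the results. In fact, for this gesture problem the network may find the gestures "harder" to learn because of the larger feature set and added complexity that the higher resolution brings.

If you get better results at the smaller resolution, that is not a problem; just make sure that when you evaluate the model on the test set or in real life, you keep the same image distribution (real-life images need to follow the same statistical distribution as your local training, validation and test images).

In practice you need to try several resolution combinations and check which one works best for your case; the one thing to remember is to preserve the aspect ratio, to avoid introducing artifacts or distortion.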
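To make the effect of resolution more concrete, here is a rough sketch that estimates how large the final convolutional feature grid becomes for a given input size. It is an assumption-laden approximation, not part of the Keras API: the helper names `conv_out` and `inception_v3_grid` are hypothetical, and the sketch models only the unpadded, size-changing layers of the Keras InceptionV3 layout as I understand it.

```python
def conv_out(size, kernel, stride):
    """Output length of an unpadded ('valid') convolution or pooling layer."""
    return (size - kernel) // stride + 1


def inception_v3_grid(size):
    """Approximate side length of the last InceptionV3 feature map.

    Assumption: only layers that change the spatial size are modelled;
    'same'-padded and 1x1 layers leave the size unchanged.
    """
    size = conv_out(size, 3, 2)  # stem convolution, stride 2
    size = conv_out(size, 3, 1)  # stem convolution, no padding
    size = conv_out(size, 3, 2)  # max pooling
    size = conv_out(size, 3, 1)  # convolution, no padding
    size = conv_out(size, 3, 2)  # max pooling
    size = conv_out(size, 3, 2)  # first grid-reduction block
    size = conv_out(size, 3, 2)  # second grid-reduction block
    return size


# Compare the native input size with the two sizes tried in the question.
for height, width in [(299, 299), (640, 360), (200, 113)]:
    grid = (inception_v3_grid(height), inception_v3_grid(width))
    print(f"input {height}x{width} -> feature grid {grid[0]}x{grid[1]}")
```

Under these assumptions the native 299x299 input yields an 8x8 grid, while 640x360 yields 18x9, so GlobalAveragePooling2D has to average over roughly two and a half times as many spatial positions before the new dense layers see anything.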
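A minimal sketch of how the candidate resolutions could be generated while keeping the 16:9 aspect ratio of the 1280x720 source images. The function name `candidate_target_sizes` and the list of widths are illustrative assumptions. The tuples are returned as (height, width), which is the order Keras expects for `target_size`.

```python
def candidate_target_sizes(src_width, src_height, widths):
    """Return (height, width) tuples that preserve the source aspect ratio."""
    sizes = []
    for width in widths:
        # Scale the height by the same factor as the width.
        height = round(width * src_height / src_width)
        sizes.append((height, width))
    return sizes


# Hypothetical sweep over three widths for 1280x720 source images.
sizes = candidate_target_sizes(1280, 720, [320, 480, 640])
for height, width in sizes:
    print(f"target_size=({height}, {width})")
```

Each tuple could then be passed as `target_size` to `flow_from_directory` in a separate training run, comparing validation accuracy across runs with everything else held fixed.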
