
As a sanity check, I tried two ways of using transfer learning that I expected to behave the same, if not in running time then at least in results.

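The first approach uses bottleneck features (as described here: https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html): use the existing classifier to generate the features just before the last dense layer, save them, and then train a new dense layer with those features as input.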
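To make the idea concrete, here is a minimal NumPy sketch of the bottleneck workflow. It is not the Inception V4 pipeline: `W_frozen`, `extract_features`, and the toy data are hypothetical stand-ins, and plain gradient descent is used in place of rmsprop. It only illustrates the two steps of caching features once and then training a small softmax layer on them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the frozen pretrained network up to the last dense layer.
W_frozen = rng.normal(size=(8, 4))

def extract_features(images):
    """Deterministic 'bottleneck' features: a fixed linear map followed by ReLU."""
    return np.maximum(images @ W_frozen, 0.0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Toy data: 16 "images" with 8 inputs each, two classes.
images = rng.normal(size=(16, 8))
labels = (images[:, 0] > 0).astype(int)
one_hot = np.eye(2)[labels]

# Step 1: run the frozen network once and cache its output.
features = extract_features(images)

# Step 2: train only a new softmax layer on the cached features.
W_top = np.zeros((4, 2))
b_top = np.zeros(2)

def cross_entropy(W, b):
    probs = softmax(features @ W + b)
    return -np.mean(np.log(probs[np.arange(len(labels)), labels]))

initial_loss = cross_entropy(W_top, b_top)
for _ in range(300):
    probs = softmax(features @ W_top + b_top)
    grad_logits = (probs - one_hot) / len(labels)
    W_top -= 0.02 * features.T @ grad_logits
    b_top -= 0.02 * grad_logits.sum(axis=0)
final_loss = cross_entropy(W_top, b_top)
```

Because the features are computed once in inference mode, the new layer always trains on exactly the same inputs from epoch to epoch.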

The second approach replaces the model's last dense layer with a new one and then freezes all other layers in the model.
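As a rough illustration of what freezing means, the NumPy sketch below trains a toy two-layer network while skipping the update for every parameter marked as frozen. The `params` and `trainable` dictionaries are hypothetical names made up for this example; Keras tracks the same information through each layer's `trainable` attribute.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical two-layer network: a "pretrained" hidden layer and a new output layer.
params = {
    "hidden": rng.normal(size=(8, 4)),  # pretend these weights were pretrained
    "top": np.zeros((4, 2)),            # freshly added classification layer
}
trainable = {"hidden": False, "top": True}  # freeze everything except the top

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

x = rng.normal(size=(16, 8))
y = np.eye(2)[(x[:, 0] > 0).astype(int)]

hidden_before = params["hidden"].copy()
for _ in range(50):
    # Forward pass through the whole network on every step.
    h = np.maximum(x @ params["hidden"], 0.0)
    probs = softmax(h @ params["top"])

    # Backward pass: gradients exist for both layers...
    grad_logits = (probs - y) / len(x)
    grads = {
        "top": h.T @ grad_logits,
        "hidden": x.T @ ((grad_logits @ params["top"].T) * (h > 0)),
    }

    # ...but only trainable parameters are updated.
    for name in params:
        if trainable[name]:
            params[name] -= 0.02 * grads[name]
```

Unlike the bottleneck approach, the frozen layers are still executed in training mode on every step, which matters for layers whose behaviour differs between training and inference.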

I expected the second approach to work as well as the first, but it does not.

The first approach outputs:

Epoch 1/50
16/16 [==============================] - 0s - loss: 1.3095 - acc: 0.4375 - val_loss: 0.4533 - val_acc: 0.7500
Epoch 2/50
16/16 [==============================] - 0s - loss: 0.3555 - acc: 0.8125 - val_loss: 0.2305 - val_acc: 1.0000
Epoch 3/50
16/16 [==============================] - 0s - loss: 0.1365 - acc: 1.0000 - val_loss: 0.1603 - val_acc: 1.0000
Epoch 4/50
16/16 [==============================] - 0s - loss: 0.0600 - acc: 1.0000 - val_loss: 0.1012 - val_acc: 1.0000
Epoch 5/50
16/16 [==============================] - 0s - loss: 0.0296 - acc: 1.0000 - val_loss: 0.0681 - val_acc: 1.0000
Epoch 6/50
16/16 [==============================] - 0s - loss: 0.0165 - acc: 1.0000 - val_loss: 0.0521 - val_acc: 1.0000
Epoch 7/50
16/16 [==============================] - 0s - loss: 0.0082 - acc: 1.0000 - val_loss: 0.0321 - val_acc: 1.0000
Epoch 8/50
16/16 [==============================] - 0s - loss: 0.0036 - acc: 1.0000 - val_loss: 0.0222 - val_acc: 1.0000
Epoch 9/50
16/16 [==============================] - 0s - loss: 0.0023 - acc: 1.0000 - val_loss: 0.0185 - val_acc: 1.0000
Epoch 10/50
16/16 [==============================] - 0s - loss: 0.0011 - acc: 1.0000 - val_loss: 0.0108 - val_acc: 1.0000
Epoch 11/50
16/16 [==============================] - 0s - loss: 5.6636e-04 - acc: 1.0000 - val_loss: 0.0087 - val_acc: 1.0000
Epoch 12/50
16/16 [==============================] - 0s - loss: 2.9463e-04 - acc: 1.0000 - val_loss: 0.0094 - val_acc: 1.0000
Epoch 13/50
16/16 [==============================] - 0s - loss: 1.5169e-04 - acc: 1.0000 - val_loss: 0.0072 - val_acc: 1.0000
Epoch 14/50
16/16 [==============================] - 0s - loss: 7.4001e-05 - acc: 1.0000 - val_loss: 0.0039 - val_acc: 1.0000
Epoch 15/50
16/16 [==============================] - 0s - loss: 3.9956e-05 - acc: 1.0000 - val_loss: 0.0034 - val_acc: 1.0000
Epoch 16/50
16/16 [==============================] - 0s - loss: 2.0384e-05 - acc: 1.0000 - val_loss: 0.0024 - val_acc: 1.0000
Epoch 17/50
16/16 [==============================] - 0s - loss: 1.0036e-05 - acc: 1.0000 - val_loss: 0.0026 - val_acc: 1.0000
Epoch 18/50
16/16 [==============================] - 0s - loss: 5.0962e-06 - acc: 1.0000 - val_loss: 0.0010 - val_acc: 1.0000
Epoch 19/50
16/16 [==============================] - 0s - loss: 2.7791e-06 - acc: 1.0000 - val_loss: 0.0011 - val_acc: 1.0000
Epoch 20/50
16/16 [==============================] - 0s - loss: 1.5646e-06 - acc: 1.0000 - val_loss: 0.0015 - val_acc: 1.0000
Epoch 21/50
16/16 [==============================] - 0s - loss: 8.6427e-07 - acc: 1.0000 - val_loss: 9.0825e-04 - val_acc: 1.0000
Epoch 22/50
16/16 [==============================] - 0s - loss: 4.3958e-07 - acc: 1.0000 - val_loss: 5.6370e-04 - val_acc: 1.0000
Epoch 23/50
16/16 [==============================] - 0s - loss: 2.5332e-07 - acc: 1.0000 - val_loss: 5.1226e-04 - val_acc: 1.0000
Epoch 24/50
16/16 [==============================] - 0s - loss: 1.6391e-07 - acc: 1.0000 - val_loss: 6.6560e-04 - val_acc: 1.0000
Epoch 25/50
16/16 [==============================] - 0s - loss: 1.3411e-07 - acc: 1.0000 - val_loss: 6.5456e-04 - val_acc: 1.0000
Epoch 26/50
16/16 [==============================] - 0s - loss: 1.1921e-07 - acc: 1.0000 - val_loss: 3.4316e-04 - val_acc: 1.0000
Epoch 27/50
16/16 [==============================] - 0s - loss: 1.1921e-07 - acc: 1.0000 - val_loss: 3.4316e-04 - val_acc: 1.0000
Epoch 28/50
16/16 [==============================] - 0s - loss: 1.1921e-07 - acc: 1.0000 - val_loss: 3.4316e-04 - val_acc: 1.0000
Epoch 29/50
16/16 [==============================] - 0s - loss: 1.1921e-07 - acc: 1.0000 - val_loss: 3.4316e-04 - val_acc: 1.0000
Epoch 30/50
16/16 [==============================] - 0s - loss: 1.1921e-07 - acc: 1.0000 - val_loss: 3.4316e-04 - val_acc: 1.0000

It converges quickly and gives good results.

The second approach, on the other hand, gives this:

Epoch 1/50
24/24 [==============================] - 63s - loss: 0.7375 - acc: 0.7500 - val_loss: 0.7575 - val_acc: 0.6667
Epoch 2/50
24/24 [==============================] - 61s - loss: 0.6763 - acc: 0.7500 - val_loss: 1.5228 - val_acc: 0.5000
Epoch 3/50
24/24 [==============================] - 61s - loss: 0.7149 - acc: 0.7500 - val_loss: 3.5805 - val_acc: 0.3333
Epoch 4/50
24/24 [==============================] - 61s - loss: 0.6363 - acc: 0.7500 - val_loss: 1.5066 - val_acc: 0.5000
Epoch 5/50
24/24 [==============================] - 61s - loss: 0.6542 - acc: 0.7500 - val_loss: 1.8745 - val_acc: 0.6667
Epoch 6/50
24/24 [==============================] - 61s - loss: 0.7007 - acc: 0.7500 - val_loss: 1.5328 - val_acc: 0.5000
Epoch 7/50
24/24 [==============================] - 61s - loss: 0.6900 - acc: 0.7500 - val_loss: 3.6004 - val_acc: 0.3333
Epoch 8/50
24/24 [==============================] - 61s - loss: 0.6615 - acc: 0.7500 - val_loss: 1.5734 - val_acc: 0.5000
Epoch 9/50
24/24 [==============================] - 61s - loss: 0.6571 - acc: 0.7500 - val_loss: 3.0078 - val_acc: 0.6667
Epoch 10/50
24/24 [==============================] - 61s - loss: 0.5762 - acc: 0.7083 - val_loss: 3.6029 - val_acc: 0.5000
Epoch 11/50
24/24 [==============================] - 61s - loss: 0.6515 - acc: 0.7500 - val_loss: 5.8610 - val_acc: 0.3333
Epoch 12/50
24/24 [==============================] - 61s - loss: 0.6541 - acc: 0.7083 - val_loss: 2.4551 - val_acc: 0.5000
Epoch 13/50
24/24 [==============================] - 61s - loss: 0.6700 - acc: 0.7500 - val_loss: 2.9983 - val_acc: 0.6667
Epoch 14/50
24/24 [==============================] - 61s - loss: 0.6486 - acc: 0.7500 - val_loss: 3.6179 - val_acc: 0.5000
Epoch 15/50
24/24 [==============================] - 61s - loss: 0.6985 - acc: 0.6667 - val_loss: 5.8419 - val_acc: 0.3333
Epoch 16/50
24/24 [==============================] - 62s - loss: 0.6465 - acc: 0.7083 - val_loss: 2.5201 - val_acc: 0.5000
Epoch 17/50
24/24 [==============================] - 62s - loss: 0.6246 - acc: 0.7500 - val_loss: 2.9912 - val_acc: 0.6667
Epoch 18/50
24/24 [==============================] - 62s - loss: 0.6768 - acc: 0.7500 - val_loss: 3.6320 - val_acc: 0.5000
Epoch 19/50
24/24 [==============================] - 62s - loss: 0.5774 - acc: 0.7083 - val_loss: 5.8575 - val_acc: 0.3333
Epoch 20/50
24/24 [==============================] - 62s - loss: 0.6642 - acc: 0.7500 - val_loss: 2.5865 - val_acc: 0.5000
Epoch 21/50
24/24 [==============================] - 63s - loss: 0.6553 - acc: 0.7083 - val_loss: 2.9967 - val_acc: 0.6667
Epoch 22/50
24/24 [==============================] - 62s - loss: 0.6469 - acc: 0.7083 - val_loss: 3.6233 - val_acc: 0.5000
Epoch 23/50
24/24 [==============================] - 64s - loss: 0.6029 - acc: 0.7500 - val_loss: 5.8225 - val_acc: 0.3333
Epoch 24/50
24/24 [==============================] - 63s - loss: 0.6183 - acc: 0.7083 - val_loss: 2.5325 - val_acc: 0.5000
Epoch 25/50
24/24 [==============================] - 62s - loss: 0.6631 - acc: 0.7500 - val_loss: 2.9879 - val_acc: 0.6667
Epoch 26/50
24/24 [==============================] - 63s - loss: 0.6082 - acc: 0.7500 - val_loss: 3.6206 - val_acc: 0.5000
Epoch 27/50
24/24 [==============================] - 62s - loss: 0.6536 - acc: 0.7500 - val_loss: 5.7937 - val_acc: 0.3333
Epoch 28/50
24/24 [==============================] - 63s - loss: 0.5853 - acc: 0.7500 - val_loss: 2.6138 - val_acc: 0.5000
Epoch 29/50
24/24 [==============================] - 62s - loss: 0.5523 - acc: 0.7500 - val_loss: 3.0126 - val_acc: 0.6667
Epoch 30/50
24/24 [==============================] - 62s - loss: 0.7112 - acc: 0.7500 - val_loss: 3.7054 - val_acc: 0.5000

Both approaches use the same model (Inception V4). My code is as follows:

First approach (bottleneck features):

from keras import backend as K
import inception_v4
import numpy as np
import cv2
import os

from keras import optimizers
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D, ZeroPadding2D
from keras.layers import Activation, Dropout, Flatten, Dense, Input

from keras.models import Model
os.environ['CUDA_VISIBLE_DEVICES'] = ''



v4 = inception_v4.create_model(weights='imagenet')


#v4.summary()
my_batch_size=1
train_data_dir ='//shared_directory/projects/try_CDxx/data/train/'
validation_data_dir ='//shared_directory/projects/try_CDxx/data/validation/'
top_model_weights_path= 'bottleneck_fc_model.h5'
class_num=2

img_width, img_height = 299, 299
#nb_train_samples=16
#nb_validation_samples=8
nb_epoch=50

main_input= v4.layers[1].input
main_output=v4.layers[-1].output
flatten_output= v4.layers[-2].output


model = Model(input=[main_input], output=[main_output, flatten_output])


def save_BN(model):   
#   
    datagen = ImageDataGenerator(rescale=1./255) # here!
#   
    generator = datagen.flow_from_directory(
            train_data_dir,
            target_size=(img_width, img_height),
            batch_size=my_batch_size,
            class_mode='categorical',
            shuffle=False)
    nb_train_samples = generator.classes.size       
    bottleneck_features_train = model.predict_generator(generator, nb_train_samples)
#
    np.save(open('bottleneck_flat_features_train.npy', 'wb'), bottleneck_features_train[1])

    np.save(open('bottleneck_train_labels.npy', 'wb'), generator.classes)

    generator = datagen.flow_from_directory(
            validation_data_dir,
            target_size=(img_width, img_height),
            batch_size=my_batch_size,
            class_mode='categorical',
            shuffle=False)

    nb_validation_samples = generator.classes.size
    bottleneck_features_validation = model.predict_generator(generator, nb_validation_samples)

    np.save(open('bottleneck_flat_features_validation.npy', 'wb'), bottleneck_features_validation[1])

    np.save(open('bottleneck_validation_labels.npy', 'wb'), generator.classes)


def train_top_model ():
    # Pass the file names directly so the files are read in binary mode.
    train_data = np.load('bottleneck_flat_features_train.npy')
    train_labels = np.load('bottleneck_train_labels.npy')
#
    validation_data = np.load('bottleneck_flat_features_validation.npy')
    validation_labels = np.load('bottleneck_validation_labels.npy')
    #
    top_m  = Sequential()
    top_m.add(Dense(class_num,input_shape=train_data.shape[1:], activation='softmax', name='top_dense1'))
    top_m.compile(optimizer='rmsprop', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
#
    top_m.fit(train_data, train_labels,
              nb_epoch=nb_epoch, batch_size=my_batch_size,
              validation_data=(validation_data, validation_labels))

    Dense_layer=top_m.layers[-1]
    my_weights=Dense_layer.get_weights()
    np.save(open('retrained_top_layer_weight.npy', 'wb'), my_weights)




save_BN(model)
train_top_model()

Second approach (freeze everything except the last layer):

from keras import backend as K
import inception_v4
import numpy as np
import cv2
import os

from keras import optimizers
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D, ZeroPadding2D
from keras.layers import Activation, Dropout, Flatten, Dense, Input

from keras.models import Model
os.environ['CUDA_VISIBLE_DEVICES'] = ''


my_batch_size=1


train_data_dir ='//shared_directory/projects/try_CDxx/data/train/'
validation_data_dir ='//shared_directory/projects/try_CDxx/data/validation/'
top_model_path= 'tm_trained_model.h5'

img_width, img_height = 299, 299
num_classes=2
#nb_epoch=50
nb_epoch=50
nbr_train_samples = 24
nbr_validation_samples = 12


def train_top_model (num_classes):

    v4 = inception_v4.create_model(weights='imagenet')
    predictions = Dense(output_dim=num_classes, activation='softmax', name="newDense")(v4.layers[-2].output) # replacing the 1001 categories dense layer with my own 
    main_input= v4.layers[1].input
    main_output=predictions
    t_model = Model(input=[main_input], output=[main_output])


    val_datagen = ImageDataGenerator(rescale=1./255)
    train_datagen  = ImageDataGenerator(rescale=1./255)  


    train_generator = train_datagen.flow_from_directory(
            train_data_dir,
            target_size = (img_width, img_height),
            batch_size = my_batch_size,
            shuffle = False,
            class_mode = 'categorical')

    validation_generator = val_datagen.flow_from_directory(
            validation_data_dir,
            target_size=(img_width, img_height),
            batch_size=my_batch_size,
            shuffle = False,
            class_mode = 'categorical') 
#
    for layer in t_model.layers:
        layer.trainable = False
    t_model.layers[-1].trainable=True
    t_model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])


#
    t_model.fit_generator(
            train_generator,
            samples_per_epoch = nbr_train_samples,
            nb_epoch = nb_epoch,
            validation_data = validation_generator,
            nb_val_samples = nbr_validation_samples)
    t_model.save(top_model_path)    

#   print (t_model.trainable_weights)

train_top_model(num_classes)

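I thought that freezing the whole network except the top and training only the top should be basically the same thing as using the whole network except the top to produce the features that exist just before the top, and then training a new dense layer on them.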

So either my code is wrong, or my thinking about the problem is wrong (or both...).

What am I doing wrong?

Thanks for your time.


1 Answer


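That's a really neat question. The cause is the Dropout layer in your second approach. Even when a layer is set to not trainable, Dropout still works during training: it randomly alters the input to the layer that follows it, which is how it prevents your network from overfitting.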
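The difference can be seen in a small NumPy sketch of inverted dropout. The `dropout` function and the rate of 0.5 are illustrative assumptions, not the Keras implementation or the rate Inception V4 uses. The point is only that the same features look different in inference mode (when bottleneck features are predicted and saved) and in training mode (when the frozen network is run inside a training loop).

```python
import numpy as np

def dropout(x, rate, training, rng):
    """Inverted dropout: active only in training mode, identity at inference."""
    if not training:
        return x
    keep = rng.random(x.shape) >= rate  # randomly keep roughly (1 - rate) of the units
    return x * keep / (1.0 - rate)      # rescale so the expected value is unchanged

rng = np.random.default_rng(0)
features = np.ones((1, 10))

# Inference mode: what the saved bottleneck features pass through.
cached = dropout(features, rate=0.5, training=False, rng=rng)

# Training mode: what the new dense layer receives when the full model is fitted.
during_fit = dropout(features, rate=0.5, training=True, rng=rng)
```

Dropout has no weights, so marking the layer as not trainable changes nothing about this behaviour.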

Try changing your code to:

v4 = inception_v4.create_model(weights='imagenet')
predictions = Flatten()(v4.layers[-4].output)
predictions = Dense(output_dim=num_classes, activation='softmax', name="newDense")(predictions)

Also, because of BatchNormalization, change batch_size to 24.
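A hedged sketch of why the batch size can matter: in training mode, batch normalization normalizes with the statistics of the current batch. The `batch_norm_train` function below is a simplified stand-in (gamma = 1, beta = 0, no moving averages) applied to a flat feature matrix. Whether frozen BatchNormalization layers actually use batch statistics depends on the Keras version, so treat that as an assumption.

```python
import numpy as np

def batch_norm_train(x, eps=1e-3):
    """Normalise each feature with the statistics of the current batch only."""
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
batch = rng.normal(loc=3.0, scale=2.0, size=(24, 5))

full = batch_norm_train(batch)        # statistics taken from all 24 samples
single = batch_norm_train(batch[:1])  # statistics taken from one sample only
```

With a single sample, every feature equals its own batch mean, so the normalized output is all zeros in this flat case. For convolutional feature maps the statistics are also pooled over spatial positions, so the output is not exactly zero, but it is still computed from one image rather than from the whole dataset.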

This should work.

Answered 2017-03-07T21:17:41.073