python - Image classification in Caffe always returns the same class

Question

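I have a problem with image classification in Caffe. I use the ImageNet model (from the Caffe tutorial) to classify data I created myself, but I always get the same classification result (the same class, namely class 3). This is how I proceed: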

I use Caffe for Windows with Python as the interface.

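(1) I collect the data. My sample images (training and testing) are 5x5x3 (RGB) uint8 images, so their pixel values range from 0 to 255.
(2) I resize them to the size ImageNet requires: 256x256x3. For this I use the resize function in MATLAB (nearest-neighbor interpolation).
(3) I create a LevelDB and the image_mean.
(4) I train my network (3000 iterations). The only parameters I changed in the ImageNet definition are the paths to the mean image and the LevelDB. The results I get: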

I0428 12:38:04.350100  3236 solver.cpp:245]     Train net output #0: loss = 1.91102 (* 1 = 1.91102 loss)
I0428 12:38:04.350100  3236 sgd_solver.cpp:106] Iteration 2900, lr = 0.0001
I0428 12:38:30.353361  3236 solver.cpp:229] Iteration 2920, loss = 2.18008
I0428 12:38:30.353361  3236 solver.cpp:245]     Train net output #0: loss = 2.18008 (* 1 = 2.18008 loss)
I0428 12:38:30.353361  3236 sgd_solver.cpp:106] Iteration 2920, lr = 0.0001
I0428 12:38:56.351630  3236 solver.cpp:229] Iteration 2940, loss = 1.90925
I0428 12:38:56.351630  3236 solver.cpp:245]     Train net output #0: loss = 1.90925 (* 1 = 1.90925 loss)
I0428 12:38:56.351630  3236 sgd_solver.cpp:106] Iteration 2940, lr = 0.0001
I0428 12:39:22.341891  3236 solver.cpp:229] Iteration 2960, loss = 1.98917
I0428 12:39:22.341891  3236 solver.cpp:245]     Train net output #0: loss = 1.98917 (* 1 = 1.98917 loss)
I0428 12:39:22.341891  3236 sgd_solver.cpp:106] Iteration 2960, lr = 0.0001
I0428 12:39:48.334151  3236 solver.cpp:229] Iteration 2980, loss = 2.45919
I0428 12:39:48.334151  3236 solver.cpp:245]     Train net output #0: loss = 2.45919 (* 1 = 2.45919 loss)
I0428 12:39:48.334151  3236 sgd_solver.cpp:106] Iteration 2980, lr = 0.0001
I0428 12:40:13.040398  3236 solver.cpp:456] Snapshotting to binary proto file Z:/DeepLearning/S1S2/Stockholm/models_iter_3000.caffemodel
I0428 12:40:15.080418  3236 sgd_solver.cpp:273] Snapshotting solver state to binary proto file Z:/DeepLearning/S1S2/Stockholm/models_iter_3000.solverstate
I0428 12:40:15.820426  3236 solver.cpp:318] Iteration 3000, loss = 2.08741
I0428 12:40:15.820426  3236 solver.cpp:338] Iteration 3000, Testing net (#0)
I0428 12:41:50.398375  3236 solver.cpp:406]     Test net output #0: accuracy = 0.11914
I0428 12:41:50.398375  3236 solver.cpp:406]     Test net output #1: loss = 2.71476 (* 1 = 2.71476 loss)
I0428 12:41:50.398375  3236 solver.cpp:323] Optimization Done.
I0428 12:41:50.398375  3236 caffe.cpp:222] Optimization Done.
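
As a rough sanity check on the log above, the final numbers can be compared with what uniform guessing would give: for K equally likely classes, the expected accuracy is 1/K and the expected softmax loss is ln(K). The sketch below is only an illustration. The function name `chance_level` and the values of K are assumptions, since the number of classes is not stated in the question.

```python
import math


def chance_level(num_classes):
    """Return (accuracy, softmax loss) expected from uniform guessing."""
    accuracy = 1.0 / num_classes
    loss = math.log(num_classes)  # -ln(1/K) = ln(K)
    return accuracy, loss


# The K values below are hypothetical; the question does not give the class count.
for k in (2, 10, 1000):
    acc, loss = chance_level(k)
    print("K=%d: chance accuracy=%.4f, chance loss=%.4f" % (k, acc, loss))
```

If the reported test accuracy (0.11914 here) is close to 1/K for the actual number of classes, the network has probably learned little beyond guessing.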

(5) I run the following code in Python to classify a single image:

# set up Python environment: numpy for numerical routines, and matplotlib for plotting
import numpy as np
import matplotlib.pyplot as plt

# set display defaults
plt.rcParams['figure.figsize'] = (10, 10)        # large images
plt.rcParams['image.interpolation'] = 'nearest'  # don't interpolate: show square pixels
plt.rcParams['image.cmap'] = 'gray'  # use grayscale output rather than a (potentially misleading) color heatmap

# The caffe module needs to be on the Python path;
#  we'll add it here explicitly.
import sys
caffe_root = '../'  # this file should be run from {caffe_root}/examples (otherwise change this line)
sys.path.insert(0, caffe_root + 'python')

import caffe
# If you get "No module named _caffe", either you have not built pycaffe or you have the wrong path.


caffe.set_mode_cpu()

model_def = 'C:/Caffe/caffe-windows-master/models/bvlc_reference_caffenet/deploy.prototxt'
model_weights = 'Z:/DeepLearning/S1S2/Stockholm/models_iter_3000.caffemodel'

net = caffe.Net(model_def,      # defines the structure of the model
                model_weights,  # contains the trained weights
                caffe.TEST)     # use test mode (e.g., don't perform dropout)

# load the mean image file and convert it to a .npy file
blob = caffe.proto.caffe_pb2.BlobProto()
with open('Z:/DeepLearning/S1S2/Stockholm/S1S2train256.binaryproto', 'rb') as f:
    blob.ParseFromString(f.read())
nparray = caffe.io.blobproto_to_array(blob)
np.save('Z:/DeepLearning/PythonCalssification/imgmean.npy', nparray)


# load the dataset mean image (converted to .npy above) for subtraction
mu1 = np.load('Z:/DeepLearning/PythonCalssification/imgmean.npy')
mu1 = mu1.squeeze()
mu = mu1.mean(1).mean(1)  # average over pixels to obtain the mean (BGR) pixel values
print('mean-subtracted values: %s' % list(zip('BGR', mu)))
print('mean shape: %s' % (mu1.shape,))
print('data shape: %s' % (net.blobs['data'].data.shape,))

# create transformer for the input called 'data'
transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})

transformer.set_transpose('data', (2,0,1))  # move image channels to outermost dimension
transformer.set_mean('data', mu)            # subtract the dataset-mean value in each channel
transformer.set_raw_scale('data', 255)      # rescale from [0, 1] to [0, 255]
transformer.set_channel_swap('data', (2,1,0))  # swap channels from RGB to BGR

# set the size of the input (we can skip this if we're happy
#  with the default; we can also change it later, e.g., for different batch sizes)
net.blobs['data'].reshape(50,        # batch size
                          3,         # 3-channel (BGR) images
                          227, 227)  # image size is 227x227

#load image
image = caffe.io.load_image('Z:/DeepLearning/PythonCalssification/380.tiff')
transformed_image = transformer.preprocess('data', image)
#plt.imshow(image)

# copy the image data into the memory allocated for the net
net.blobs['data'].data[...] = transformed_image

### perform classification
output = net.forward()

output_prob = output['prob'][0]  # the output probability vector for the first image in the batch

print('predicted class is: %d' % output_prob.argmax())

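No matter which input image I use, I always get class "3" as the classification result. Here is a sample of the images I train on and classify: (image not included). I would be glad if anyone has an idea what is going wrong. Thanks in advance!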

score 2 · Accepted Answer

If you always get the same class, it means the NN was not trained properly.
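
To confirm that the predictions really collapse onto one class, it can help to classify a batch of different images and count which class wins each time. The sketch below is a minimal illustration with made-up probabilities; `predicted_class_counts` and `example_probs` are hypothetical names, and in the question's script the rows would come from `output['prob']`.

```python
from collections import Counter


def predicted_class_counts(prob_rows):
    """Count how often each class index is the argmax of a probability row."""
    counts = Counter()
    for row in prob_rows:
        best = max(range(len(row)), key=lambda i: row[i])
        counts[best] += 1
    return counts


# Hypothetical outputs for four different images over five classes.
example_probs = [
    [0.10, 0.10, 0.10, 0.60, 0.10],
    [0.05, 0.15, 0.10, 0.55, 0.15],
    [0.20, 0.10, 0.10, 0.50, 0.10],
    [0.10, 0.20, 0.10, 0.45, 0.15],
]
counts = predicted_class_counts(example_probs)
print(counts)  # every row peaks at index 3
```

A single dominant key in the counter, together with nearly identical probability rows, suggests the network is ignoring its input rather than the preprocessing being wrong for one particular image.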

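Make sure the training set is balanced. When a classifier always predicts the same class, it is usually because that class is over-represented relative to the others. For example, suppose you have two classes, the first represented by 95 instances and the second by 5. If the classifier assigns everything to the first class, it is already 95% accurate.
One obvious thing is that you should normalize the input with image / 255.0 - 0.5, which centers the input and reduces its standard deviation.
After that, make sure the amount of data in the training set is at least 4 times the number of weights in the NN.
Last but not least, make sure the training set is properly shuffled.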
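
The checks above can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the answerer's code: the helper names (`class_balance`, `normalize_pixels`, `shuffled_pairs`) and the toy labels are hypothetical, and pixels are treated as a flat list of uint8 values.

```python
import random
from collections import Counter


def class_balance(labels):
    """Return the fraction of samples belonging to each class."""
    counts = Counter(labels)
    total = float(len(labels))
    return {cls: n / total for cls, n in counts.items()}


def normalize_pixels(pixels):
    """Map uint8 values in [0, 255] to the range [-0.5, 0.5]."""
    return [p / 255.0 - 0.5 for p in pixels]


def shuffled_pairs(samples, labels, seed=0):
    """Shuffle samples and labels together so each pair stays aligned."""
    pairs = list(zip(samples, labels))
    random.Random(seed).shuffle(pairs)
    return pairs


# Hypothetical toy data: 95 samples of class 0 and 5 of class 1.
labels = [0] * 95 + [1] * 5
samples = list(range(len(labels)))

balance = class_balance(labels)
print(balance)                      # class 0 dominates the set
print(normalize_pixels([0, 255]))   # endpoints of the normalized range
pairs = shuffled_pairs(samples, labels)
```

In a Caffe workflow these steps are typically applied while building the database and in the data layer's transform settings rather than in a separate script, so the functions here are only meant to make the three checks concrete.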
