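I am trying to implement face recognition with principal component analysis (PCA) in Python. One of the steps is to normalize the input (test) image T by subtracting the mean face vector m: n = T - m

Here is my code: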

import glob
import numpy as np
from PIL import Image

#Step1: put database images into a 2D array
filenames = glob.glob('C:\\Users\\Karim\\Downloads\\att_faces\\New folder/*.pgm')
filenames.sort()
img = [Image.open(fn).convert('L').resize((90, 90)) for fn in filenames]
images = np.asarray([np.array(im).flatten() for im in img])

#Step 2: find the mean image and the mean-shifted input images
m = images.mean(axis=0)
shifted_images = images - m

#Step 7: input image
input_image = Image.open('C:\\Users\\Karim\\Downloads\\att_faces\\1.pgm').convert('L').resize((90, 90))
T = np.asarray(input_image)
n = T - m  # raises the ValueError below: T is (90, 90), m is (8100,)

But I get an error:

Traceback (most recent call last):
  File "C:/Users/Karim/Desktop/Bachelor 2/New folder/new3.py", line 46, in <module>
    n = T - m
ValueError: operands could not be broadcast together with shapes (90,90) (8100)

2 Answers

mean_image (called m in the question) is computed from the flattened arrays:

images = np.asarray([np.array(im).flatten() for im in img])
mean_image = images.mean(axis=0)

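whereas input_image is 90x90. Hence the error. You should either flatten the input image as well, or not flatten the original images (I don't quite see why you do that), or reshape mean_image to 90x90 just for this operation.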
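
The three options can be sketched as follows. This is only an illustration: random arrays stand in for the real images (an assumption, so that it runs without the dataset), and the names stack, m_flat and m_2d are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the real data (assumption: 5 database images of 90x90 pixels).
stack = rng.random((5, 90, 90))          # unflattened images
images = stack.reshape(len(stack), -1)   # flattened, shape (5, 8100), as in the question
T = rng.random((90, 90))                 # the input (test) image

# Option 1: flatten the input image too.
m_flat = images.mean(axis=0)             # shape (8100,)
n1 = T.flatten() - m_flat                # shape (8100,)

# Option 2: do not flatten the database images.
m_2d = stack.mean(axis=0)                # shape (90, 90)
n2 = T - m_2d                            # shape (90, 90)

# Option 3: reshape the flat mean just for this subtraction.
n3 = T - m_flat.reshape(90, 90)          # shape (90, 90)

print(n1.shape, n2.shape, n3.shape)
```

All three give the same numbers; they differ only in whether the result is a flat vector or a 2D image.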

Answered 2013-04-15T13:21:41.173

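As @Lev said, you have flattened your arrays. You don't actually need to do that to compute the mean. Suppose you have a set of two 3x4 images; then you would have something like this: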

In [291]: b = np.random.rand(2,3,4)

In [292]: b.shape
Out[292]: (2, 3, 4)

In [293]: b
Out[293]: 
array([[[ 0.18827554,  0.11340471,  0.45185287,  0.47889188],
        [ 0.35961448,  0.38316556,  0.73464482,  0.37597429],
        [ 0.81647845,  0.28128797,  0.33138755,  0.55403119]],

       [[ 0.92025024,  0.55916671,  0.23892798,  0.59253267],
        [ 0.15664109,  0.12457157,  0.28139198,  0.31634361],
        [ 0.33420446,  0.27599807,  0.40336601,  0.67738928]]])

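Taking the mean along the first axis gives an array with the shape of a single image, and it broadcasts against the whole set: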

In [300]: b.mean(0)
Out[300]: 
array([[ 0.55426289,  0.33628571,  0.34539042,  0.53571227],
       [ 0.25812778,  0.25386857,  0.5080184 ,  0.34615895],
       [ 0.57534146,  0.27864302,  0.36737678,  0.61571023]])

In [301]: b - b.mean(0)
Out[301]: 
array([[[-0.36598735, -0.222881  ,  0.10646245, -0.0568204 ],
        [ 0.10148669,  0.129297  ,  0.22662642,  0.02981534],
        [ 0.24113699,  0.00264495, -0.03598923, -0.06167904]],

       [[ 0.36598735,  0.222881  , -0.10646245,  0.0568204 ],
        [-0.10148669, -0.129297  , -0.22662642, -0.02981534],
        [-0.24113699, -0.00264495,  0.03598923,  0.06167904]]])

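For many purposes this is also faster than keeping the images as a list of arrays, because the numpy operations run on one array rather than looping over a list. Most reductions, such as mean, sum and std, accept an axis argument, and you can pass a tuple of axes to reduce over several dimensions without flattening. (np.cov is an exception: it has no axis argument and expects an array of at most two dimensions.)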
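
As a small illustration of reducing over several axes at once, here is a sketch on an array shaped like b in the session above. The values are random and the variable names are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
b = rng.random((2, 3, 4))              # 2 images of 3x4 pixels

per_pixel_mean = b.mean(axis=0)        # mean image, shape (3, 4)
per_image_mean = b.mean(axis=(1, 2))   # one mean value per image, shape (2,)

# Same numbers as flattening each image first, but without the reshape.
flat_mean = b.reshape(2, -1).mean(axis=1)

print(per_pixel_mean.shape, per_image_mean.shape)
```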

To apply this to your script, I would do something like this, keeping the original dimensions:

# convert each PIL image to an array explicitly before stacking
images = np.asarray([np.asarray(Image.open(fn).convert('L').resize((90, 90))) for fn in filenames])
# so images.shape = (len(filenames), 90, 90)

m = images.mean(0)
# m.shape = (90, 90)
# numpy broadcasting will automatically subtract the (90, 90) mean image from each of the `images`
# shifted_images.shape = images.shape = (len(filenames), 90, 90)
shifted_images = images - m

#Step 7: input image
input_image = Image.open(...).convert('L').resize((90, 90))
T = np.asarray(input_image)
n = T - m
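
The same flow can be tried without PIL or the dataset. The sketch below assumes random uint8 arrays in place of the .pgm files. The final reshape is also an assumption about later steps: if they need one row vector per image, the flattening can simply be deferred until then.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumption: 10 made-up grayscale images stand in for the files in `filenames`.
images = rng.integers(0, 256, size=(10, 90, 90), dtype=np.uint8)

m = images.mean(0)               # float64 mean image, shape (90, 90)
shifted_images = images - m      # broadcasting; float64, shape (10, 90, 90)

# The input image has the same 2D shape, so the subtraction now works.
T = rng.integers(0, 256, size=(90, 90), dtype=np.uint8)
n = T - m                        # shape (90, 90)

# Flatten only where a later step wants one vector per image.
shifted_flat = shifted_images.reshape(len(shifted_images), -1)   # shape (10, 8100)
n_flat = n.ravel()                                               # shape (8100,)
```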

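As a final remark, if speed is a concern, joining the images with np.dstack is faster (the session below is Python 2, hence xrange; use range on Python 3):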

In [354]: timeit b = np.asarray([np.empty((50,100)) for i in xrange(1000)])
1 loops, best of 3: 824 ms per loop

In [355]: timeit b = np.dstack([np.empty((50,100)) for i in xrange(1000)]).transpose(2,0,1)
10 loops, best of 3: 118 ms per loop

But loading the images will probably take most of the time, and if that is the case you can ignore this.
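
Timings like these depend on the numpy version and the machine, so it may be worth re-measuring. A rough harness, assuming 200 empty 50x100 arrays as stand-ins for loaded images; unlike the session above, the list is built once outside the timed calls. np.stack is not part of the original comparison; it is included on the assumption that numpy 1.10 or later is available.

```python
import timeit
import numpy as np

# Assumption: 200 empty 50x100 arrays stand in for loaded images.
arrays = [np.empty((50, 100)) for _ in range(200)]

def with_asarray():
    return np.asarray(arrays)

def with_dstack():
    # dstack joins along a new last axis, so move it to the front.
    return np.dstack(arrays).transpose(2, 0, 1)

def with_stack():
    # np.stack joins along a new first axis by default.
    return np.stack(arrays)

for fn in (with_asarray, with_dstack, with_stack):
    seconds = timeit.timeit(fn, number=20)
    print(f"{fn.__name__}: {seconds / 20 * 1e3:.3f} ms per call")
```

All three produce an array of shape (200, 50, 100), so whichever measures fastest can be dropped into the loading step.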

Answered 2013-04-15T13:41:42.353