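I have a dataset of roughly 17,000 images, each 3000x1700 pixels in size.
I have two ways of computing the mean and standard deviation of the dataset: a pixel-wise method and an image-wise method. The pixel-wise method simply computes the mean and standard deviation over all pixels in the dataset (using Welford's online algorithm, because there are far too many pixels in total to hold in memory at once). The image-wise method computes the mean and variance of each image in the dataset and then averages them (applying sqrt to the averaged variance to obtain the standard deviation).
In general, the two methods obviously will not return the same results, because the image-wise method gives images of different sizes the same weight. However, all images in my dataset have exactly the same size, so in my case I would expect the two algorithms to return the same mean and standard deviation.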
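To make the two procedures precise, this is how they could be written down for a single colour channel. The notation is an assumption introduced here for illustration only: $N$ images, $M$ pixels per image, and $x_{ij}$ for pixel $j$ of image $i$.

```latex
\begin{aligned}
\text{pixel-wise:}\quad
  \mu &= \frac{1}{NM}\sum_{i=1}^{N}\sum_{j=1}^{M} x_{ij},
  &\sigma^2 &= \frac{1}{NM}\sum_{i=1}^{N}\sum_{j=1}^{M}\bigl(x_{ij}-\mu\bigr)^2 \\[4pt]
\text{per image:}\quad
  \mu_i &= \frac{1}{M}\sum_{j=1}^{M} x_{ij},
  &\sigma_i^2 &= \frac{1}{M}\sum_{j=1}^{M}\bigl(x_{ij}-\mu_i\bigr)^2 \\[4pt]
\text{image-wise:}\quad
  \bar{\mu} &= \frac{1}{N}\sum_{i=1}^{N}\mu_i,
  &\bar{\sigma}^2 &= \frac{1}{N}\sum_{i=1}^{N}\sigma_i^2
\end{aligned}
```

With the same $M$ for every image, $\bar{\mu} = \mu$ follows directly from the definitions. The quantities being compared for the standard deviation are $\sigma$ and $\sqrt{\bar{\sigma}^2}$; note that each $\sigma_i^2$ is measured around that image's own mean $\mu_i$, whereas $\sigma^2$ is measured around the global mean $\mu$.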
import os

import numpy as np
from PIL import Image


class RunningStatisticsVar:
    """Running mean/variance using a batched form of Welford's online algorithm."""

    def __init__(self, ddof=0, name="value"):
        self.name = name  # label used by __str__ (it was referenced but never set)
        self.mean = 0.0
        self.var = 0.0
        self.std = 0.0
        self._n = 0    # number of values seen so far
        self._s = 0.0  # running sum of squared deviations from the mean
        self._ddof = ddof

    def update(self, values):
        # Cast to float64 so uint8 pixel data cannot wrap around in the subtractions.
        values = np.array(values, ndmin=1, dtype=np.float64)
        n = len(values)
        self._n += n

        delta = values - self.mean                        # deviations from the old mean
        self.mean += (delta / self._n).sum()
        self._s += (delta * (values - self.mean)).sum()   # old-mean times new-mean deviations

        self.var = self._s / (self._n - self._ddof) if self._n > self._ddof else 0
        self.std = np.sqrt(self.var)

    def __str__(self):
        if self.std:
            return f"{self.name} (\u03BC \u00B1 \u03C3): {self.mean} \u00B1 {self.std}"
        else:
            return f"{self.name}: {self.mean}"


def calculate_mean_and_std(source_dir, pixelwise=True):
    source_fs = [os.path.join(source_dir, f) for f in os.listdir(source_dir)]

    if pixelwise:
        # One running accumulator per colour channel, fed image by image.
        stats = {colour: RunningStatisticsVar() for colour in 'rgb'}
        for source in source_fs:
            _process_image_pixelwise(source, stats)

        print(f"\u03BC: {[stats[colour].mean for colour in 'rgb']}")
        print(f"\u03C3: {[stats[colour].std for colour in 'rgb']}")
    else:
        # Per-image mean and variance, then running averages over the first i+1 images.
        means, variances = zip(*(_process_image(source) for source in source_fs))
        means = np.array(means)
        variances = np.array(variances)

        print(f"\u03BC: {[[means[:i+1, c].mean() for c in range(3)] for i in range(len(means))]}")
        print(f"\u03C3: {[[np.sqrt(variances[:i+1, c].mean()) for c in range(3)] for i in range(len(variances))]}")


def _process_image(source_f):
    # Returns the per-channel mean and variance of a single image.
    img = np.array(Image.open(source_f))
    return img.mean(axis=(0, 1)), img.var(axis=(0, 1))


def _process_image_pixelwise(source_f, stats):
    img = np.array(Image.open(source_f))
    for c, colour in enumerate('rgb'):
        stats[colour].update(img[:, :, c].flatten())


calculate_mean_and_std('/path/to/dataset', True)
calculate_mean_and_std('/path/to/dataset', False)
Now, when I run this code, it starts out fine. After the first image it reports:
Pixelwise
μ: [106.0911049019608, 67.80728647058824, 45.90995117647062]
σ: [41.59660208236723, 34.5791272546266, 26.781512448936052]
Imagewise
μ: [106.09110490196079, 67.80728647058824, 45.909951176470585]
σ: [41.596602082409774, 34.579127254617084, 26.78151244893938]
The σ values differ only very slightly. However, as more images are used, this discrepancy becomes larger and larger:
Pixelwise
μ: [101.21394647058824, 65.62210166666667, 46.48841911764708]
σ: [40.41893639932673, 33.3795015706431, 26.946594699203892]
μ: [101.21668875816994, 65.61455176470588, 45.89977104575165]
σ: [39.97502569826382, 33.022993336061994, 27.14311280813053]
μ: [102.36255480392157, 65.52340710784313, 45.7435418137255]
σ: [39.24577507415308, 32.14645170321761, 26.46386419015868]
μ: [103.53420298039215, 66.6776919607843, 47.33793435294118]
σ: [39.50097527065126, 32.707463761715545, 27.354952039420294]
μ: [101.47019483660131, 64.78871330065358, 45.96562905228758]
σ: [39.004169349180536, 32.23864799529698, 26.801524930594407]
μ: [101.37475316526611, 64.89486834733893, 45.781395238095236]
σ: [39.01205310849703, 31.98696402185754, 26.461034312514027]
μ: [100.67556311274511, 64.35973938725489, 45.18712225490196]
σ: [39.4077244026275, 32.09956875358911, 26.236057166690983]
μ: [99.85794368191722, 63.74056908496731, 44.48019496732026]
σ: [39.711590893165045, 32.16450280203762, 26.02186219040715]
μ: [99.51202813725492, 63.552036745098036, 44.69082580392157]
σ: [39.67742805536442, 32.360665575523704, 26.47325733239239]
μ: [99.68222427807488, 63.67247614973262, 44.77697739750446]
σ: [39.704289274013554, 32.212222956614625, 26.25452243516258]
Imagewise
μ: [101.21394647058824, 65.62210166666667, 46.488419117647055]
σ: [40.12360583609732, 33.30789834971034, 26.940384940174805]
μ: [101.21668875816994, 65.61455176470588, 45.899771045751635]
σ: [39.77618485504031, 32.97475731200615, 27.126232255358936]
μ: [102.36255480392157, 65.52340710784313, 45.74354181372549]
σ: [39.04354601878276, 32.108905824505584, 26.44949551028786]
μ: [103.53420298039217, 66.6776919607843, 47.33793435294118]
σ: [39.27047375795457, 32.596298448583816, 27.157260809967887]
μ: [101.47019483660131, 64.78871330065358, 45.96562905228759]
σ: [38.53431972571801, 31.865963477681383, 26.4560983662707]
μ: [101.37475316526611, 64.89486834733893, 45.78139523809524]
σ: [38.60904936868449, 31.66418210935895, 26.157487873733803]
μ: [100.6755631127451, 64.3597393872549, 45.18712225490196]
σ: [39.01506478558764, 31.78679779866631, 25.920704631785057]
μ: [99.85794368191722, 63.74056908496732, 44.48019496732026]
σ: [39.29746139951266, 31.839074510873296, 25.66162709110527]
μ: [99.5120281372549, 63.55203674509804, 44.69082580392156]
σ: [39.290881869376754, 32.064732326778504, 26.14723081778954]
μ: [99.68222427807487, 63.672476149732624, 44.77697739750445]
σ: [39.34959988736086, 31.939785095367306, 25.95437650452569]
Finally, after running the procedure on the whole dataset, the discrepancy becomes very significant:
Pixelwise
μ: [101.8700454273955, 74.8429931459155, 64.76057496565133]
σ: [48.662522310391594, 41.730824001340146, 39.50453890881377]
Imagewise
μ: [101.87004542739516, 74.84299314591567, 64.76057496565105]
σ: [36.19880052359236, 32.38430878761532, 31.063098948011394]
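As you can see, even on the whole dataset the means of the two methods are almost identical. The standard deviations, however, are far apart, even though I would expect them to be the same for both methods. Which of my two methods is at fault here, and why?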
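One way to narrow this down without the full dataset is to run both procedures on small synthetic data, where `np.std` over all pixels at once can act as a reference. The sketch below is only illustrative: the random single-channel "images", their sizes, and the helper names `welford_batch_update` and `compare_methods` are assumptions, not part of the code above. It also computes a fourth quantity, the averaged per-image variance plus the variance of the per-image means, since each per-image variance is taken around that image's own mean rather than the global one.

```python
import numpy as np


def welford_batch_update(state, values):
    """Batched Welford step; mirrors the logic of RunningStatisticsVar.update."""
    n_total, mean, s = state
    values = np.asarray(values, dtype=np.float64).ravel()
    n_total += values.size
    delta = values - mean                      # deviations from the old mean
    mean += delta.sum() / n_total
    s += (delta * (values - mean)).sum()       # old-mean times new-mean deviations
    return n_total, mean, s


def compare_methods(images):
    """images: array of shape (num_images, height, width), one colour channel."""
    images = np.asarray(images, dtype=np.float64)

    # Reference: everything in memory at once (only feasible for small data).
    true_std = images.std()

    # Pixel-wise: stream the images through the running accumulator.
    state = (0, 0.0, 0.0)
    for img in images:
        state = welford_batch_update(state, img)
    pixelwise_std = np.sqrt(state[2] / state[0])

    # Image-wise: average the per-image variances, then take the square root.
    means = images.mean(axis=(1, 2))
    variances = images.var(axis=(1, 2))
    imagewise_std = np.sqrt(variances.mean())

    # Image-wise plus the spread of the per-image means around the global mean.
    combined_std = np.sqrt(variances.mean() + means.var())

    return true_std, pixelwise_std, imagewise_std, combined_std


# Hypothetical synthetic data: 50 equal-sized images with different brightness levels.
rng = np.random.default_rng(0)
offsets = rng.uniform(50, 150, size=(50, 1, 1))
images = np.clip(offsets + rng.normal(0, 20, size=(50, 32, 48)), 0, 255)

true_std, pixelwise_std, imagewise_std, combined_std = compare_methods(images)
print("reference :", true_std)
print("pixel-wise:", pixelwise_std)
print("image-wise:", imagewise_std)
print("image-wise + variance of means:", combined_std)
```

In this sketch the pixel-wise result matches the reference, while the plain image-wise result comes out smaller, and the gap closes once the variance of the per-image means is added back. That is consistent with the numbers above, where the two methods agree after one image and drift apart as images with different mean brightness are included, but it is a check on synthetic data, not a verification on the real dataset.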