我正在对类实例列表(stl::list)进行在线破坏性集群(集群替换集群对象)。
背景
我当前的 percepUnits 列表是:stl::list<percepUnit> units;
对于每次迭代,我都会得到一个新的输入 percepUnits 列表,这些输入 percepUnitsstl::list<percepUnit> scratch;
需要与单位进行聚类。
我想保持固定数量的 percepUnits(所以 units.size() 是恒定的),所以对于每个新的临时 percepUnit,我需要将它与最近的 percepUnit 合并。下面是一个代码片段,它构建了一个dists
结构 ( percepUnitDist
) 的列表 (),其中包含指向从头和单位中的每对项目percepDist.scratchUnit = &(*scratchUnit);
及其percepDist.unit = &(*unit);
距离的指针。此外,对于从头开始的每个项目,我都会跟踪units 中的哪个项目具有最小的距离minDists
。
// For every scratch percepUnit:
for (scratchUnit = scratch.begin(); scratchUnit != scratch.end(); scratchUnit++) {
float minDist=2025.1172; // This is the max possible distance in unnormalized CIELuv, and much larger than the normalized dist.
// For every percepUnit:
for (unit = units.begin(); unit != units.end(); unit++) {
// compare pairs
float dist = featureDist(*scratchUnit, *unit, FGBG);
//cout << "distance: " << dist << endl;
// Put pairs in a structure that caches their distances
percepUnitDist percepDist;
percepDist.scratchUnit = &(*scratchUnit); // address of where scratchUnit points to.
percepDist.unit = &(*unit);
percepDist.dist = dist;
// Figure out the percepUnit that is closest to this scratchUnit.
if (dist < minDist)
minDist = dist;
dists.push_back(percepDist); // append dist struct
}
minDists.push_back(minDist); // append the min distance to the nearest percepUnit for this particular scratchUnit.
}
所以现在我只需要遍历其中的percepUnitDist
项目dists
并将距离与最小距离进行匹配,以确定哪个 percepUnit 应该与哪个 percepUnit 合并。合并过程mergePerceps()
会创建一个新的 percepUnit,它是从头和单位中的“父”percepUnits 的加权平均值。
问题
我想用由 mergePerceps() 构造的新 percepUnit 替换单位列表中的实例,但我想在循环遍历 percepUnitDists 的上下文中这样做。这是我当前的代码:
// Loop through dists and merge all the closest pairs.
// Loop through all dists
for (distIter = dists.begin(); distIter != dists.end(); distIter++) {
// Loop through all minDists for each scratchUnit.
for (minDistsIter = minDists.begin(); minDistsIter != minDists.end(); minDistsIter++) {
// if this is the closest cluster, and the closest cluster has not already been merged, and the scratch has not already been merged.
if (*minDistsIter == distIter->dist and not distIter->scratchUnit->remove) {
percepUnit newUnit;
mergePerceps(*(distIter->scratchUnit), *(distIter->unit), newUnit, FGBG);
*(distIter->unit) = newUnit; // replace the cluster with the new merged version.
distIter->scratchUnit->remove = true;
}
}
}
我认为我可以通过 percepUnitDist 指针用新的 percepUnit 实例替换单位中的实例*(distIter->unit) = newUnit;
,但这似乎不起作用,因为我看到内存泄漏,这意味着单位中的实例没有被替换。
如何删除单位列表中的 percepUnit 并将其替换为新的 percepUnit 实例,以便新单位位于同一位置?
编辑1
这是 percepUnit 类。注意 cv::Mat 成员。以下是它所依赖的 mergePerceps() 函数和 mergeImages() 函数:
// Function to construct an accumulation.
void clustering::mergeImages(Mat &scratch, Mat &unit, cv::Mat &merged, const string maskOrImage, const string FGBG, const float scratchWeight, const float unitWeight) {
int width, height, type=CV_8UC3;
Mat scratchImagePad, unitImagePad, scratchImage, unitImage;
// use the resolution and aspect of the largest of the pair.
if (unit.cols > scratch.cols)
width = unit.cols;
else
width = scratch.cols;
if (unit.rows > scratch.rows)
height = unit.rows;
else
height = scratch.rows;
if (maskOrImage == "mask")
type = CV_8UC1; // single channel mask
else if (maskOrImage == "image")
type = CV_8UC3; // three channel image
else
cout << "maskOrImage is not 'mask' or 'image'\n";
merged = Mat(height, width, type, Scalar::all(0));
scratchImagePad = Mat(height, width, type, Scalar::all(0));
unitImagePad = Mat(height, width, type, Scalar::all(0));
// weight images before summation.
// because these pass by reference, they mess up the images in memory!
scratch *= scratchWeight;
unit *= unitWeight;
// copy images into padded images.
scratch.copyTo(scratchImagePad(Rect((scratchImagePad.cols-scratch.cols)/2,
(scratchImagePad.rows-scratch.rows)/2,
scratch.cols,
scratch.rows)));
unit.copyTo(unitImagePad(Rect((unitImagePad.cols-unit.cols)/2,
(unitImagePad.rows-unit.rows)/2,
unit.cols,
unit.rows)));
merged = scratchImagePad+unitImagePad;
}
// Merge two perceps and return a new percept to replace them.
void clustering::mergePerceps(percepUnit scratch, percepUnit unit, percepUnit &mergedUnit, const string FGBG) {
Mat accumulation;
Mat accumulationMask;
Mat meanColour;
int x, y, w, h, area;
float l,u,v;
int numMerges=0;
std::vector<float> featuresVar; // Normalized, Sum, Variance.
//float featuresVarMin, featuresVarMax; // min and max variance accross all features.
float scratchWeight, unitWeight;
if (FGBG == "FG") {
// foreground percepts don't get merged as much.
scratchWeight = 0.65;
unitWeight = 1-scratchWeight;
} else {
scratchWeight = 0.85;
unitWeight = 1-scratchWeight;
}
// Images TODO remove the meanColour if needbe.
mergeImages(scratch.image, unit.image, accumulation, "image", FGBG, scratchWeight, unitWeight);
mergeImages(scratch.mask, unit.mask, accumulationMask, "mask", FGBG, scratchWeight, unitWeight);
mergeImages(scratch.meanColour, unit.meanColour, meanColour, "image", "FG", scratchWeight, unitWeight); // merge images
// Position and size.
x = (scratch.x1*scratchWeight) + (unit.x1*unitWeight);
y = (scratch.y1*scratchWeight) + (unit.y1*unitWeight);
w = (scratch.w*scratchWeight) + (unit.w*unitWeight);
h = (scratch.h*scratchWeight) + (unit.h*unitWeight);
// area
area = (scratch.area*scratchWeight) + (unit.area*unitWeight);
// colour
l = (scratch.l*scratchWeight) + (unit.l*unitWeight);
u = (scratch.u*scratchWeight) + (unit.u*unitWeight);
v = (scratch.v*scratchWeight) + (unit.v*unitWeight);
// Number of merges
if (scratch.numMerges < 1 and unit.numMerges < 1) { // both units are patches
numMerges = 1;
} else if (scratch.numMerges < 1 and unit.numMerges >= 1) { // unit A is a patch, B a percept
numMerges = unit.numMerges + 1;
} else if (scratch.numMerges >= 1 and unit.numMerges < 1) { // unit A is a percept, B a patch.
numMerges = scratch.numMerges + 1;
cout << "merged scratch??" <<endl;
// TODO this may be an impossible case.
} else { // both units are percepts
numMerges = scratch.numMerges + unit.numMerges;
cout << "Merging two already merged Percepts" <<endl;
// TODO this may be an impossible case.
}
// Create unit.
mergedUnit = percepUnit(accumulation, accumulationMask, x, y, w, h, area); // time is the earliest value in times?
mergedUnit.l = l; // members not in the constrcutor.
mergedUnit.u = u;
mergedUnit.v = v;
mergedUnit.numMerges = numMerges;
mergedUnit.meanColour = meanColour;
mergedUnit.pActivated = unit.pActivated; // new clusters retain parent's history of activation.
mergedUnit.scratch = false;
mergedUnit.habituation = unit.habituation; // we inherent the habituation of the cluster we merged with.
}
编辑2
更改复制和赋值运算符会产生性能副作用,并且似乎无法解决问题。所以我添加了一个自定义函数来进行替换,就像复制操作员复制每个成员并确保这些副本很深。问题是我仍然会泄漏。
所以我改变了这一行:*(distIter->unit) = newUnit;
对此:(*(distIter->unit)).clone(newUnit)
其中clone方法如下:
// Deep Copy of members
void percepUnit::clone(const percepUnit &source) {
// Deep copy of Mats
this->image = source.image.clone();
this->mask = source.mask.clone();
this->alphaImage = source.alphaImage.clone();
this->meanColour = source.meanColour.clone();
// shallow copies of everything else
this->alpha = source.alpha;
this->fadingIn = source.fadingIn;
this->fadingHold = source.fadingHold;
this->fadingOut = source.fadingOut;
this->l = source.l;
this->u = source.u;
this->v = source.v;
this->x1 = source.x1;
this->y1 = source.y1;
this->w = source.w;
this->h = source.h;
this->x2 = source.x2;
this->y2 = source.y2;
this->cx = source.cx;
this->cy = source.cy;
this->numMerges = source.numMerges;
this->id = source.id;
this->area = source.area;
this->features = source.features;
this->featuresNorm = source.featuresNorm;
this->remove = source.remove;
this->fgKnockout = source.fgKnockout;
this->colourCalculated = source.colourCalculated;
this->normalized = source.normalized;
this->activation = source.activation;
this->activated = source.activated;
this->pActivated = source.pActivated;
this->habituation = source.habituation;
this->scratch = source.scratch;
this->FGBG = source.FGBG;
}
然而,我仍然看到内存增加。如果我注释掉那条替换行,则不会发生增加。所以我还是卡住了。
编辑3
如果我在上面的函数中禁用 cv::Mat 克隆代码,我可以防止内存增加:
// Deep Copy of members
void percepUnit::clone(const percepUnit &source) {
/* try releasing Mats first?
// No effect on memory increase, but the refCount is decremented.
this->image.release();
this->mask.release();
this->alphaImage.release();
this->meanColour.release();*/
/* Deep copy of Mats
this->image = source.image.clone();
this->mask = source.mask.clone();
this->alphaImage = source.alphaImage.clone();
this->meanColour = source.meanColour.clone();*/
// shallow copies of everything else
this->alpha = source.alpha;
this->fadingIn = source.fadingIn;
this->fadingHold = source.fadingHold;
this->fadingOut = source.fadingOut;
this->l = source.l;
this->u = source.u;
this->v = source.v;
this->x1 = source.x1;
this->y1 = source.y1;
this->w = source.w;
this->h = source.h;
this->x2 = source.x2;
this->y2 = source.y2;
this->cx = source.cx;
this->cy = source.cy;
this->numMerges = source.numMerges;
this->id = source.id;
this->area = source.area;
this->features = source.features;
this->featuresNorm = source.featuresNorm;
this->remove = source.remove;
this->fgKnockout = source.fgKnockout;
this->colourCalculated = source.colourCalculated;
this->normalized = source.normalized;
this->activation = source.activation;
this->activated = source.activated;
this->pActivated = source.pActivated;
this->habituation = source.habituation;
this->scratch = source.scratch;
this->FGBG = source.FGBG;
}
编辑4
虽然我仍然无法解释这个问题,但我确实注意到了另一个提示。我意识到,如果我不规范化那些通过 featureDist() 用于集群的功能(但继续克隆 cv::Mats),也可以停止这种泄漏。真正奇怪的是,我完全重写了该代码,但问题仍然存在。
这是 featureDist 函数:
float clustering::featureDist(percepUnit unitA, percepUnit unitB, const string FGBG) {
float distance=0;
if (FGBG == "BG") {
for (unsigned int i=0; i<unitA.featuresNorm.rows; i++) {
distance += pow(abs(unitA.featuresNorm.at<float>(i) - unitB.featuresNorm.at<float>(i)),0.5);
//cout << "unitA.featuresNorm[" << i << "]: " << unitA.featuresNorm[i] << endl;
//cout << "unitB.featuresNorm[" << i << "]: " << unitB.featuresNorm[i] << endl;
}
// for FG, don't use normalized colour features.
// TODO To include the area use i=4
} else if (FGBG == "FG") {
for (unsigned int i=4; i<unitA.features.rows; i++) {
distance += pow(abs(unitA.features.at<float>(i) - unitB.features.at<float>(i)),0.5);
}
} else {
cout << "FGBG argument was not FG or BG, returning 0." <<endl;
return 0;
}
return pow(distance,2);
}
特征曾经是浮点数的向量,因此规范化代码如下:
void clustering::normalize(list<percepUnit> &scratch, list<percepUnit> &units) {
list<percepUnit>::iterator unit;
list<percepUnit*>::iterator unitPtr;
vector<float> min,max;
list<percepUnit*> masterList; // list of pointers.
// generate pointers
for (unit = scratch.begin(); unit != scratch.end(); unit++)
masterList.push_back(&(*unit)); // add pointer to where unit points to.
for (unit = units.begin(); unit != units.end(); unit++)
masterList.push_back(&(*unit)); // add pointer to where unit points to.
int numFeatures = masterList.front()->features.size(); // all percepts have the same number of features.
min.resize(numFeatures); // allocate for the number of features we have.
max.resize(numFeatures);
// Loop through all units to get feature values
for (int i=0; i<numFeatures; i++) {
min[i] = masterList.front()->features[i]; // starting point.
max[i] = min[i];
// calculate min and max for each feature.
for (unitPtr = masterList.begin(); unitPtr != masterList.end(); unitPtr++) {
if ((*unitPtr)->features[i] < min[i])
min[i] = (*unitPtr)->features[i];
if ((*unitPtr)->features[i] > max[i])
max[i] = (*unitPtr)->features[i];
}
}
// Normalize features according to min/max.
for (int i=0; i<numFeatures; i++) {
for (unitPtr = masterList.begin(); unitPtr != masterList.end(); unitPtr++) {
(*unitPtr)->featuresNorm[i] = ((*unitPtr)->features[i]-min[i]) / (max[i]-min[i]);
(*unitPtr)->normalized = true;
}
}
}
我将特征类型更改为 cv::Mat 以便可以使用 opencv 归一化函数,因此我重写了归一化函数,如下所示:
void clustering::normalize(list<percepUnit> &scratch, list<percepUnit> &units) {
Mat featureMat = Mat(1,units.size()+scratch.size(), CV_32FC1, Scalar(0));
list<percepUnit>::iterator unit;
// For each feature
for (int i=0; i< units.begin()->features.rows; i++) {
// for each unit in units
int j=0;
float value;
for (unit = units.begin(); unit != units.end(); unit++) {
// Populate featureMat j is the unit index, i is the feature index.
value = unit->features.at<float>(i);
featureMat.at<float>(j) = value;
j++;
}
// for each unit in scratch
for (unit = scratch.begin(); unit != scratch.end(); unit++) {
// Populate featureMat j is the unit index, i is the feature index.
value = unit->features.at<float>(i);
featureMat.at<float>(j) = value;
j++;
}
// Normalize this featureMat in place
cv::normalize(featureMat, featureMat, 0, 1, NORM_MINMAX);
// set normalized values in percepUnits from featureMat
// for each unit in units
j=0;
for (unit = units.begin(); unit != units.end(); unit++) {
// Populate percepUnit featuresNorm, j is the unit index, i is the feature index.
value = featureMat.at<float>(j);
unit->featuresNorm.at<float>(i) = value;
j++;
}
// for each unit in scratch
for (unit = scratch.begin(); unit != scratch.end(); unit++) {
// Populate percepUnit featuresNorm, j is the unit index, i is the feature index.
value = featureMat.at<float>(j);
unit->featuresNorm.at<float>(i) = value;
j++;
}
}
}
我无法理解 mergePercepts 和规范化之间的相互作用,特别是因为规范化是一个完全重写的函数。
更新
Massif 和我的 /proc 内存报告不同意。Massif 说标准化对内存使用没有影响,只有注释掉 percepUnit::clone() 操作才能绕过泄漏。
这是所有代码,以防交互在我遗漏的其他地方。
这是相同代码的另一个版本,删除了对 OpenCV GPU 的依赖,以方便测试...