Short version: why does (compounds+k)->spectra->peaks change after the for loop has finished iterating (for 4 out of roughly 2500 compounds)?
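Long version: I have a function that goes through all my data (held in the struct chromatogram, chrom). Entries with type == 1 are copied directly into a new struct (compounds). For entries with type == 2, the function checks whether the remaining data contains highly similar chromatograms and, if so, sums/averages them into a single entry in compounds. The program worked fine with the dataset I had when I originally wrote it, but with more recent data I am hitting a bug: the integer that tracks how many 'values' a spectrum contains is somehow reset to 0 after the enclosing for loop ends. I hope it becomes clearer after reading my code (pay special attention to the 2 prints at the end that demonstrate the problem).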
chromatogram*
spectral_matcher(chromatogram* chrom, arguments* args) {
    int i, j, k = 0, l, m, size, counter = 0;
    float low_mass, high_mass, low_time, high_time;
    chromatogram* compounds;
    compounds = calloc(MAX_SPECTRA,sizeof(chromatogram));
    for (i = 0; i < chrom->hits-1; i++) {
        if ( (chrom+i)->type == 1) {
            /* Adding the MS1 spectrum to output set */
            chrom_copy(chrom, compounds, i, k);
            k++;
        } else if ((chrom+i)->used != 1 && (chrom+i)->type == 2) {
            /* Adding the MSn spectrum to output set */
            chrom_copy(chrom, compounds, i, k);
            /* Acquiring search parameters */
            low_mass = (chrom+i)->precursor - args->mass_tolerance;
            high_mass = (chrom+i)->precursor + args->mass_tolerance;
            low_time = (chrom+i)->time - args->time_tolerance;
            high_time = (chrom+i)->time + args->time_tolerance;
            /* Performing search for matching spectra */
            for (j = i+1; j < chrom->hits; j++) {
                if ( (chrom+j)->type == 2
                     && (chrom+j)->precursor > low_mass && (chrom+j)->precursor < high_mass
                     && (chrom+j)->time > low_time && (chrom+j)->time < high_time
                     && (chrom+i)->spectra->peaks > 10 && (chrom+j)->spectra->peaks > 10
                     && (chrom+j)->used != 1) {
                    /* the KS test can only be performed if the previous if statement was true */
                    if (pdf_ks((chrom+i)->pdf,(chrom+j)->pdf, 1.0) == 1) {
                        if (args->verbose == 1) {
                            printf("Matching spectrum %i with %i into %i\n",i, j, k);
                        }
                        // De magicks - Photo Finish
                        counter++;
                        l = (compounds+k)->spectra->peaks;
                        size = (compounds+k)->spectra->peaks + (chrom+j)->spectra->peaks;
                        (compounds+k)->spectra->peaks = size;
                        m = 0;
                        /* `l` is at the end of original spectra, append values starting from `l` */
                        for (; l < size; l++) {
                            ((compounds+k)->spectra+l)->mz_value = ((chrom+j)->spectra+m)->mz_value;
                            ((compounds+k)->spectra+l)->int_value = ((chrom+j)->spectra+m)->int_value;
                            m++;
                        }
                        // mark the 'matched' spectrum as used so there will be no duplicates
                        (chrom+j)->used = 1;
                    }
                }
            }
            k++;
        }
        /* k was incremented in either the if or else if so doing -1 here */
        printf("%i: values [ %i ] contains a value set [ %f - %f ]\n", k-1, (compounds+k-1)->spectra->peaks, ((compounds+k-1)->spectra+5000)->mz_value,((compounds+k-1)->spectra+5000)->int_value);
    }
    printf("BREAKPOINT\n");
    printf("%i spectra summed\n",counter);
    compounds->hits = k;
    for (i = 0; i < compounds->hits; i++) {
        printf("%i: values [ %i ] contains a value set [ %f - %f ]\n", i, (compounds+i)->spectra->peaks, ((compounds+i)->spectra+5000)->mz_value,((compounds+i)->spectra+5000)->int_value);
    }
    exit(0);
    return(compounds);
}
I know which 4 compounds show the odd behaviour I described above, so here are the matching lines from the output:
736: values [ 16481 ] contains a value set [ 765.000000 - 0.000000 ]
847: values [ 16481 ] contains a value set [ 765.000000 - 5843.000000 ]
1810: values [ 16481 ] contains a value set [ 765.000000 - 0.000000 ]
2212: values [ 16481 ] contains a value set [ 765.000000 - 0.000000 ]
BREAKPOINT
736: values [ 0 ] contains a value set [ 765.000000 - 905.625000 ]
847: values [ 0 ] contains a value set [ 765.000000 - 905.625000 ]
1810: values [ 0 ] contains a value set [ 765.000000 - 905.625000 ]
2212: values [ 0 ] contains a value set [ 765.000000 - 905.625000 ]
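The values stored in the array even appear to change along with the count.
However, the entries surrounding the 4 affected ones are still correct: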
735: values [ 44801 ] contains a value set [ 556.250000 - 0.000000 ]
736: values [ 16481 ] contains a value set [ 765.000000 - 0.000000 ]
737: values [ 131848 ] contains a value set [ 765.000000 - 0.000000 ]
BREAKPOINT
735: values [ 44801 ] contains a value set [ 556.250000 - 0.000000 ]
736: values [ 0 ] contains a value set [ 765.000000 - 905.625000 ]
737: values [ 131848 ] contains a value set [ 765.000000 - 0.000000 ]
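One thing worth ruling out is that the append loop writes past the end of a spectra buffer. The neighbouring entry reports 131848 values, and the count itself lives in element 0 of each spectra array, so an overrun from one entry could plausibly land on another entry's count. This is only a guess, because the allocation done by chrom_copy is not shown here. The sketch below is a bounds-checked version of the append step. The peak struct, the append_peaks name, and the dst_capacity parameter are all hypothetical stand-ins, not part of the original code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the peak type; the real struct is not shown.
   As in the code above, element 0 carries the count for the whole spectrum. */
typedef struct {
    int   peaks;
    float mz_value;
    float int_value;
} peak;

/* Append src's peaks to dst without writing past dst_capacity elements.
   Returns 0 on success, -1 if the counts are invalid or the merge
   would not fit (in which case dst is left untouched). */
int append_peaks(peak *dst, size_t dst_capacity, const peak *src)
{
    assert(dst != NULL && src != NULL);

    if (dst->peaks < 0 || src->peaks < 0) {
        return -1;                      /* corrupted count */
    }
    size_t old_count = (size_t)dst->peaks;
    size_t add_count = (size_t)src->peaks;

    if (old_count + add_count > dst_capacity) {
        return -1;                      /* would write outside dst */
    }
    for (size_t m = 0; m < add_count; m++) {
        dst[old_count + m].mz_value  = src[m].mz_value;
        dst[old_count + m].int_value = src[m].int_value;
    }
    dst->peaks = (int)(old_count + add_count);
    return 0;
}
```

Calling something like this in place of the inner for loop, and printing i, j and k whenever it returns -1, would show whether any merge exceeds the space that was actually allocated.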
If anyone has any pointers or hints about what to check next, I would greatly appreciate it.
-- May 16 (4:20) --
I tried manually breaking out of the for loop at a specific value of i, by adding if (i == 806) { break; } to the code, to see where the data changes. This yielded:
736: values [ 16481 ] contains a value set [ 765.000000 - 0.000000 ]
BREAKPOINT
736: values [ 0 ] contains a value set [ 765.000000 - 905.625000 ]
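Breaking out at a single value of i only tests one iteration at a time. A more systematic way to find where the data changes is to remember the count each finished entry had when it was written, then re-check all earlier entries after every iteration. The sketch below assumes minimal stand-in structs (the real chromatogram has more fields), and first_changed and the expected array are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal stand-ins for the real structs. */
typedef struct {
    int   peaks;
    float mz_value;
    float int_value;
} peak;

typedef struct {
    peak *spectra;
} chromatogram;

/* Compare the live peak count of entries [0, n) against the counts
   recorded when each entry was finished. Returns the index of the
   first entry whose count has changed, or -1 if all still match. */
long first_changed(const chromatogram *compounds, const int *expected, size_t n)
{
    assert(compounds != NULL && expected != NULL);

    for (size_t i = 0; i < n; i++) {
        if (compounds[i].spectra->peaks != expected[i]) {
            return (long)i;
        }
    }
    return -1;
}
```

In the outer loop this would mean storing expected[k-1] = (compounds+k-1)->spectra->peaks once an entry is complete, then calling first_changed(compounds, expected, k) at the end of each iteration and printing i when the result is not -1. The first i reported is the iteration that clobbers the earlier entry.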
-- May 17 --
I also checked whether the i and k counters were doing anything odd, but they look perfectly fine (inside the for loop):
I: 803 K: 734
I: 804 K: 735
I: 805 K: 736 /* The iteration which shows wrong data AFTER the for loop closes */
I: 806 K: 737
I: 807 K: 738
I: 808 K: 739