
I recently converted some slow Python code into a C extension. It works well, except that it segfaults on the 162nd call, right at the return statement.

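Here is how it works. Once, before any calls to the function that does the computation, I load the data into memory (remembering to INCREF the parent object):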

static PyObject *grm_loadDosage(PyObject *self, PyObject *args) {
 /***
 Load the dosage matrix into global memory
 Global variables: DOSAGES_PYMAT - will be a pointer to the PyArrayObject of the dosage array (must incref this)
                   DOSAGES - will be a pointer to the double array underlying DOSAGES_PYMAT
 ***/
 PyArrayObject *dosageMatrixArr;
 if ( ! PyArg_ParseTuple(args,"O",&DOSAGES_PYMAT_OBJ) ) return NULL;
 if ( NULL == DOSAGES_PYMAT_OBJ ) return NULL;

 Py_INCREF(DOSAGES_PYMAT_OBJ);

 /* from PyObject to Python Array */
 dosageMatrixArr = pymatrix(DOSAGES_PYMAT_OBJ);
 /* get the row and col sizes */
 N_VARIANTS = dosageMatrixArr->dimensions[0];
 N_SAMPLES = dosageMatrixArr->dimensions[1];
 DOSAGES = pymatrix_to_Carray(dosageMatrixArr);
 Py_RETURN_TRUE;
}

(For the C array methods, see http://www.scipy.org/Cookbook/C_Extensions/NumPy_arrays.) Then, in the function that I call from Python, I reference the loaded double[][], DOSAGES:

static PyObject *grm_calcdistance(PyObject *self, PyObject *args) {
 /** Given column indices (samples) of DOSAGES, and an array of row indices (the variants missing for one or both),
     calculate the distance **/
 int samI,samJ,nMissing, *missing;
 PyObject *missingObj;
 PyArrayObject *missingArr;
 printf("debug1\n");
 if ( ! PyArg_ParseTuple(args,"iiOi",&samI,&samJ,&missingObj,&nMissing) ) return NULL;
 if ( NULL == missingObj ) return NULL;
 missingArr = pyvector(missingObj);
 missing = pyvector_to_Carray(missingArr);
 double replaced1[nMissing];
 double replaced2[nMissing];
 printf("debug2\n");

 int missingVectorIdx;
 int missingVariantIdx;
 // for each sample, store the dosage at the missing site (as it could be missing
 // in the OTHER sample), and replace it with 0.0 in the dosage matrix
 for ( missingVectorIdx = 0; missingVectorIdx < nMissing; missingVectorIdx++ ) {
  printf("debugA: %d < %d\n",missingVectorIdx,nMissing);
  missingVariantIdx = missing[missingVectorIdx];
  replaced1[missingVariantIdx] = DOSAGES[missingVariantIdx][samI];
  replaced2[missingVariantIdx] = DOSAGES[missingVariantIdx][samJ];
  printf("debugB\n");
  DOSAGES[missingVariantIdx][samI]=0.0;
  DOSAGES[missingVariantIdx][samJ]=0.0;
 }

 // calculate the distance (this uses DOSAGES which we just modified)
 double distance = _calcDistance(samI,samJ);

 printf("debug3\n");
 // put the values that we replaced with 0.0 back into the matrix
 for ( missingVectorIdx = 0; missingVectorIdx < nMissing; missingVectorIdx++ ) {
  missingVariantIdx = missing[missingVectorIdx];
  DOSAGES[missingVariantIdx][samI] = replaced1[missingVariantIdx];
  DOSAGES[missingVariantIdx][samJ] = replaced2[missingVariantIdx];
 }
 printf("debug4: %f\n",distance);
 // grab the python object wrapper and return it
 PyObject * distPy = PyFloat_FromDouble((double)distance);
 printf("debug5\n");
 if ( NULL == distPy )
  printf("and is NULL\n");
 return distPy;

}

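Using a lot of debug statements (as you can see), I have localized the segfault to the return statement. That is, it happens after the Python float object is created, but somewhere between the return from C and the next line executed in Python (which is, you guessed it, a print("debugReturned")). What I see on stdout is: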

debug4: -0.025160
debug5
Segmentation fault

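So the double is not a strange value, and the Python object is created correctly and is not NULL, yet something goes wrong between returning from C and continuing in Python. Online sources suggest this could be an INCREF/DECREF problem, but they also state that PyFloat_FromDouble() and Py_BuildValue("f",double) return new references, so no INCREF is needed. Both options produce the same result. Although I am reasonably sure that I need to INCREF the PyObject holding my matrix in grm_loadDosage, I have tried it both with and without the INCREF, and the behavior is the same.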

Any ideas?

Thanks

Also, the stack trace:

#0  0x0000000000000000 in ?? ()
#1  0x000000000045aa5c in PyEval_EvalFrameEx (f=0x2aaae1ae3f60, throwflag=<value optimized out>) at Python/ceval.c:2515
#2  0x000000000045ecb4 in call_function (f=0x3fb7494970227c55, throwflag=<value optimized out>) at Python/ceval.c:4009
#3  PyEval_EvalFrameEx (f=0x3fb7494970227c55, throwflag=<value optimized out>) at Python/ceval.c:2692
#4  0x000000000045ecb4 in call_function (f=0x95c880, throwflag=<value optimized out>) at Python/ceval.c:4009
#5  PyEval_EvalFrameEx (f=0x95c880, throwflag=<value optimized out>) at Python/ceval.c:2692
#6  0x000000000045f626 in PyEval_EvalCodeEx (_co=0x98abe0, globals=<value optimized out>, locals=<value optimized out>, args=0x0, argcount=0, kws=0x0, kwcount=0, defs=0x0, defcount=0, kwdefs=0x0,
    closure=0x0) at Python/ceval.c:3350
#7  0x000000000045f74b in PyEval_EvalCode (co=0x146b098, globals=0x71, locals=0xc7) at Python/ceval.c:767
#8  0x0000000000482fab in run_mod (fp=0x881b80, filename=0x2aaaae257de0 "/humgen/gsa-hphome1/chartl/projects/t2d/gcta/resources/bin/cGRM/calculateGRM.py", start=<value optimized out>,
    globals=0x81e340, locals=0x81e340, closeit=1, flags=0x7fffffffbfd0) at Python/pythonrun.c:1783
#9  PyRun_FileExFlags (fp=0x881b80, filename=0x2aaaae257de0 "/humgen/gsa-hphome1/chartl/projects/t2d/gcta/resources/bin/cGRM/calculateGRM.py", start=<value optimized out>, globals=0x81e340,
    locals=0x81e340, closeit=1, flags=0x7fffffffbfd0) at Python/pythonrun.c:1740
#10 0x0000000000483268 in PyRun_SimpleFileExFlags (fp=<value optimized out>, filename=0x2aaaae257de0 "/humgen/gsa-hphome1/chartl/projects/t2d/gcta/resources/bin/cGRM/calculateGRM.py", closeit=1,
    flags=0x7fffffffbfd0) at Python/pythonrun.c:1265
#11 0x00000000004964d7 in run_file (argc=<value optimized out>, argv=0x7df010) at Modules/main.c:297
#12 Py_Main (argc=<value optimized out>, argv=0x7df010) at Modules/main.c:692
#13 0x000000000041563e in main (argc=11, argv=0x7fffffffc148) at ./Modules/python.c:59

1 Answer


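I suggest running valgrind on your code; see "How can I use valgrind with Python C++ extensions?" for how to do that with Python. I am not sure how useful the exception list given there is for Python 3. In any case, ignore any output from parts that do not involve any of your own files.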

If you are on Windows, I would recommend one of the tools listed in "Is there a good Valgrind substitute for Windows?"

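Your debugging tells me that the error occurs after your function returns. I see no reason why the return statement itself should be a problem. According to your stack trace, the fault originates in code that belongs to Python 3 itself. I assume Python 3 itself has no bug; you could try installing a different version of Python 3 to rule that out further. That is why I assume you have corrupted the stack or the heap, and that is where valgrind comes in handy.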
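
To illustrate the kind of stack corruption I have in mind: in grm_calcdistance, replaced1 and replaced2 hold nMissing elements each but are indexed with missingVariantIdx, which is a row index into DOSAGES. If those row indices can be larger than nMissing (an assumption, since I cannot see your inputs), the writes land outside the arrays and could overwrite saved registers or the return address, which would fit a crash right after the return. Below is a minimal sketch of the same save/zero/restore pattern, indexed by position in the missing vector instead. It is plain C without the Python API, and the fixed-size DOSAGES, calc_distance and masked_distance are hypothetical stand-ins, not your actual code.

```c
#include <assert.h>

/* Hypothetical stand-in for the question's global DOSAGES matrix
   (rows are variants, columns are samples). */
enum { N_VARIANTS = 6, N_SAMPLES = 3 };
static double DOSAGES[N_VARIANTS][N_SAMPLES];

/* Hypothetical stand-in for _calcDistance: dot product of two sample columns. */
static double calc_distance(int samI, int samJ)
{
    double total = 0.0;
    for (int v = 0; v < N_VARIANTS; v++)
        total += DOSAGES[v][samI] * DOSAGES[v][samJ];
    return total;
}

/* Zero the missing variants for both samples, compute the distance,
   then put the saved values back. */
double masked_distance(int samI, int samJ, const int *missing, int nMissing)
{
    /* One slot per entry of `missing`; a zero-length VLA is undefined,
       so keep at least one element. */
    int nSlots = nMissing > 0 ? nMissing : 1;
    double replaced1[nSlots];
    double replaced2[nSlots];

    for (int k = 0; k < nMissing; k++) {
        int variant = missing[k];
        assert(variant >= 0 && variant < N_VARIANTS);
        /* Index the save buffers by k (position in `missing`),
           not by the variant row index. */
        replaced1[k] = DOSAGES[variant][samI];
        replaced2[k] = DOSAGES[variant][samJ];
        DOSAGES[variant][samI] = 0.0;
        DOSAGES[variant][samJ] = 0.0;
    }

    double distance = calc_distance(samI, samJ);

    for (int k = 0; k < nMissing; k++) {
        int variant = missing[k];
        DOSAGES[variant][samI] = replaced1[k];
        DOSAGES[variant][samJ] = replaced2[k];
    }
    return distance;
}
```

Calling masked_distance(0, 1, missing, 2) with missing = {4, 5} only ever touches replaced1[0] and replaced1[1], whereas indexing by the variant index would write to elements 4 and 5 of a two-element array.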

answered 2012-11-06T08:52:22.903