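I've developed a utility in Python/Cython that sorts CSV files and generates statistics for a client, but calling pool.map seems to raise an exception before my mapped function has a chance to execute. Sorting a small number of files works as expected, but once the number of files grows to about 10, I get the IndexError below after calling pool.map. Does anyone happen to recognize this error? Any help is greatly appreciated.

The code is under NDA, but the use case is fairly simple:

Code sample: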

import multiprocessing

def sort_files(csv_files):
    pool_size = multiprocessing.cpu_count()
    pool = multiprocessing.Pool(processes=pool_size)
    # chunksize=1: each worker receives one file at a time
    sorted_dicts = pool.map(sort_file, csv_files, 1)
    return sorted_dicts

def sort_file(csv_file):
    print 'sorting %s...' % csv_file
    # sort code

Output:

File "generic.pyx", line 17, in generic.sort_files (/users/cyounker/.pyxbld/temp.linux-x86_64-2.7/pyrex/generic.c:1723)
    sorted_dicts = pool.map(sort_file, csv_files, 1)
  File "/usr/lib64/python2.7/multiprocessing/pool.py", line 227, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/usr/lib64/python2.7/multiprocessing/pool.py", line 528, in get
    raise self._value
IndexError: list index out of range

2 Answers

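The IndexError is an error raised somewhere inside sort_file(), i.e. in a subprocess. The parent process then re-raises it. Apparently multiprocessing makes no attempt to tell us where the error really comes from (e.g. on which line it occurred), or even which argument to sort_file() caused it. I hate multiprocessing even more :-(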
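
One way to recover the missing details on Python 2.7 is to wrap the worker so that it formats its own traceback before the exception crosses the process boundary. The following is only a sketch: sort_file here is a hypothetical stand-in for the real (NDA) sort code, and sort_file_verbose and the "bad.csv" trigger are names made up for illustration.

```python
import multiprocessing
import traceback


def sort_file(csv_file):
    # Hypothetical stand-in for the real sort code: it fails on one
    # particular file name so the error path can be exercised.
    row = []
    if csv_file.endswith("bad.csv"):
        return row[0]  # raises IndexError, as in the question
    return {"file": csv_file}


def sort_file_verbose(csv_file):
    """Call sort_file; on failure, re-raise with the worker-side traceback."""
    try:
        return sort_file(csv_file)
    except Exception:
        # format_exc() runs inside the worker, so the text names the real
        # file and line. A RuntimeError carrying a single string pickles
        # cleanly on its way back to the parent process.
        raise RuntimeError("sort_file(%r) failed in worker:\n%s"
                           % (csv_file, traceback.format_exc()))


def sort_files(csv_files):
    pool = multiprocessing.Pool(processes=multiprocessing.cpu_count())
    try:
        # Map the wrapper instead of sort_file itself.
        return pool.map(sort_file_verbose, csv_files, 1)
    finally:
        pool.close()
        pool.join()
```

With this wrapper, the exception re-raised by pool.map carries both the offending argument and the original traceback text in its message. The trade-off is that the parent sees a RuntimeError rather than the original exception type.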

answered 2012-12-21T22:33:47.800

Check further up in the command output. In Python 3.4 at least, multiprocessing.pool helpfully prints a RemoteTraceback above the parent process traceback. You'll see something like:

multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/usr/lib/python3.4/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/usr/lib/python3.4/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "/path/to/your/code/here.py", line 80, in sort_file
    something = row[index]
IndexError: list index out of range
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "generic.pyx", line 17, in generic.sort_files (/users/cyounker/.pyxbld/temp.linux-x86_64-2.7/pyrex/generic.c:1723)
    sorted_dicts = pool.map(sort_file, csv_files, 1)
  File "/usr/lib64/python2.7/multiprocessing/pool.py", line 227, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/usr/lib64/python2.7/multiprocessing/pool.py", line 528, in get
    raise self._value
IndexError: list index out of range

In the case above, the code raising the error is at "/path/to/your/code/here.py", line 80.
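
If you would rather log that worker-side traceback than hunt for it in the console output, it can usually be read from the re-raised exception. This sketch assumes CPython 3.4 or later, where the pool attaches the RemoteTraceback as the exception's __cause__; that is an implementation detail rather than a documented guarantee. sort_file is again a hypothetical stand-in, and remote_traceback_text and main are names invented for this example.

```python
import multiprocessing
import traceback


def sort_file(csv_file):
    # Hypothetical stand-in for the real sort code; always fails.
    row = []
    return row[0]  # raises IndexError


def remote_traceback_text(exc):
    """Return the worker-side traceback if one is attached, else the local one."""
    cause = getattr(exc, "__cause__", None)
    if cause is not None:
        # For pool errors this is the RemoteTraceback, whose str() is the
        # traceback text that was formatted inside the worker.
        return str(cause)
    return "".join(
        traceback.format_exception(type(exc), exc, exc.__traceback__))


def main():
    with multiprocessing.Pool(processes=2) as pool:
        try:
            pool.map(sort_file, ["a.csv", "b.csv"], 1)
        except IndexError as exc:
            print(remote_traceback_text(exc))
```

Call main() under an `if __name__ == "__main__":` guard so that worker processes can import the module safely.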

See also: debugging errors in Python multiprocessing.

answered 2016-08-09T17:17:09.923