I have a pandas DataFrame called `predictions`. I built it from lists of memmap array items plus some ordinary lists:

    import pandas as pd
    predictions = pd.DataFrame({'cell': list_1, 'tree': list_2, 'predict': list_3, 'label': list_4})

Now I want to group this DataFrame by two of its columns and take the mean of a third column, like this:

    df = predictions.groupby(['tree', 'cell'])['predict'].mean()

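But it raises an error saying that a memmap array is unhashable, so the `groupby` cannot run. I really need `groupby` to work here; otherwise I would have to write two nested `for` loops, which would take forever because the data has 1,000,000 rows. Does anyone know a solution? Thanks.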

Edit: the `cell` and `tree` columns are lists of items taken from a memmap array, while `predict` and `label` are just ordinary lists. The list of memmap items for `cell` looks like this:

    [memmap([415], dtype=int32),
     memmap([143], dtype=int32),
     memmap([96], dtype=int32),
     memmap([432], dtype=int32),
     memmap([104], dtype=int32),
     memmap([76], dtype=int32),
     memmap([312], dtype=int32),
     memmap([143], dtype=int32),
     memmap([312], dtype=int32),
     memmap([64], dtype=int32),
     memmap([296], dtype=int32)]

The `predictions` DataFrame looks like this:

      cell  label  predict  tree
0    [415]      0        1  [19]
1    [143]      1        1  [22]
2     [96]      0        1  [19]
3    [432]      1        1  [12]
4    [104]      0        1  [21]
5     [76]      0        1  [19]
6    [312]      1        1  [22]
7    [143]      1        1  [22]
8    [312]      1        1  [22]
9     [64]      0        1  [18]
10   [296]      1        1  [22]

I get the following error:

predictions_target = predictions.groupby(['tree', 'cell'])['predict'].mean()
File "/usr/venv/local/lib/python2.7/site-packages/pandas/core/groupby.py", line 1015, in mean
return self._python_agg_general(f)
File "/usr/venv/local/lib/python2.7/site-packages/pandas/core/groupby.py", line 826, in _python_agg_general
return self._python_apply_general(f)
File "/usr/venv/local/lib/python2.7/site-packages/pandas/core/groupby.py", line 698, in _python_apply_general
self.axis)
File "/usr/venv/local/lib/python2.7/site-packages/pandas/core/groupby.py", line 1577, in apply
splitter = self._get_splitter(data, axis=axis)
File "/usr/venv/local/lib/python2.7/site-packages/pandas/core/groupby.py", line 1563, in _get_splitter
comp_ids, _, ngroups = self.group_info
File "pandas/src/properties.pyx", line 34, in pandas.lib.cache_readonly.__get__ (pandas/lib.c:44222)
File "/usr/venv/local/lib/python2.7/site-packages/pandas/core/groupby.py", line 1670, in group_info
comp_ids, obs_group_ids = self._get_compressed_labels()
File "/usr/venv/local/lib/python2.7/site-packages/pandas/core/groupby.py", line 1677, in _get_compressed_labels
all_labels = [ping.labels for ping in self.groupings]
File "/usr/venv/local/lib/python2.7/site-packages/pandas/core/groupby.py", line 2308, in labels
self._make_labels()
File "/usr/venv/local/lib/python2.7/site-packages/pandas/core/groupby.py", line 2319, in _make_labels
labels, uniques = algos.factorize(self.grouper, sort=self.sort)
File "/usr/venv/local/lib/python2.7/site-packages/pandas/core/algorithms.py", line 313, in factorize
labels = table.get_labels(vals, uniques, 0, na_sentinel, True)
File "pandas/src/hashtable_class_helper.pxi", line 843, in pandas.hashtable.PyObjectHashTable.get_labels (pandas/hashtable.c:14831)
TypeError: unhashable type: 'memmap'
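
The traceback shows `factorize` trying to hash each cell of the grouping columns, and each cell here holds a one-element array rather than a number. One possible workaround, sketched under assumptions, is to unwrap those one-element items into plain integers before building the DataFrame. In the sketch below, `cell_items` and `tree_items` are hypothetical stand-ins that use ordinary NumPy arrays instead of real memmap items (both are unhashable in the same way), and `to_scalars` is an illustrative helper, not part of pandas or NumPy.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for the memmap-backed lists in the question:
# every item is a one-element int32 array, like memmap([415], dtype=int32).
cell_items = [np.array([v], dtype=np.int32) for v in (415, 143, 96, 143)]
tree_items = [np.array([v], dtype=np.int32) for v in (19, 22, 19, 22)]
predict = [1, 0, 1, 1]
label = [0, 1, 0, 1]


def to_scalars(items):
    """Unwrap a list of one-element arrays into a 1-D array of plain integers."""
    return np.array([int(item[0]) for item in items])


# The grouping columns now hold hashable integers instead of array objects.
predictions = pd.DataFrame({
    'cell': to_scalars(cell_items),
    'tree': to_scalars(tree_items),
    'predict': predict,
    'label': label,
})

# Same groupby as in the question; the result is indexed by (tree, cell).
result = predictions.groupby(['tree', 'cell'])['predict'].mean()
print(result)
```

The unwrapping is a single pass over each list, so it should stay far cheaper than two nested loops over 1,000,000 rows. If an item can hold more than one element, this sketch does not apply as written.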