The IndexError on axis 0 seems very strange to me. Where is my mistake?

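If I do not rename the columns before setting the MultiIndex, it works (uncomment the line df = df.set_index([0, 1]) and comment out the three lines above it). Tested with both the stable and the development version.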

I am fairly new to Python and pandas, so any other suggestions for improvement are much appreciated.

import itertools
import datetime as dt

import numpy as np
import pandas as pd
from pandas.io.html import read_html


dfs = read_html('http://www.epexspot.com/en/market-data/auction/auction-table/2006-01-01/DE',
                attrs={'class': 'list hours responsive'},
                skiprows=1)

df = dfs[0]

hours = list(itertools.chain.from_iterable([[x, x] for x in range(1, 25)]))
df[0] = hours

df = df.rename(columns={0: 'a'})
df = df.rename(columns={1: 'b'})
df = df.set_index(['a', 'b'])
#df = df.set_index([0, 1])

today = dt.datetime(2006, 1, 1)
days = pd.date_range(today, periods=len(df.columns), freq='D')

colnames = [day.strftime(format='%Y-%m-%d') for day in days]
df.columns = colnames


Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/Users/user/Optional/pandas_stable_env/lib/python3.3/site-packages/pandas/core/frame.py", line 2099, in __setattr__
    super(DataFrame, self).__setattr__(name, value)
  File "properties.pyx", line 59, in pandas.lib.AxisProperty.__set__ (pandas/lib.c:29330)
  File "/Users/user/Optional/pandas_stable_env/lib/python3.3/site-packages/pandas/core/generic.py", line 656, in _set_axis
    self._data.set_axis(axis, labels)
  File "/Users/user/Optional/pandas_stable_env/lib/python3.3/site-packages/pandas/core/internals.py", line 1039, in set_axis
    block.set_ref_items(self.items, maybe_rename=maybe_rename)
  File "/Users/user/Optional/pandas_stable_env/lib/python3.3/site-packages/pandas/core/internals.py", line 93, in set_ref_items
    self.items = ref_items.take(self.ref_locs)
  File "/Users/user/Optional/pandas_stable_env/lib/python3.3/site-packages/pandas/core/index.py", line 395, in take
    taken = self.view(np.ndarray).take(indexer)
IndexError: index 7 is out of bounds for axis 0 with size 7
1 Answer

This is a very subtle bug. It will be fixed in the upcoming 0.13 release (out very soon) by: https://github.com/pydata/pandas/pull/5345.

As a workaround, you can do this after the set_index but before the column assignment:

df = pd.DataFrame(dict([(c, col) for c, col in df.iteritems()]))

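The internal state of the frame is off; it is the rename followed by the set_index that causes this, so the line above recreates the frame so that you can work with it.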
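To show where the rebuild sits in the sequence, here is a minimal sketch on a small made-up frame rather than the scraped EPEX table. The sample data and the name `rebuilt` are assumptions for illustration only, and it uses `items()`, which is the spelling in recent pandas releases (`iteritems()` is the equivalent in the version discussed here). On a fixed pandas the rebuild is unnecessary, so this only illustrates the order of the steps.

```python
import pandas as pd

# Hypothetical stand-in for the scraped table: integer column labels 0..3.
df = pd.DataFrame([[1, "x", 10.0, 11.0],
                   [1, "y", 20.0, 21.0],
                   [2, "x", 30.0, 31.0]])

# Same steps as in the question: rename, then build the MultiIndex.
df = df.rename(columns={0: "a", 1: "b"})
df = df.set_index(["a", "b"])

# Workaround: rebuild the frame from its own columns so that it gets
# fresh internal state before the column labels are reassigned.
rebuilt = pd.DataFrame(dict(df.items()))

# Now assign the date strings as column names.
days = pd.date_range("2006-01-01", periods=len(rebuilt.columns), freq="D")
rebuilt.columns = [day.strftime("%Y-%m-%d") for day in days]
```

Building the dict from the frame's columns keeps the row index because every column Series carries the same index, so only the column labels change in the last step.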

answered 2013-10-27T00:38:39.190