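Although pandas lets you read a column with df.Col, this is apparently just shorthand for df['Col'], and the shorthand does not work for creating new columns. You need to write mydf['D'] = 4.
I find this unfortunate, because I often try to do it the way you did. The insidious part is that the assignment actually creates an ordinary Python attribute called D on the DataFrame object. It is not added as a column. So you have to make sure you delete that attribute, or it will shadow the column even if you later add the column correctly: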
>>> import numpy as np
>>> import pandas
>>> d = pandas.DataFrame(np.random.randn(3, 2), columns=["A", "B"])
>>> d
A B
0 -0.931675 1.029137
1 -0.363033 -0.227672
2 0.058903 -0.362436
>>> d.Col = 8
>>> d.Col # Attribute is there
8
>>> d['Col'] # But it is not a column, just a simple attribute
Traceback (most recent call last):
File "<pyshell#8>", line 1, in <module>
d['Col']
File "c:\users\brenbarn\documents\python\extensions\pandas\pandas\core\frame.py", line 1906, in __getitem__
return self._get_item_cache(key)
File "c:\users\brenbarn\documents\python\extensions\pandas\pandas\core\generic.py", line 570, in _get_item_cache
values = self._data.get(item)
File "c:\users\brenbarn\documents\python\extensions\pandas\pandas\core\internals.py", line 1383, in get
_, block = self._find_block(item)
File "c:\users\brenbarn\documents\python\extensions\pandas\pandas\core\internals.py", line 1525, in _find_block
self._check_have(item)
File "c:\users\brenbarn\documents\python\extensions\pandas\pandas\core\internals.py", line 1532, in _check_have
raise KeyError('no item named %s' % com.pprint_thing(item))
KeyError: u'no item named Col'
>>> d['Col'] = 100 # Create a real column
>>> d.Col # Attribute blocks access to column
8
>>> d['Col'] # Column is available via item access
0 100
1 100
2 100
Name: Col, dtype: int64
>>> del d.Col # Delete the attribute
>>> d.Col # Column is now available as an attribute (!)
0 100
1 100
2 100
Name: Col, dtype: int64
>>> d['Col'] # And still as an item
0 100
1 100
2 100
Name: Col, dtype: int64
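It may be a bit surprising that d.Col "works only after you delete it": that is, after you run del d.Col, a subsequent read of d.Col actually gives you the column. This is just a consequence of how Python's __getattr__ works. Python calls __getattr__ only when normal attribute lookup fails, so a plain attribute stored on the object always takes precedence over the column lookup. It is still somewhat unintuitive in this context.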
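To make the __getattr__ point concrete, here is a minimal sketch using a toy class instead of pandas. ToyFrame and its _columns dict are hypothetical names made up for illustration; this is not how pandas is actually implemented, only the same lookup order reproduced in plain Python.

```python
class ToyFrame:
    """Hypothetical stand-in for a DataFrame (not pandas' real implementation)."""

    def __init__(self, columns):
        # Ordinary attribute assignment: lands in the instance __dict__.
        self._columns = dict(columns)

    def __getitem__(self, key):
        return self._columns[key]

    def __setitem__(self, key, value):
        self._columns[key] = value

    def __getattr__(self, name):
        # Python calls this ONLY after normal attribute lookup has failed.
        if name == "_columns":
            # Guard against infinite recursion if _columns is not set yet.
            raise AttributeError(name)
        try:
            return self._columns[name]
        except KeyError:
            raise AttributeError(name) from None


d = ToyFrame({"A": [1, 2, 3]})

d.Col = 8                   # plain instance attribute, NOT a column
d["Col"] = [100, 100, 100]  # a real "column"

shadowed = d.Col            # normal lookup finds the attribute first
del d.Col                   # remove the plain attribute
unshadowed = d.Col          # normal lookup fails, so __getattr__ runs

print(shadowed)
print(unshadowed)
```

The first print shows the plain attribute and the second shows the column, which mirrors the session above. As far as I know, recent pandas versions also emit a UserWarning when you assign a new attribute like d.Col = 8, so the mistake is at least easier to spot there.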