I don't know how to clean and vectorize my data.
import pandas as pd

train = pd.read_csv('longilati.csv', encoding='mac_roman')
train
Index(['Comment ', 'Polarity'], dtype='object')
My DataFrame contains the following data:
However, whenever I try to clean the data with the following code:
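import re
import numpy as np

def remove_pattern(text, pattern):
    # Find every match of the pattern, then delete each match from the text.
    r = re.findall(pattern, text)
    for i in r:
        text = re.sub(i, "", text)
    return text

# Strip @handles from each comment.
train['Tidy'] = np.vectorize(remove_pattern)(train['Comment'], r"@[\w]*")
train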
I get this error: KeyError: 'Comment'
Here is the full stack trace:
KeyError                                  Traceback (most recent call last)
F:\Anaconda\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2645             try:
-> 2646                 return self._engine.get_loc(key)
   2647             except KeyError:
F:\Anaconda\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   2646                 return self._engine.get_loc(key)
   2647             except KeyError:
-> 2648                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   2649         indexer = self.get_indexer([key], method=method, tolerance=tolerance)
   2650         if indexer.ndim > 1 or indexer.size > 1:
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
pandas\_libs\hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()
KeyError: 'Comment'
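
One thing worth checking: the Index output above lists the column as 'Comment ' with a trailing space, so train['Comment'] may be failing on the label itself rather than on the cleaning code. Below is a minimal sketch under that assumption. The small in-memory DataFrame is a hypothetical stand-in for longilati.csv, and the sample comments are made up for illustration. It also swaps the findall loop and np.vectorize for a single re.sub call applied through Series.apply, which is a substitution and not the original approach.

```python
import re

import pandas as pd

# Hypothetical stand-in for longilati.csv; note the trailing space in 'Comment '.
train = pd.DataFrame({
    "Comment ": ["@user1 great place", "no handle here", "thanks @bob_2 !"],
    "Polarity": [1, 0, 1],
})

# Strip stray whitespace from every column label so 'Comment ' becomes 'Comment'.
train.columns = train.columns.str.strip()


def remove_pattern(text, pattern):
    """Remove every match of the regex `pattern` from `text`."""
    return re.sub(pattern, "", text)


# Series.apply calls remove_pattern once per row, passing the pattern via args.
train["Tidy"] = train["Comment"].apply(remove_pattern, args=(r"@\w*",))

print(train.columns.tolist())
print(train["Tidy"].tolist())
```

Stripping the labels once, right after read_csv, means every later lookup can use the clean name. Calling re.sub with the pattern directly also avoids feeding matched text back in as a regular expression, which could misbehave if a match contained regex metacharacters.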