python - TypeError in scikit-learn CountVectorizer

Question

I'm trying to do some text analysis with scikit-learn, but calling CountVectorizer's fit_transform raises an error. The sample code and the resulting traceback are below:

    >>> from sklearn.feature_extraction.text import CountVectorizer
    >>> corpus = [
    ...     'This is the first document.',
    ...     'This is the second second document.',
    ...     'And the third one.',
    ...     'Is this the first document?',
    ... ]
    >>> vectorizer = CountVectorizer(min_df=1)
    >>> X = vectorizer.fit_transform(corpus)
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/Library/Python/2.6/site-packages/sklearn/feature_extraction/text.py", line 789, in fit_transform
        vocabulary, X = self._count_vocab(raw_documents, self.fixed_vocabulary)
      File "/Library/Python/2.6/site-packages/sklearn/feature_extraction/text.py", line 716, in _count_vocab
        vocabulary = defaultdict(None)
    TypeError: first argument must be callable

Is something wrong with my installation, or is it something else? The other examples run fine.

score 1 · Accepted Answer

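To summarize the discussion in the comments: this is a bug in Python 2.6.1, not in scikit-learn or your installation of it. That interpreter rejects `defaultdict(None)`, which is exactly the call that fails in the traceback. The bug was fixed in later releases of Python 2.6 (and in 2.7+, 3.2+, ...), so upgrading the interpreter resolves the error.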
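To check whether a given interpreter is affected, you can try the failing call on its own, without scikit-learn. This is only a minimal sketch: the helper name `defaultdict_accepts_none` is made up for illustration, and the only thing it tests is the `defaultdict(None)` call shown in the traceback.

```python
import sys
from collections import defaultdict


def defaultdict_accepts_none():
    """Return True if this interpreter allows defaultdict(None).

    Hypothetical helper, not part of scikit-learn or the standard library.
    """
    try:
        # The same call that fails in the traceback from the question.
        defaultdict(None)
    except TypeError:
        return False
    return True


if __name__ == "__main__":
    # Report the interpreter version next to the result of the check.
    print(sys.version_info[:3], defaultdict_accepts_none())
```

If the check returns False, the interpreter has the bug and CountVectorizer will fail as in the question; on a fixed interpreter it returns True.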
