python - NLTK: TypeError: must be str, not list

Question

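I'm using newspaper3k in a Docker container. I downloaded all the required nltk data, but I get an error when I run article.nlp() and then access article.summary.

The same code works when I use it in a Flask application. Now I'm testing it on Django (+ DRF), and I get this error: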

web_1  |   File "/usr/local/lib/python3.6/site-packages/newspaper/article.py", line 361, in nlp
web_1  |     summary_sents = nlp.summarize(title=self.title, text=self.text, max_sents=max_sents)
web_1  |   File "/usr/local/lib/python3.6/site-packages/newspaper/nlp.py", line 45, in summarize
web_1  |     sentences = split_sentences(text)
web_1  |   File "/usr/local/lib/python3.6/site-packages/newspaper/nlp.py", line 157, in split_sentences
web_1  |     tokenizer = nltk.data.load('tokenizers/punkt/english.pickle')
web_1  |   File "/usr/local/lib/python3.6/site-packages/nltk/data.py", line 752, in load
web_1  |     opened_resource = _open(resource_url)
web_1  |   File "/usr/local/lib/python3.6/site-packages/nltk/data.py", line 877, in _open
web_1  |     return find(path_, path + [""]).open()
web_1  | TypeError: must be str, not list

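It seems that find has a problem with tokenizers/punkt/english.pickle, but when I check nltk_data, the file is there.

Do you have any idea where this might come from?

Update:

The code is very simple. This is my Django view: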
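One possible cause, offered as an assumption rather than a confirmed diagnosis: the last traceback frame evaluates `path + [""]`, and on Python 3.6 the message "must be str, not list" is what `+` raises when its left operand is a str and its right operand is a list. That would be consistent with `nltk.data.path` (normally a list of directories) having been rebound to a plain string somewhere in the Django project. The sketch below does not import nltk. It only mimics that one expression, and `extend_search_path` and `as_path_list` are hypothetical names introduced for illustration.

```python
# Hypothetical stand-in for the expression in the last traceback frame:
#     find(path_, path + [""])
# where `path` is expected to be a list of search directories.


def extend_search_path(path):
    """Mimic `path + [""]` from the traceback."""
    return path + [""]


def as_path_list(path):
    """Return `path` as a list, wrapping a bare string (hypothetical helper)."""
    if isinstance(path, str):
        return [path]
    return list(path)


# A list of directories works as intended.
good = extend_search_path(["/usr/share/nltk_data"])

# A bare string raises TypeError, the same exception class as in the traceback.
try:
    extend_search_path("/usr/share/nltk_data")
    failed = False
except TypeError:
    failed = True

# Wrapping the string in a list first avoids the error.
fixed = extend_search_path(as_path_list("/usr/share/nltk_data"))
```

If this assumption holds, checking `type(nltk.data.path)` inside the Django process should show `str` instead of `list`. Adding directories with `nltk.data.path.append(...)` rather than assigning a string keeps it a list.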

from newspaper import Article

article = Article(url, language=LANG)
article.download()
article.parse()
article.nlp()  # <---- The problem happens here most probably
article.summary

Since I'm using Django REST Framework, I serialize the summary with this field:

summary = serializers.CharField(max_length=5000, required=False)
