python - Joining WordLists in Python

Question

I need to extract n-grams from a text. I am using:

from textblob import TextBlob
text = TextBlob('me king of python')
print(text.ngrams(n=3))

to split the text ('me king of python') into trigrams, which gives:

[WordList(['me', 'king', 'of']), WordList(['king', 'of', 'python'])]

Now I need to join the items of each WordList, so I tried:

x = {word for word in ' '.join(text.ngrams(n=3)) }
print x

It gives me the following error:

TypeError: sequence item 0: expected string or Unicode, WordList found

I know the solution is probably trivial, but I'm not good at Python and I don't understand WordLists.

score 2 · Accepted Answer

Try this:

>>> from textblob import TextBlob
>>> blob = TextBlob('me king of python')
>>> trigram = blob.ngrams(n=3)
>>> for wlist in trigram:
...     print(' '.join(wlist))
me king of
king of python

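A for loop is the better choice here, because ngrams() returns a list of WordLists and ' '.join() needs to be applied to each WordList separately.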
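If the joined trigrams are needed as a collection rather than printed, the same per-WordList join can be written as a comprehension. This is a minimal sketch: plain lists stand in for the WordList objects (an assumption, made so the snippet runs without textblob installed), and the names `trigrams` and `joined` are only illustrative.

```python
# Plain lists stand in for the WordLists returned by blob.ngrams(n=3).
# This substitution is an assumption so the sketch runs without textblob.
trigrams = [['me', 'king', 'of'], ['king', 'of', 'python']]

# Join the words inside each trigram, rather than joining the trigrams
# themselves (which is what raised the TypeError in the question).
joined = [' '.join(wlist) for wlist in trigrams]
print(joined)  # ['me king of', 'king of python']
```

Swapping the square brackets for curly braces would give a set of joined strings instead of a list, which seems closer to what the question's set comprehension was aiming for.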

Update

The same result can be achieved in pure Python. Here is an example:

>>> def ngrams(s, n=2, i=0):
...     while len(s[i:i+n]) == n:
...         yield s[i:i+n]
...         i += 1
...
>>> grams = ngrams('me king of Python'.split())
>>> list(grams)
[['me', 'king'], ['king', 'of'], ['of', 'Python']]
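To get the joined trigram strings from the question without any library, the splitting and joining can be combined in one helper. This is only a sketch; `joined_ngrams` is a hypothetical name, not part of textblob, and it splits on whitespace only.

```python
def joined_ngrams(text, n=3):
    """Return each n-gram of `text` as a single space-joined string."""
    words = text.split()
    # There are len(words) - n + 1 windows of size n; range() is empty
    # when the text has fewer than n words.
    return [' '.join(words[i:i + n]) for i in range(len(words) - n + 1)]

print(joined_ngrams('me king of python'))  # ['me king of', 'king of python']
```

A text with fewer than `n` words simply yields an empty list.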
