python - 'NoneType' object is not iterable when using the collocations function

Question

I am new to NLTK and am trying to return the collocations output. I get the output, but along with it I also get None. Below are my code, input, and output.

import nltk
from nltk.corpus import stopwords


def performBigramsAndCollocations(textcontent, word):
    stop_words = set(stopwords.words('english'))
    pattern = r'\w+'
    tokenizedwords = nltk.regexp_tokenize(textcontent, pattern)
    for i in range(len(tokenizedwords)):
        tokenizedwords[i] = tokenizedwords[i].lower()
    tokenizedwordsbigrams = nltk.bigrams(tokenizedwords)
    tokenizednonstopwordsbigrams = [ (w1, w2) for w1, w2 in tokenizedwordsbigrams if w1 not in stop_words and w2 not in stop_words]
    cfd_bigrams = nltk.ConditionalFreqDist(tokenizednonstopwordsbigrams)
    mostfrequentwordafter = cfd_bigrams[word].most_common(3)
    tokenizedwords = nltk.Text(tokenizedwords)
    collocationwords = tokenizedwords.collocations()
    return mostfrequentwordafter, collocationwords


if __name__ == '__main__':
    textcontent = input()

    word = input()


    mostfrequentwordafter, collocationwords = performBigramsAndCollocations(textcontent, word)
    print(sorted(mostfrequentwordafter, key=lambda element: (element[1], element[0]), reverse=True))
    print(sorted(collocationwords))

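Input: Thirty-five sports disciplines and four cultural activities will be offered during seven days of competitions. He skated with charisma, changing from one gear to another, from one direction to another, faster than a sports car. Armchair sports fans settling down to watch the Olympic Games could be in for the high jump if they do not pay their TV licence fee. Such invitational events will spark the interest of sports fans and so draw more viewers among sports fans. She hardly noticed a flashy sports car almost run them over, until Eddie lunged forward and grabbed her body. He flattered his mother, she was a little angry, and he talked her into a ride in the sports car.

sports

Output:
sports car; sports fans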

[('fans', 3), ('car', 3), ('disciplines', 1)]

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-191-40624b3de987> in <module>
     43     mostfrequentwordafter, collocationwords = performBigramsAndCollocations(textcontent, word)
     44     print(sorted(mostfrequentwordafter, key=lambda element: (element[1], element[0]), reverse=True))
---> 45     print(sorted(collocationwords))

TypeError: 'NoneType' object is not iterable

Can you help me solve this problem?

score 1 · Accepted Answer

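collocations() is the problem here: it prints the collocations and returns None, so there is nothing for sorted() to iterate over. I ran into this recently and was able to solve it with collocation_list(). Try this: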

collocationwords = tokenizedwords.collocation_list()
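
The difference between a method that prints and one that returns can be shown without NLTK installed. This is a minimal sketch, assuming two stand-in functions: `print_collocations` and `list_collocations` are hypothetical names, not NLTK APIs. The first mimics how `Text.collocations()` prints its result and returns nothing; the second returns the pairs so they can be sorted.

```python
def print_collocations(pairs):
    """Hypothetical stand-in: prints the pairs and implicitly returns None."""
    print("; ".join(f"{w1} {w2}" for w1, w2 in pairs))


def list_collocations(pairs):
    """Hypothetical stand-in: returns the pairs instead of printing them."""
    return list(pairs)


pairs = [("sports", "fans"), ("sports", "car")]

printed = print_collocations(pairs)   # output goes to the screen, value is None
returned = list_collocations(pairs)   # value is a real list

try:
    sorted(printed)                   # same failure as in the question
except TypeError as exc:
    error_message = str(exc)

sorted_pairs = sorted(returned)       # works: a list is iterable
```

Here `printed` is `None`, which is why passing it to `sorted()` raises the `TypeError` from the traceback, while `sorted_pairs` is an ordinary sorted list.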

score 1 · Accepted Answer

The code below should work.

import nltk
from nltk.corpus import stopwords


def performBigramsAndCollocations(textcontent, word):
    # Tokenize on runs of word characters and lowercase every token.
    tokenizedwords = [w.lower() for w in nltk.regexp_tokenize(textcontent, r'\w+')]
    tokenizedwordsbigrams = nltk.bigrams(tokenizedwords)
    # A set makes the stop-word membership tests fast.
    stop_words = set(stopwords.words('english'))
    tokenizednonstopwordsbigrams = [(w1, w2) for w1, w2 in tokenizedwordsbigrams
                                    if w1 not in stop_words and w2 not in stop_words]
    cfd_bigrams = nltk.ConditionalFreqDist(tokenizednonstopwordsbigrams)
    mostfrequentwordafter = cfd_bigrams[word].most_common(3)
    # collocation_list() returns the collocations; collocations() only prints them.
    collocationwords = nltk.Text(tokenizedwords).collocation_list()

    return mostfrequentwordafter, collocationwords
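
For readers wondering what the bigram and `ConditionalFreqDist` steps compute, here is a rough standard-library sketch of the same idea. It is not NLTK's implementation: `most_common_followers` is a hypothetical helper, and `STOP_WORDS` is a tiny illustrative stop list assumed for this example, far shorter than NLTK's English list.

```python
import re
from collections import Counter, defaultdict

# Tiny illustrative stop list (an assumption, not NLTK's English stop words).
STOP_WORDS = {"a", "the", "than", "and", "at", "his"}


def most_common_followers(text, word, n=3):
    """Count which non-stop words directly follow `word` in `text`."""
    tokens = [t.lower() for t in re.findall(r"\w+", text)]
    followers = defaultdict(Counter)
    # zip(tokens, tokens[1:]) yields adjacent pairs, i.e. the bigrams.
    for w1, w2 in zip(tokens, tokens[1:]):
        if w1 not in STOP_WORDS and w2 not in STOP_WORDS:
            followers[w1][w2] += 1
    return followers[word].most_common(n)


sample = ("Faster than a sports car. Sports fans cheered. "
          "Sports fans waved at the sports car and sports fans.")
result = most_common_followers(sample, "sports")
```

Each condition (the first word of a bigram) gets its own counter of following words, which is the role `cfd_bigrams[word]` plays in the answer above.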

score 0 · Accepted Answer

collocation_list() alone did not help. I tried the following and it worked for me.

collocationwords1 = tokenizedwords.collocation_list()

collocationwords=list()
for item in collocationwords1:
    newitem=item[0]+" "+item[1]
    collocationwords.append(newitem)
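
The loop above can also be written as a list comprehension. The sketch below is hedged: whether `collocation_list()` returns `(w1, w2)` tuples or ready-made strings may depend on the installed NLTK version, which is an assumption worth checking locally. The hypothetical helper `collocations_as_strings` therefore accepts either form, and the sample lists stand in for real NLTK output.

```python
def collocations_as_strings(items):
    """Return 'w1 w2' strings whether items are (w1, w2) tuples or strings."""
    return [item if isinstance(item, str) else " ".join(item) for item in items]


# Stand-in data for the two possible return shapes (assumed, not real NLTK output).
tuple_style = [("sports", "car"), ("sports", "fans")]
string_style = ["sports car", "sports fans"]

from_tuples = collocations_as_strings(tuple_style)
from_strings = collocations_as_strings(string_style)
```

Both calls produce the same list of space-joined strings, so the final `print(sorted(collocationwords))` in the question behaves the same either way.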

score -1 · Accepted Answer

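key transforms each item of the collection before the sort runs. key= really means "as I go through this list, I will -", so when you use key=lambda element: (element[1], element[0]) you are asking it to run twice. Try something like this instead. Note that this may not be entirely right, since it is 7 am and I just woke up; if it does not work for you, I will edit it later.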

mylist = [0,1]
print(sorted(mostfrequentwordafter, key=lambda element: (element[mylist]), reverse=True))
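
Two points about the snippet above are worth checking: indexing a tuple with a list, as in `element[mylist]`, raises a `TypeError`, and a tuple-valued `key` function is called once per item rather than twice. The small sketch below uses the counts from the question's output; `count_then_word` and `calls` are names introduced only for this illustration.

```python
pairs = [("car", 3), ("disciplines", 1), ("fans", 3)]
calls = []


def count_then_word(element):
    """Sort key: (count, word). Records each call so the call count can be checked."""
    calls.append(element)
    return (element[1], element[0])


# Tuples compare left to right, so ties on the count fall back to the word.
by_count_then_word = sorted(pairs, key=count_then_word, reverse=True)

# Indexing a tuple with a list is not supported.
try:
    pairs[0][[0, 1]]
except TypeError as exc:
    index_error = str(exc)
```

The sorted result matches the first line the question prints, which suggests the original `key` was already doing what the asker wanted and the error came only from the `None` returned by `collocations()`.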
