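The code below is taken from Natural Language Processing with Python, page 119: frequency of modals in different sections of the Brown Corpus. My problem is that it does not print the table the way the book shows, and I do not understand why. I am running Python 3.7.9 (64-bit), and all the required packages installed without problems.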

Frequency of modals in different sections of the Brown Corpus

import nltk
from nltk.corpus import brown

def tabulate(cfdist, words, categories):
    print('%-16s' % 'Category')
    for word in words:                  # column headings
        print('%6s' % word,)
    print()
    for category in categories:
        print('%-16s' % category,)      # row headings
        for word in words:              # for each word
            print('%6d' % cfdist[category][word])   # print table cell
        print()                         # end the row

cfd = nltk.ConditionalFreqDist(
        (genre, word)
        for genre in brown.categories()
        for word in brown.words(categories=genre))
genres = ['news', 'religion', 'hobbies', 'science_fiction', 'romance', 'humor']
modals = ['can', 'could', 'may', 'might', 'must', 'will']
tabulate(cfd, modals, genres)
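
A likely cause is that the listing was written for Python 2, where a trailing comma after a `print` statement suppresses the newline. In Python 3, `print('%6s' % word,)` is an ordinary function call with a harmless trailing comma, so every cell ends with a newline and lands on its own line. Below is a minimal sketch of a Python 3 version that uses `end=' '` instead. To keep it runnable without downloading the corpus, it uses a small dictionary of `Counter` objects named `toy_counts`, which is a hypothetical stand-in with made-up numbers, not real Brown Corpus counts. The same `tabulate` should work with the `cfd` object from the question, assuming `cfd[category][word]` returns an integer count.

```python
from collections import Counter


def tabulate(cfdist, words, categories):
    # end=' ' keeps the output on the current line; this replaces the
    # trailing comma of the Python 2 print statement.
    print('%-16s' % 'Category', end=' ')
    for word in words:                      # column headings
        print('%6s' % word, end=' ')
    print()                                 # end the header row
    for category in categories:
        print('%-16s' % category, end=' ')  # row heading
        for word in words:                  # one cell per word
            print('%6d' % cfdist[category][word], end=' ')
        print()                             # end the row


# Hypothetical toy data standing in for the ConditionalFreqDist.
toy_counts = {
    'news': Counter({'can': 3, 'will': 5}),
    'humor': Counter({'can': 1, 'will': 2}),
}

tabulate(toy_counts, ['can', 'will'], ['news', 'humor'])
```

With the real corpus, the call from the question, `tabulate(cfd, modals, genres)`, stays the same; only the `print` calls inside `tabulate` change.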