0

I only need to print the 'NN' and 'VB' words from the sentence the user enters.

import nltk

# Read one sentence from the user (input() replaces Python 2's raw_input()).
var = input("Please enter something: ")

exampleArray = [var]


def processLanguage():
    try:
        for item in exampleArray:
            # Split the sentence into tokens, then tag each token with its part of speech.
            tokenized = nltk.word_tokenize(item)
            tagged = nltk.pos_tag(tokenized)
            print(tagged)

    except Exception as e:
        print(str(e))

processLanguage()
4

3 Answers

5

How about replacing the line that prints tagged with a filtered list:

    print([(word, tag) for word, tag in tagged if tag in ('NN', 'VB')])
answered 2015-07-04T13:25:14.127
1

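You may want to match on the first two characters of the POS tag; see "NLTK - Get and Simplify List of Tags". The tagger returns Penn Treebank tags, so an exact match on 'NN' and 'VB' misses plural nouns (NNS), proper nouns (NNP), and inflected verbs (VBD, VBG, VBZ, and so on).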

nn_vb_tagged = [(word,tag) for word, tag in tagged 
                if tag.startswith('NN') or tag.startswith('VB')]
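
As a rough sketch of the difference, the snippet below runs an exact-match filter and a prefix filter over the same hand-written list of (word, tag) pairs. The list is an assumption standing in for the output of nltk.pos_tag, so the example runs without NLTK or its models, and the helper name filter_by_prefix is hypothetical.

```python
# Hand-written stand-in for the output of nltk.pos_tag (an assumption,
# not real tagger output).
tagged = [("The", "DT"), ("dogs", "NNS"), ("chased", "VBD"),
          ("a", "DT"), ("cat", "NN"), ("to", "TO"), ("play", "VB")]


def filter_by_prefix(pairs, prefixes=("NN", "VB")):
    """Keep (word, tag) pairs whose tag starts with one of the prefixes."""
    # str.startswith accepts a tuple and returns True if any prefix matches.
    prefixes = tuple(prefixes)
    return [(word, tag) for word, tag in pairs if tag.startswith(prefixes)]


# Exact matching keeps only the base forms.
exact = [(word, tag) for word, tag in tagged if tag in ("NN", "VB")]
# Prefix matching also keeps plural nouns and inflected verbs.
by_prefix = filter_by_prefix(tagged)

print(exact)      # [('cat', 'NN'), ('play', 'VB')]
print(by_prefix)  # [('dogs', 'NNS'), ('chased', 'VBD'), ('cat', 'NN'), ('play', 'VB')]
```

Passing a tuple to startswith does the same job as chaining the two startswith calls with or, and it makes it easier to add further prefixes later.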
answered 2015-07-05T19:12:43.863
1

You can try this:

import nltk
from nltk.tokenize import word_tokenize

example = "This is a sample sentence, showing off the stop words filtration.!"
word_tokens = word_tokenize(example)
pos = nltk.pos_tag(word_tokens)
# Tags to keep; matching is exact, so 'NN' does not cover 'NNS' or 'NNP'.
selective_pos = ['NN', 'VB']
selective_pos_words = []
for word, tag in pos:
    if tag in selective_pos:
        selective_pos_words.append((word, tag))
print(selective_pos_words)

By adding the parts of speech you want to the list selective_pos, you can select whichever words you like.
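
A minimal sketch of that idea, wrapping the filter in a function so the tag list becomes a parameter. The function name select_words and the tagged pairs are hypothetical; the pairs are written by hand to stand in for nltk.pos_tag output, so the snippet runs without NLTK.

```python
def select_words(pairs, wanted_tags):
    """Return the words whose tag is exactly one of wanted_tags."""
    wanted = set(wanted_tags)  # set membership keeps the lookup cheap
    return [word for word, tag in pairs if tag in wanted]


# Hand-written (word, tag) pairs, assumed for illustration only.
tagged = [("This", "DT"), ("is", "VBZ"), ("a", "DT"),
          ("sample", "JJ"), ("sentence", "NN")]

nouns_and_verbs = select_words(tagged, ["NN", "VB"])
with_adjectives = select_words(tagged, ["NN", "VB", "JJ"])

print(nouns_and_verbs)  # ['sentence']
print(with_adjectives)  # ['sample', 'sentence']
```

Note that "is" is dropped in both calls because its tag here is VBZ, not VB; add 'VBZ' to the list, or match on tag prefixes, if inflected verbs should be included.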

answered 2019-10-06T10:42:52.083