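I'm using spaCy 2.0 with the English model. I find the nouns and the "non-nouns" to parse the input, then pair each non-noun with the noun to build the desired output.
Your input: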
s = ["thai iced tea",
     "spicy fried chicken",
     "sweet chili pork",
     "thai chicken curry"]
spaCy solution:
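import spacy

# 'en' is the spaCy 2.x shortcut; on spaCy 3+ load 'en_core_web_sm' instead
nlp = spacy.load('en')

def noun_notnoun(phrase):
    doc = nlp(phrase)  # create the spaCy Doc object
    token_not_noun = []
    notnoun_noun_list = []
    noun = None  # stays None if the tagger finds no noun

    for item in doc:
        if item.pos_ != "NOUN":  # separate nouns and non-nouns
            token_not_noun.append(item.text)
        if item.pos_ == "NOUN":
            noun = item.text  # the last noun in the phrase wins

    if noun is None:
        return notnoun_noun_list  # nothing to pair the non-nouns with

    for notnoun in token_not_noun:
        notnoun_noun_list.append(notnoun + " " + noun)

    return notnoun_noun_list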
Calling the function:
for phrase in s:
    print(noun_notnoun(phrase))
Result:
['thai tea', 'iced tea']
['spicy chicken', 'fried chicken']
['sweet pork', 'chili pork']
['thai chicken', 'curry chicken']
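The pairing step itself can be tried without loading a model. The following is a minimal sketch, not part of the original answer: `pair_modifiers_with_noun` is a hypothetical helper, and the part-of-speech tags are written by hand for illustration, so a real tagger may assign different ones. It pairs every non-noun with the last noun in the phrase and returns an empty list when the phrase has no noun.

```python
from typing import List, Tuple


def pair_modifiers_with_noun(tagged: List[Tuple[str, str]]) -> List[str]:
    """Pair every non-noun token with the last noun in the phrase.

    `tagged` is a list of (text, pos) tuples; the tags are assumed to
    follow the coarse-grained labels ("NOUN", "ADJ", "VERB", ...).
    """
    nouns = [text for text, pos in tagged if pos == "NOUN"]
    if not nouns:
        return []  # no noun to attach the other tokens to
    head = nouns[-1]  # the last noun in the phrase is used
    return [f"{text} {head}" for text, pos in tagged if pos != "NOUN"]


# Hand-written tags (an assumption, not tagger output)
tagged_phrase = [("thai", "ADJ"), ("iced", "VERB"), ("tea", "NOUN")]
print(pair_modifiers_with_noun(tagged_phrase))  # ['thai tea', 'iced tea']
```

Keeping the pairing logic separate from the tagger makes it easy to see how the output changes when a token is tagged differently, which is what happens with "curry" in the last result.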