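The code below is supposed to iterate over the text column of a tweet dataset. For each word that is not in the stop-word list, it should correct the spelling, lemmatize the word, and then stem it. It does not work correctly. Can you help me fix it? The error is shown in the attached screenshot.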

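import nltk
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer, WordNetLemmatizer
from spellchecker import SpellChecker

pstem = PorterStemmer()
lem = WordNetLemmatizer()
spell = SpellChecker()
stop_words = set(stopwords.words('english'))  # a set makes membership tests fast

for i in df.index:  # use the actual index labels, which may not be 0..len-1
    text = df.at[i, 'text']
    tokens = nltk.word_tokenize(text)
    # NLTK's stop words are lowercase, so compare case-insensitively
    tokens = [word for word in tokens if word.lower() not in stop_words]
    for j in range(len(tokens)):
        # spell.correction may return None when it has no candidate
        # (seen in newer pyspellchecker versions); keep the original token then
        tokens[j] = spell.correction(tokens[j]) or tokens[j]
        tokens[j] = lem.lemmatize(tokens[j])
        tokens[j] = pstem.stem(tokens[j])
    tokens_sent = ' '.join(tokens)
    df.at[i, "text"] = tokens_sent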
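
The screenshot with the error is not available here, so the following is only a sketch under an assumption: that the failure comes from the spell checker returning None for a word it cannot correct, which then breaks the lemmatizer. The helper name `normalize_tokens` and its parameters are hypothetical and not part of the original code. It takes the correction, lemmatization, and stemming steps as plain callables so the per-token logic can be checked in isolation.

```python
from typing import Callable, Iterable, List, Optional


def normalize_tokens(
    tokens: Iterable[str],
    stop_words: Iterable[str],
    correct: Callable[[str], Optional[str]],
    lemmatize: Callable[[str], str],
    stem: Callable[[str], str],
) -> List[str]:
    """Drop stop words, then spell-correct, lemmatize, and stem each token.

    Hypothetical helper: `correct` is allowed to return None, in which
    case the original token is kept unchanged.
    """
    stop_set = {w.lower() for w in stop_words}
    result = []
    for token in tokens:
        if token.lower() in stop_set:
            continue  # skip stop words entirely
        fixed = correct(token) or token  # fall back if correction is None
        result.append(stem(lemmatize(fixed)))
    return result
```

With the objects from the question, this could presumably be applied per row as `df['text'].apply(lambda t: ' '.join(normalize_tokens(nltk.word_tokenize(t), stop_words, spell.correction, lem.lemmatize, pstem.stem)))`, which also avoids indexing rows by position.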