我正在尝试实现朴素贝叶斯算法来对新闻报纸标题进行情感分析。我为此目的使用 TextBlob,我发现很难删除诸如“a”、“the”、“in”等停用词。下面是我在 python 中的代码片段:
from textblob.classifiers import NaiveBayesClassifier
from textblob import TextBlob
test = [
("11 bonded labourers saved from shoe firm", "pos"),
("Scientists greet Abdul Kalam after the successful launch of Agni on May 22, 1989","pos"),
("Heavy Winter Snow Storm Lashes Out In Northeast US", "neg"),
("Apparent Strike On Gaza Tunnels Kills 2 Palestinians", "neg")
]
with open('input.json', 'r') as fp:
cl = NaiveBayesClassifier(fp, format="json")
print(cl.classify("Oil ends year with biggest gain since 2009")) # "pos"
print(cl.classify("25 dead in Baghdad blasts")) # "neg"