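I'm using a Naive Bayes classifier in Python for text classification. Is there any smoothing method in Python NLTK to avoid zero probabilities for unseen words? Thanks in advance!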
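I'd suggest replacing all low-frequency words (in particular, those that occur only once) with <unseen>, then training the classifier on that data. At classification time, query the model with <unseen> for any word that does not appear in the training data.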
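As a rough sketch of that preprocessing step, something like the following might work. It uses only the standard library; the names `build_vocab`, `map_unseen`, the `min_count` threshold, and the toy documents are all made up for illustration and are not part of NLTK.

```python
from collections import Counter

UNSEEN = "<unseen>"  # placeholder token for rare and unknown words


def build_vocab(docs, min_count=2):
    """Return the set of words occurring at least min_count times in training."""
    counts = Counter(tok for tokens, _label in docs for tok in tokens)
    return {tok for tok, c in counts.items() if c >= min_count}


def map_unseen(tokens, vocab):
    """Replace every word outside the vocabulary with the placeholder token."""
    return [tok if tok in vocab else UNSEEN for tok in tokens]


# Toy training data (hypothetical): lists of tokens with a label.
train_docs = [
    (["good", "great", "film"], "pos"),
    (["good", "fun", "film"], "pos"),
    (["bad", "boring", "film"], "neg"),
    (["bad", "dull", "film"], "neg"),
]

vocab = build_vocab(train_docs)

# Training side: words seen only once become <unseen>.
train_mapped = [(map_unseen(tokens, vocab), label) for tokens, label in train_docs]

# Classification side: a word never seen in training ("amazing") also becomes <unseen>.
test_mapped = map_unseen(["good", "amazing", "film"], vocab)
```

The mapped token lists can then be turned into whatever feature representation the classifier expects. Because <unseen> occurs in the training data, the model has an estimate for it, so an unknown word at classification time no longer forces a zero probability.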