我正在尝试对客户反馈进行分类,并在 python 中运行了一个 LDA,并获得了 10 个主题的以下输出:
(0, u'0.559*"delivery" + 0.124*"area" + 0.018*"mile" + 0.016*"option" + 0.012*"partner" + 0.011*"traffic" + 0.011*"hub" + 0.011*"thanks" + 0.010*"city" + 0.009*"way"')
(1, u'0.397*"package" + 0.073*"address" + 0.055*"time" + 0.047*"customer" + 0.045*"apartment" + 0.037*"delivery" + 0.031*"number" + 0.026*"item" + 0.021*"support" + 0.018*"door"')
(2, u'0.190*"time" + 0.127*"order" + 0.113*"minute" + 0.075*"pickup" + 0.074*"restaurant" + 0.031*"food" + 0.027*"support" + 0.027*"delivery" + 0.026*"pick" + 0.018*"min"')
(3, u'0.072*"code" + 0.067*"gps" + 0.053*"map" + 0.050*"street" + 0.047*"building" + 0.043*"address" + 0.042*"navigation" + 0.039*"access" + 0.035*"point" + 0.028*"gate"')
(4, u'0.434*"hour" + 0.068*"time" + 0.034*"min" + 0.032*"amount" + 0.024*"pay" + 0.019*"gas" + 0.018*"road" + 0.017*"today" + 0.016*"traffic" + 0.014*"load"')
(5, u'0.245*"route" + 0.154*"warehouse" + 0.043*"minute" + 0.039*"need" + 0.039*"today" + 0.026*"box" + 0.025*"facility" + 0.025*"bag" + 0.022*"end" + 0.020*"manager"')
(6, u'0.371*"location" + 0.110*"pick" + 0.097*"system" + 0.040*"im" + 0.038*"employee" + 0.022*"evening" + 0.018*"issue" + 0.015*"request" + 0.014*"while" + 0.013*"delivers"')
(7, u'0.182*"schedule" + 0.181*"please" + 0.059*"morning" + 0.050*"application" + 0.040*"payment" + 0.026*"change" + 0.025*"advance" + 0.025*"slot" + 0.020*"date" + 0.020*"tomorrow"')
(8, u'0.138*"stop" + 0.110*"work" + 0.062*"name" + 0.055*"account" + 0.046*"home" + 0.043*"guy" + 0.030*"address" + 0.026*"city" + 0.025*"everything" + 0.025*"feature"')
有没有办法自动标记它们?我确实有一个 csv 文件,其中手动标记了反馈,但我不想自己提供这些标签。我希望模型创建标签。可能吗?