I have texts of up to 140 characters and a set of keywords. I want to write an algorithm that computes a percentage match between a text and the keywords, in order to classify the text as an announcement of an IT event.

For example:

Text: "Tomorrow will take place our weekly event which about computer. We will discuss about how to implement algorithms. This will be very great."

Keywords: "event, computer, database, Software, algorithms"

Here 3 of the 5 keywords (event, computer, algorithms) appear in the text, so the match is 60%.
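To make the calculation concrete, here is a minimal sketch of that count in Python. The function name `keyword_match_percentage` and the tokenization rule (lowercase, split on anything that is not a letter or digit) are assumptions made for illustration; they are not part of any library.

```python
import re


def keyword_match_percentage(text, keywords):
    """Return the percentage of keywords that occur as whole words in text.

    Hypothetical helper: matching is case-insensitive and exact, so
    "Software" matches "software" but "computers" does not match "computer".
    """
    # Lowercase the text and collect its distinct word tokens.
    tokens = set(re.findall(r"[a-z0-9]+", text.lower()))
    # Normalize the keywords the same way and drop empty entries.
    wanted = {k.strip().lower() for k in keywords if k.strip()}
    if not wanted:
        return 0.0
    matched = wanted & tokens
    return 100.0 * len(matched) / len(wanted)


text = ("Tomorrow will take place our weekly event which about computer. "
        "We will discuss about how to implement algorithms. "
        "This will be very great.")
keywords = "event, computer, database, Software, algorithms".split(",")

print(keyword_match_percentage(text, keywords))  # 60.0
```

Because the comparison is exact, a text that says "computers" or "databases" would not count toward the score, which is one reason this measure may be less accurate than it first appears.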

Does it make sense to count the matching words and compare that to the number of keywords? Is this approach accurate? Has anyone dealt with something like this before?

Thanks for your support.


1 Answer




There are plenty of algorithms and libraries for text classification. LingPipe is a decent Java library that may help you.

If you are interested in using a library, you can find a good overview in the top answer to this Quora question.

Answered 2015-12-23T09:54:30.207