I have a large collection of articles (80,000), and I want to extract those that are about one topic. Is there a Python library or script to which I can give a manually chosen sample of articles about, say, Topic A, so that it extracts the articles about Topic A from the archive by comparing the words used and their frequencies?
I have read about the Dunning method, but is there a ready-made script that I can use, preferably in Python?
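To make the idea concrete, here is a rough sketch of the kind of comparison I mean, not a finished solution. It assumes the simplified two-term form of the log-likelihood (G2) statistic that is often used for corpus comparison. The function names `tokenize`, `dunning_keywords` and `score_article`, the `top_n` cutoff, and the idea of ranking articles by their average keyword weight are all my own placeholders, not taken from any existing library.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Very naive tokenizer: lowercase runs of letters and apostrophes.
    return re.findall(r"[a-z']+", text.lower())


def dunning_keywords(sample_docs, reference_docs, top_n=50):
    """Return words over-represented in sample_docs, weighted by G2."""
    a = Counter(w for d in sample_docs for w in tokenize(d))
    b = Counter(w for d in reference_docs for w in tokenize(d))
    n_a, n_b = sum(a.values()), sum(b.values())
    scores = {}
    for word, k_a in a.items():
        k_b = b.get(word, 0)
        # Expected counts if the word were equally frequent in both corpora.
        e_a = n_a * (k_a + k_b) / (n_a + n_b)
        e_b = n_b * (k_a + k_b) / (n_a + n_b)
        if k_a <= e_a:
            continue  # keep only words that are MORE frequent in the sample
        g2 = 2 * (k_a * math.log(k_a / e_a)
                  + (k_b * math.log(k_b / e_b) if k_b else 0.0))
        scores[word] = g2
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return dict(ranked[:top_n])


def score_article(text, keywords):
    """Average keyword weight per token (assumed scoring rule)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(keywords.get(t, 0.0) for t in tokens) / len(tokens)
```

The sample of Topic A articles would go in as `sample_docs` and the rest of the archive (or a random part of it) as `reference_docs`; every article in the archive would then be scored with `score_article` and the highest-scoring ones kept. Where to put the cutoff is something I would still have to work out by hand.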
Thanks