I want to do phishing/spam detection on Twitter. I have collected about 500,000 tweets through Twitter's Streaming API. I then extract the URLs that appear in these tweets and submit them to two blacklists, Google Safe Browsing and PhishTank, to get a basic judgment of whether each one is a phishing link. The problem is that, according to my experimental results, I can't get enough samples of phishing tweets this way. Is there any existing tweet dataset that has already been labeled as malicious/normal, so that I can carry on with my work?
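For context, the labeling step described above can be sketched roughly as follows. This is only a minimal illustration, not the actual pipeline: the function names `extract_urls` and `label_tweets` are hypothetical, the regular expression is a simplification, and the blacklist lookup is replaced by a plain in-memory set standing in for the responses from Google Safe Browsing and PhishTank.

```python
import re

# Simplified pattern (an assumption): any http(s) URL up to the next whitespace.
URL_PATTERN = re.compile(r"https?://[^\s]+")


def extract_urls(tweet_text):
    """Return the http(s) URLs found in a tweet, with trailing punctuation stripped."""
    return [u.rstrip(".,;:!?)\"'") for u in URL_PATTERN.findall(tweet_text)]


def label_tweets(tweets, blacklist):
    """Label each tweet 'phishing' if any of its URLs is in the blacklist.

    `blacklist` is a local set of known-bad URLs; it stands in for the
    remote blacklist services, which are not called here.
    """
    labels = []
    for text in tweets:
        urls = extract_urls(text)
        hit = any(u in blacklist for u in urls)
        # A miss only means "not blacklisted", so it is labeled 'unknown'
        # rather than 'normal'.
        labels.append((text, urls, "phishing" if hit else "unknown"))
    return labels


if __name__ == "__main__":
    sample = [
        "Check this http://bad.example/login now!",
        "hello world",
    ]
    for row in label_tweets(sample, {"http://bad.example/login"}):
        print(row)
```

Note that shortened URLs would likely need to be expanded before the lookup, which this sketch does not attempt.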