twitter - Training data for phishing or spam tweets

Question

I want to do phishing/spam detection on Twitter. I've collected about 500,000 tweets through Twitter's Streaming API. I then extract the URLs that appear in these tweets and submit them to two blacklists, Google Safe Browsing and PhishTank, to get a basic judgment of whether each one is a phishing link. The problem is that, according to my experimental results, I can't get enough samples of phishing tweets this way. Is there any existing tweet dataset that has already been labeled as malicious/normal, so that I can carry on with my work?

score 0 · Accepted Answer

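URL blacklists do not work well here because they lag behind: a link is often blacklisted only some time after it has been tweeted. You can use suspended accounts as labels instead, but note that not all suspended accounts are phishing accounts.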
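As a rough illustration of the suspended-account idea, here is a minimal sketch. It assumes each tweet is a dict in the classic Streaming API JSON layout (`user.id_str`, `entities.urls[].expanded_url`) and that you have already built a set of suspended user IDs some other way, for example by re-checking the accounts some time after collection. The function names `extract_urls` and `weak_label` and the label strings are made up for this example.

```python
from typing import Dict, Iterable, Iterator, List, Set, Tuple


def extract_urls(tweet: Dict) -> List[str]:
    """Return the URLs in a tweet, preferring expanded_url over the short url."""
    found = []
    for entry in tweet.get("entities", {}).get("urls", []):
        link = entry.get("expanded_url") or entry.get("url")
        if link:
            found.append(link)
    return found


def weak_label(
    tweets: Iterable[Dict], suspended_ids: Set[str]
) -> Iterator[Tuple[str, List[str], str]]:
    """Yield (text, urls, label) for every tweet that contains at least one URL.

    The label is only a noisy proxy: accounts get suspended for many
    reasons, so "suspect" does not necessarily mean phishing.
    """
    for tweet in tweets:
        urls = extract_urls(tweet)
        if not urls:
            continue  # tweets without links are irrelevant for URL-based detection
        user_id = tweet.get("user", {}).get("id_str")
        label = "suspect" if user_id in suspended_ids else "normal"
        yield tweet.get("text", ""), urls, label


# Hypothetical sample data, only to show the expected input shape.
sample_tweets = [
    {"text": "free prize", "user": {"id_str": "42"},
     "entities": {"urls": [{"url": "http://t.co/a", "expanded_url": "http://example.com/win"}]}},
    {"text": "my blog", "user": {"id_str": "7"},
     "entities": {"urls": [{"url": "http://t.co/b", "expanded_url": "http://example.org/post"}]}},
    {"text": "no link here", "user": {"id_str": "42"}, "entities": {"urls": []}},
]
labeled = list(weak_label(sample_tweets, suspended_ids={"42"}))
```

Because the labels are noisy, it is probably worth manually checking a sample of the "suspect" tweets before training on them.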
