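I'm trying to retrieve a single integer: the number of tweets containing a given keyword over the last 24 hours. So if the keyword is "traffic", I want to count the tweets from the last 24 hours that contain the word "traffic" and store that count as a number, which is then used to generate something else.

Right now I can use query.setCount to request a specific number and retrieve an arbitrary number (1024) of tweets from the last 24 hours, but I can't tell whether those are all the tweets from that period. What I really want is the number; I don't need the actual text of the tweets or any other information. The number should also update as new tweets come in.

How can I do this?

Here's my getNewTweets method so far: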

    void getNewTweets() {
      SimpleDateFormat sdf = new SimpleDateFormat("y-M-d");

      Calendar calendar = Calendar.getInstance();
      calendar.add(Calendar.HOUR_OF_DAY, -24);

      String yesterday = sdf.format(calendar.getTime());

      Query query = new Query("traffic");
      query.setSince(yesterday);
      int numberOfTweets = 1024;
      long lastID = Long.MAX_VALUE;
      while (tweets.size() < numberOfTweets) {
        if (numberOfTweets - tweets.size() > 100)
          query.setCount(100);
        else
          query.setCount(numberOfTweets - tweets.size());
        try {
          QueryResult result = twitter.search(query);
          tweets.addAll(result.getTweets());
          println("Gathered " + tweets.size() + " tweets");
          for (Status t : tweets)
            if (t.getId() < lastID) lastID = t.getId();
        }
        catch (TwitterException te) {
          println("Couldn't connect: " + te);
        }
        query.setMaxId(lastID - 1);
      }
    }

2 Answers

That said (see @mbaxi's answer), I think that for a word that isn't very popular, the Streaming API would be suitable for this task. I ran this code for 5 minutes using the very popular word "love" and got no warnings; I also received about 25,000 tweets containing "love". The timer is very simple and imprecise, and is there just for the sake of the example. Although you said you don't want the text, it is being printed to the console.

Here's an example:

import twitter4j.util.*;
import twitter4j.*;
import twitter4j.management.*;
import twitter4j.api.*;
import twitter4j.conf.*;
import twitter4j.json.*;
import twitter4j.auth.*;
int startTime;
int tweetNumber;
PFont f ;
String theWord = "love";


TwitterStream twitterStream;

void setup() {     
  size(800, 100);    
  background(0); 
  f  = createFont("SourceCodePro-Regular", 25);
  textFont(f);
  openTwitterStream();
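  startTime = millis();  // milliseconds since the sketch started
}  


void draw() {     
  background(0);
  // elapsed whole minutes; unlike minute(), millis() does not wrap around at the top of the hour
  int passedTime = (millis() - startTime) / 60000;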
  text("Received " + nf(tweetNumber, 5) + " tweets with the word: " + theWord, 30, height - 50); 
  text("in last " +  nf(passedTime, 3) + " minutes", 30, height - 25);
}  



// Stream it
void openTwitterStream() {  

  ConfigurationBuilder cb = new ConfigurationBuilder();  
  cb.setOAuthConsumerKey("-----FILL-----");
  cb.setOAuthConsumerSecret("-----FILL-----");
  cb.setOAuthAccessToken("-----FILL-----");
  cb.setOAuthAccessTokenSecret("-----FILL-----"); 

  // assign to the sketch-level field instead of shadowing it with a local variable
  twitterStream = new TwitterStreamFactory(cb.build()).getInstance();

  FilterQuery filtered = new FilterQuery();

  // if you enter keywords here it will filter, otherwise it will sample
  String keywords[] = {
    theWord
  };

  filtered.track(keywords);

  twitterStream.addListener(listener);

  if (keywords.length==0) {
    // sample() method internally creates a thread which manipulates TwitterStream 
    twitterStream.sample(); // and calls these adequate listener methods continuously.
  } else { 
    twitterStream.filter(filtered);
  }
  println("connected");
} 


// Implementing StatusListener interface
StatusListener listener = new StatusListener() {

  //@Override
  public void onStatus(Status status) {
    tweetNumber++;
    System.out.println("@" + status.getUser().getScreenName() + " - " + status.getText());
  }

  //@Override
  public void onDeletionNotice(StatusDeletionNotice statusDeletionNotice) {
    System.out.println("Got a status deletion notice id:" + statusDeletionNotice.getStatusId());
  }

  //@Override
  public void onTrackLimitationNotice(int numberOfLimitedStatuses) {
    System.out.println("Got track limitation notice:" + numberOfLimitedStatuses);
  }

  //@Override
  public void onScrubGeo(long userId, long upToStatusId) {
    System.out.println("Got scrub_geo event userId:" + userId + " upToStatusId:" + upToStatusId);
  }

  //@Override
  public void onStallWarning(StallWarning warning) {
    System.out.println("Got stall warning:" + warning);
  }

  //@Override
  public void onException(Exception ex) {
    ex.printStackTrace();
  }
};
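
The listener above only ever counts upward, while the question asks for the count over the last 24 hours. One way to get that is a sliding window over arrival times. The following is a minimal sketch in plain Java, not part of Twitter4J; the class name `RollingTweetCounter` and its methods are hypothetical and exist only for illustration. It stores one timestamp per received tweet and discards those older than the window.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical helper (not part of Twitter4J): counts events seen within a sliding time window.
class RollingTweetCounter {
    private final long windowMillis;
    // Arrival times in milliseconds, oldest first.
    private final Deque<Long> timestamps = new ArrayDeque<Long>();

    RollingTweetCounter(long windowMillis) {
        this.windowMillis = windowMillis;
    }

    // Record one tweet that arrived at nowMillis.
    // Synchronized because stream callbacks run on a different thread than draw().
    synchronized void record(long nowMillis) {
        timestamps.addLast(nowMillis);
        evict(nowMillis);
    }

    // Number of tweets recorded within the window ending at nowMillis.
    synchronized int count(long nowMillis) {
        evict(nowMillis);
        return timestamps.size();
    }

    // Drop timestamps that are a full window (or more) older than nowMillis.
    private void evict(long nowMillis) {
        while (!timestamps.isEmpty() && timestamps.peekFirst() <= nowMillis - windowMillis) {
            timestamps.removeFirst();
        }
    }
}
```

In the sketch above, this could be wired in by creating `new RollingTweetCounter(24L * 60 * 60 * 1000)`, calling `record(System.currentTimeMillis())` inside `onStatus`, and displaying `count(System.currentTimeMillis())` in `draw()`. It only knows about tweets received while the stream was connected, so the number is meaningful only after the sketch has been running for 24 hours. For a very high-volume keyword, keeping per-minute buckets instead of one entry per tweet would probably use far less memory.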
answered 2014-10-18T16:19:54.667

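You can't determine the exact number of tweets for a particular filter or search query, because both APIs are rate limited. To get all tweet data you would have to use the firehose, which is a paid service.

Here is an excerpt from the Twitter developer documentation: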

Before getting involved, it’s important to know that the Search API is focused on relevance and not completeness. This means that some Tweets and users may be missing from search results. If you want to match for completeness you should consider using a Streaming API instead

Please read the following link to learn more about the Streaming API's rate limits: https://twittercommunity.com/t/how-much-data-returned-when-using-streaming-api/8407
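
To make the rate limit visible in practice: when a filtered stream matches more tweets than it is allowed to deliver, Twitter4J reports this through `StatusListener.onTrackLimitationNotice(int numberOfLimitedStatuses)`. The sketch below is a rough illustration in plain Java, with a hypothetical class name `StreamCountEstimate`. It assumes the notice value is the cumulative number of undelivered tweets since the connection was opened, which is my understanding of the limit notices but should be checked against the current documentation.

```java
// Hypothetical helper (not part of Twitter4J): tracks how many tweets were delivered
// and how many the stream reported as withheld by rate limiting.
class StreamCountEstimate {
    private long delivered;    // tweets actually received via onStatus
    private long undelivered;  // highest cumulative value seen in limit notices (assumption: cumulative)

    // Call from onStatus.
    synchronized void onDelivered() {
        delivered++;
    }

    // Call from onTrackLimitationNotice. Notices may arrive out of order,
    // so keep the maximum rather than the latest value.
    synchronized void onLimitNotice(int numberOfLimitedStatuses) {
        undelivered = Math.max(undelivered, numberOfLimitedStatuses);
    }

    // True while no limit notice has been seen, i.e. the delivered count was not truncated.
    synchronized boolean isComplete() {
        return undelivered == 0;
    }

    // Delivered tweets plus the reported undelivered ones; an estimate, not an exact figure.
    synchronized long estimatedTotal() {
        return delivered + undelivered;
    }
}
```

If `isComplete()` stays true for the whole period, the delivered count was not cut by the stream's rate limit; once it turns false, the total is only an estimate, which is consistent with the point above that an exact number is not available without the firehose.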

answered 2014-10-18T08:08:16.217