
I'm trying to create a nested dictionary with the following format:

{person1:
         {tweet1 that person1 wrote: times that tweet was retweeted,
          tweet2 that person1 wrote: times that tweet was retweeted},
 person2:
         {tweet1 that person2 wrote: times that tweet was retweeted, ...},
 ...
 }

I'm trying to create it from the following data structures. The following are truncated versions of the real ones.

 rt_sources =[u'SaleskyKATU', u'johnfaye', u'@anisabartes']
 retweets = [[], 
  [u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT',u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT', u'Stay safe #nyc #sandy http://t.co/TisObxxT'], []]
 annotated_retweets = {u'Stay safe #nyc #sandy http://t.co/TisObxxT':26}
 ''' 
     Key is a tweet from set(retweets) 
     Value is the frequency of each key in retweets
 '''

 for_Nick = {person:dict(tweet_record,[annotated_retweets[tr] for tr in tweet_record]) 
                                    for person,tweet_record in zip(rt_sources,retweets)}

Neither this SO question nor this one seems to apply.


3 Answers


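It seems that "person" and "tweet" are going to be objects with their own data and functions. You can express that idea logically by wrapping things up in classes. For example: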

class tweet(object):
    def __init__(self, text):
        self.text = text
        self.retweets = 0
    def retweet(self):
        self.retweets += 1
    def __repr__(self):
        return "(%i)" % (self.retweets)
    def __hash__(self):
        return hash(self.text)

class person(object):
    def __init__(self, name):
        self.name = name
        self.tweets = dict()

    def __repr__(self):
        return "%s : %s" % (self.name, self.tweets)

    def new_tweet(self, text):
        self.tweets[text] = tweet(text)

    def retweet(self, text):
        self.tweets[text].retweet()

M = person("mac389")
M.new_tweet('foo')
M.new_tweet('bar')
M.retweet('foo')
M.retweet('foo')

print(M)

This prints:

mac389 : {'foo': (2), 'bar': (0)}

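The advantage here is twofold. First, new data associated with a person or a tweet can be added in an obvious and logical way. Second, you've created a nice user interface (even if you are the only one using it!), which will make life easier in the long run.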

Answered 2012-11-27T15:12:46.527

Explicit is better than implicit, as the Zen of Python puts it.

for_Nick = {}
for person, tweets in zip(rt_sources, retweets):
    if person not in for_Nick:
        for_Nick[person] = {}
        for tweet in set(tweets):
            frequency = annotated_retweets[tweet]
            for_Nick[person][tweet] = frequency
    else:  # Somehow person already in dictionary <-- Shouldn't happen
        for tweet in tweets:
            if tweet in for_Nick[person]:
                current_frequency = for_Nick[person][tweet]
                incoming_frequency = annotated_retweets[tweet]
                for_Nick[person][tweet] = current_frequency + incoming_frequency
            else:  # Person is already there but he said something new
                frequency = annotated_retweets[tweet]
                for_Nick[person][tweet] = frequency

There may be a more elegant form.
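
One possibly more compact form, offered only as a sketch: if the per-person count can be taken directly from each person's own list instead of being looked up in annotated_retweets (an assumption, and the numbers will differ whenever the two disagree), collections.Counter does the bookkeeping, including the case where a person shows up twice. The shortened retweets list below is stand-in data, not the real one.

```python
from collections import Counter, defaultdict

rt_sources = [u'SaleskyKATU', u'johnfaye', u'@anisabartes']
tweet = u'Stay safe #nyc #sandy http://t.co/TisObxxT'
# Stand-in data: three copies instead of the full list from the question.
retweets = [[], [tweet] * 3, []]

# Each person maps to a Counter of {tweet text: occurrences in their list}.
for_Nick = defaultdict(Counter)
for person, tweets in zip(rt_sources, retweets):
    # update() adds counts, so a person appearing twice is merged, not overwritten.
    for_Nick[person].update(tweets)

# Convert to plain nested dicts if Counter/defaultdict behaviour is not wanted.
for_Nick = {person: dict(counts) for person, counts in for_Nick.items()}
```

People with no retweets end up with an empty inner dict, which matches what the explicit loop produces.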

Answered 2012-11-27T14:58:31.727

This is probably the dict comprehension you were trying to build:

for_Nick = {person: 
               {tr: annotated_retweets[tr]
                for tr in set(tweet_record)} 
            for person, tweet_record in zip(rt_sources,retweets)}

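You tried to pass a list of keys and a list of values to the dict constructor, whereas the constructor expects a single list (or other iterable) of key-value pairs.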
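
To make the difference concrete, here is a small sketch with made-up data (the names keys and values and their contents are illustrative assumptions, not taken from the question). It shows the two-argument call failing and two ways of supplying key-value pairs instead.

```python
# Illustrative data only; not from the question.
keys = [u'tweet a', u'tweet b']
values = [3, 1]

# dict() takes at most one positional argument, so passing two raises TypeError.
try:
    dict(keys, values)
except TypeError as exc:
    error = exc
else:
    error = None

# zip() pairs each key with its value, which is the shape dict() expects.
paired = dict(zip(keys, values))

# A dict comprehension over the same pairs builds an equal mapping.
comprehended = {k: v for k, v in zip(keys, values)}
```

The comprehension form is the one that nests naturally, since the value expression can itself be another comprehension.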

Answered 2012-11-27T21:02:59.077