3

I have a (fixed) set of keys for which I store a value. I often look up the value for a key and increment or decrement it. A typical dict usage.

x = {'a': 1, 'b': 4, 'c': 3}
x['a'] += 1

However, just as often as I increment or decrement values, I also need to know the key with the i-th largest (or smallest) value. Of course, I can sort:

s = sorted(x, key=lambda k:(x[k],k))
s[1] == 'c'

The problem is that sorting every time seems rather expensive, especially because I only change one item between sorts. I feel that another data structure would be better suited to this. A tree, perhaps?

4

5 Answers

2

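You can use blist's sorteddict to keep the values in order. Here is a quick implementation of a dictionary that, when iterated, returns its keys in order of their values (not tested in any depth):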

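try:
    from collections.abc import MutableMapping  # Python 3.3+
except ImportError:
    from collections import MutableMapping  # Python 2
from blist import sorteddict

class ValueSortedDict(MutableMapping):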
    def __init__(self, data):
        self._dict = {}
        self._sorted = sorteddict()
        self.update(data)

    def __getitem__(self, key):
        return self._dict[key]

    def __setitem__(self, key, value):
        # remove old value from sorted dictionary
        if key in self._dict:
            self.__delitem__(key)
        # update structure with new value
        self._dict[key] = value
        try:
            keys = self._sorted[value]
        except KeyError:
            self._sorted[value] = set([key])
        else:
            keys.add(key)            

    def __delitem__(self, key):
        value = self._dict.pop(key)
        keys = self._sorted[value]
        keys.remove(key)
        if not keys:
            del self._sorted[value]

    def __iter__(self):
        for value, keys in self._sorted.items():
            for key in keys:
                yield key

    def __len__(self):
        return len(self._dict)

x = ValueSortedDict(dict(a=1, b=4, c=3))
x['a'] += 1
print(list(x.items()))
x['a'] += 10
print(list(x.items()))
x['d'] = 4
print(list(x.items()))

This gives:

[('a', 2), ('c', 3), ('b', 4)]
[('c', 3), ('b', 4), ('a', 12)]
[('c', 3), ('b', 4), ('d', 4), ('a', 12)]
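If installing blist is not an option, the same idea can be sketched with only the standard library: keep a plain dict next to a list of (value, key) pairs that stays sorted, and repair that list with bisect on every update. The class name RankedValues and the method key_at are made up for this illustration. The binary search is O(log n), but inserting into or deleting from a Python list still shifts elements, so each update is O(n) in the worst case. It also assumes keys are comparable with each other, since ties on value are broken by key.

```python
import bisect


class RankedValues:
    """Hypothetical helper: a dict plus a list of (value, key) pairs kept sorted."""

    def __init__(self, data):
        self._values = dict(data)
        # Sort once up front; afterwards every update repairs the order locally.
        self._ranked = sorted((v, k) for k, v in self._values.items())

    def __getitem__(self, key):
        return self._values[key]

    def __setitem__(self, key, value):
        if key in self._values:
            # Binary-search the old (value, key) pair and remove it.
            old = (self._values[key], key)
            del self._ranked[bisect.bisect_left(self._ranked, old)]
        self._values[key] = value
        # Re-insert the new pair at its sorted position.
        bisect.insort(self._ranked, (value, key))

    def key_at(self, i):
        """Key with the i-th smallest value; negative i counts from the largest."""
        return self._ranked[i][1]


x = RankedValues({'a': 1, 'b': 4, 'c': 3})
x['a'] += 1
print(x.key_at(1))   # c (same as s[1] in the question)
print(x.key_at(-1))  # b (the key with the largest value)
```

Lookups by rank are then a single list index, with no sort between updates.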
answered 2013-10-17T15:43:01.140
1

You can use OrderedDict from collections, although it is not available in older Python versions (it was added in Python 2.7).

from collections import OrderedDict

If you have Django installed, you can use django.utils.datastructures.SortedDict.
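One way to read this suggestion, sketched under the assumption that the OrderedDict is built from the items already sorted by value: OrderedDict remembers insertion order, so inserting in value order gives keys that iterate from smallest to largest value. The variable name by_value is only illustrative.

```python
from collections import OrderedDict
from operator import itemgetter

x = {'a': 2, 'b': 4, 'c': 3}

# OrderedDict remembers insertion order, so insert the pairs in value order.
by_value = OrderedDict(sorted(x.items(), key=itemgetter(1)))
print(list(by_value))   # ['a', 'c', 'b']

# Changing a value does not move the key; the order is stale until rebuilt.
by_value['a'] += 10
print(list(by_value))   # ['a', 'c', 'b']
```

The order reflects the values only at construction time, so the mapping has to be rebuilt after an update to stay ranked.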

answered 2013-10-17T15:53:16.210
0

Use the operator module:

import operator

max(x.items(), key=operator.itemgetter(1))[0]

From the documentation:

operator.itemgetter(*items)

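Return a callable object that fetches an item from its operand using the operand's __getitem__() method. If multiple items are specified, it returns a tuple of lookup values.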
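A short illustration of what itemgetter does here, using the dictionary from the question after the increment. The variable names get_value, value_then_key and largest_key are made up for this sketch.

```python
from operator import itemgetter

x = {'a': 2, 'b': 4, 'c': 3}

get_value = itemgetter(1)          # get_value(pair) returns pair[1]
print(get_value(('b', 4)))         # 4

# With several indices, the callable returns a tuple of the looked-up items.
value_then_key = itemgetter(1, 0)
print(value_then_key(('b', 4)))    # (4, 'b')

# Used as a key function, it makes max() compare (key, value) pairs by value.
largest_key = max(x.items(), key=get_value)[0]
print(largest_key)                 # b
```

Note that max() still scans every item on each call, so this finds the largest value but does not avoid the repeated work the question asks about.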

I don't know whether this is the best solution, but it works.

answered 2013-10-17T14:49:41.460
0

Why not use Counter from collections? You can then use Counter.most_common() to get a sorted list.

>>> from collections import Counter
>>> x = Counter({'a': 1, 'b': 4, 'c': 3})
>>> x['a'] += 1
>>> x.most_common()
[('b', 4), ('c', 3), ('a', 2)]
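To get from that list to the key with the i-th largest value, a small wrapper is enough. This is only a sketch: key_of_ith_largest is a hypothetical helper name, and i = 0 is taken to mean the largest value.

```python
from collections import Counter

x = Counter({'a': 1, 'b': 4, 'c': 3})
x['a'] += 1


def key_of_ith_largest(counter, i):
    """Hypothetical helper: key of the i-th largest count (i = 0 is the largest)."""
    # most_common(n) returns the n largest (key, count) pairs, largest first.
    return counter.most_common(i + 1)[i][0]


print(key_of_ith_largest(x, 0))   # b
print(key_of_ith_largest(x, 1))   # c

# The smallest entry is the last element of the full ordering.
print(x.most_common()[-1][0])     # a
```

Each call to most_common() still looks at every item, so this is more convenient than sorting by hand but not cheaper per lookup.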
answered 2013-10-17T14:50:32.420
0

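I think most Python structures will do something similar to what you did in your example. The only thing I can think of to make it more efficient is to keep a sorted list of the keys. That way, you only need to sort on each insertion. With your approach, you have to sort every time you want to access a value by index. Here is an example: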

x = {'a': 1, 'b': 4, 'c': 3}
x['a'] += 1

keyList = sorted(x.keys())

print(x[keyList[1]])  # prints 4

x['e'] = 7
x['j'] = 11
x['d'] = 6
x['h'] = 8

keyList = sorted(x.keys())

print(x[keyList[3]])  # prints 6
print(x[keyList[4]])  # prints 7
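The re-sort after each batch of insertions can be avoided by placing every new key directly at its sorted position. The sketch below does this with bisect.insort from the standard library; the names key_list and insert are illustrative only. As in the example above, the list is ordered by key, not by value.

```python
import bisect

x = {'a': 2, 'b': 4, 'c': 3}
key_list = sorted(x)   # sort once


def insert(key, value):
    """Hypothetical helper: add or update a key while keeping key_list sorted."""
    if key not in x:
        # Binary search for the position, then a single list insertion.
        bisect.insort(key_list, key)
    x[key] = value


for key, value in [('e', 7), ('j', 11), ('d', 6), ('h', 8)]:
    insert(key, value)

print(key_list)         # ['a', 'b', 'c', 'd', 'e', 'h', 'j']
print(x[key_list[3]])   # 6
print(x[key_list[4]])   # 7
```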

Hope that helps.

answered 2013-10-17T18:19:24.310