python - Python: range() as dictionary values

Question

Suppose I have a dictionary like the one below, where each value is the probability of that key appearing in a text.

   dict = {'a':0.66,'b':0.07,'c':0.04 and so on so the values of the dict sum up to one}

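Say I want to build another dictionary whose values are ranges derived from these probabilities. Since range() cannot be used with floats, I first multiply every value by 100 so that it becomes an int. The idea is to replace each value with its range: for example, 'a' would get range(0,66), 'b' range(66,73), 'c' range(73,77), and so on. I tried to do this with the loop below, but it did not work: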

start = 0
end = 0
for k,v in dict.items():
   end+=int(v*100)
   range_dict[k]=range(start,end)
   start+=end

Can anyone help me? I'm going crazy trying to figure out how to do this!

score 3 · Accepted Answer

If you change

start += end

to

start = end

it should work (xrange is used here to make the result more obvious):

>>> d = {'a':0.66,'b':0.07,'c':0.04}
>>> start = 0
>>> end = 0
>>> range_dict = {}
>>> for k,v in d.items():
...    end+=int(v*100)
...    range_dict[k]=xrange(start,end)
...    start=end
... 
>>> range_dict
{'a': xrange(66), 'c': xrange(66, 70), 'b': xrange(70, 77)}
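For readers on Python 3, the same fix can be sketched as follows. This is only an illustration, not part of the original answer: the name `probabilities` is made up here, `range` replaces `xrange`, and `round()` is swapped in for `int()` as a precaution against float truncation.

```python
# Python 3 sketch: range() is already lazy, so xrange is not needed.
probabilities = {'a': 0.66, 'b': 0.07, 'c': 0.04}  # hypothetical name

range_dict = {}
start = 0
for key, prob in probabilities.items():
    # round() instead of int(): int(0.29 * 100) would truncate 28.999... to 28.
    end = start + round(prob * 100)
    range_dict[key] = range(start, end)
    start = end  # the next range begins where this one stopped

print(range_dict)
```

On Python 3.7+, where dictionaries keep insertion order, this prints `{'a': range(0, 66), 'b': range(66, 73), 'c': range(73, 77)}`, which are the ranges asked for in the question.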

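However, if @Satoru.Logic is right in guessing that you want a weighted random choice, there are better ways. Eli Bendersky has written a good overview of the methods for doing this in Python.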
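If a weighted random choice really is the goal, one possible shortcut on Python 3.6 or later is the standard library's `random.choices`. This is a minimal sketch under that assumption, not something the answer itself proposes; `weights_by_key` and `samples` are illustrative names.

```python
import random

weights_by_key = {'a': 0.66, 'b': 0.07, 'c': 0.04}  # hypothetical name

keys = list(weights_by_key)
weights = list(weights_by_key.values())

# The weights are treated as relative, so they do not have to sum to 1.
samples = random.choices(keys, weights=weights, k=1000)

# Count how often each key was drawn; the counts vary from run to run.
for key in keys:
    print(key, samples.count(key))
```

With this approach no range dictionary is needed at all, since the library does the cumulative-weight lookup internally.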

score 0 · Accepted Answer

Shamelessly stolen from the Python 3.3.0 documentation:

random - 9.6.2. Examples and Recipes - contains a weighted distribution algorithm.
itertools.accumulate - contains an accumulate algorithm.
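On Python 3.2 or later, these two pieces can be combined directly, because `itertools.accumulate` is available in the standard library. The following is a rough sketch along those lines; the helper `pick` and the `counts` dictionary are names introduced here for illustration.

```python
import bisect
import itertools
import random

D = {'a': 0.66, 'b': 0.07, 'c': 0.04, 'd': 0.20, 'e': 0.03}

# Split keys and weights, then build the cumulative distribution.
choices, weights = zip(*D.items())
cumdist = list(itertools.accumulate(weights))

def pick():
    """Return one key, chosen with probability proportional to its weight."""
    # Scale a uniform draw to the total weight, then locate its bucket.
    return choices[bisect.bisect(cumdist, random.random() * cumdist[-1])]

# Make 1000 random selections and tally them; counts vary between runs.
counts = {c: 0 for c in choices}
for _ in range(1000):
    counts[pick()] += 1
print(counts)
```

Multiplying by `cumdist[-1]` means the weights do not have to sum to exactly 1.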

The code below is written for 2.X:

import random
import bisect

D = {'a':0.66,'b':0.07,'c':0.04,'d':0.20,'e':0.03}

# This function is in Python 3.2+ itertools module.
def accumulate(iterable):
    'Return running totals'
    # accumulate([1,2,3,4,5]) --> 1 3 6 10 15
    it = iter(iterable)
    total = next(it)
    yield total
    for element in it:
        total = total + element
        yield total

# Extract the weights and build a cumulative distribution.
choices, weights = zip(*D.items())
cumdist = list(accumulate(weights))

# Make 1000 random selections
L = [choices[bisect.bisect(cumdist, random.random() * cumdist[-1])]
     for _ in xrange(1000)]

# Display the results
for c in sorted(D.keys()):
    print '{} {:3d}'.format(c,L.count(c))

Output:
