创建一个总和为 X(例如 X=1000)的随机向量非常简单:
import random
def RunFloat():
Scalar = 1000
VectorSize = 30
RandomVector = [random.random() for i in range(VectorSize)]
RandomVectorSum = sum(RandomVector)
RandomVector = [Scalar*i/RandomVectorSum for i in RandomVector]
return RandomVector
RunFloat()
上面的代码创建了一个向量,其值为浮点数,总和为 1000。
我很难创建一个简单的函数来创建一个向量,其值为整数且总和为 X(例如 X=1000*30)
import random
def RunInt():
LowerBound = 600
UpperBound = 1200
VectorSize = 30
RandomVector = [random.randint(LowerBound,UpperBound) for i in range(VectorSize)]
RandomVectorSum = 1000*30
#Sanity check that our RandomVectorSum is sensible/feasible
if LowerBound*VectorSize <= RandomVectorSum and RandomVectorSum <= UpperBound*VectorSum:
if sum(RandomVector) == RandomVectorSum:
return RandomVector
else:
RunInt()
有没有人有任何改进这个想法的建议?我的代码可能永远不会完成或遇到递归深度问题。
编辑(2012 年 7 月 9 日)
感谢 Oliver、mgilson 和 Dougal 的投入。我的解决方案如下所示。
- Oliver 对多项分布的想法很有创意
- 简而言之,(1)很可能比其他解决方案更能输出某些解决方案。Dougal 通过大数定律的简单测试/反例证明了多项解空间分布不均匀或不正态。Dougal 还建议使用 numpy 的多项式函数,它为我省去了很多麻烦、痛苦和头痛。
- 为了克服 (2) 的输出问题,我使用 RunFloat() 来给出更均匀的分布(我没有对此进行测试,所以它只是一个表面的外观)。与(1)相比,这有多大不同?我真的不知道副手。不过对我来说已经够用了。
- 再次感谢 mgilson 提供不使用 numpy 的替代方法。
这是我为此编辑所做的代码:
编辑#2(2012 年 7 月 11 日)
我意识到正态分布没有正确实现,我已将其修改为以下内容:
import random
def RandFloats(Size):
Scalar = 1.0
VectorSize = Size
RandomVector = [random.random() for i in range(VectorSize)]
RandomVectorSum = sum(RandomVector)
RandomVector = [Scalar*i/RandomVectorSum for i in RandomVector]
return RandomVector
from numpy.random import multinomial
import math
def RandIntVec(ListSize, ListSumValue, Distribution='Normal'):
"""
Inputs:
ListSize = the size of the list to return
ListSumValue = The sum of list values
Distribution = can be 'uniform' for uniform distribution, 'normal' for a normal distribution ~ N(0,1) with +/- 5 sigma (default), or a list of size 'ListSize' or 'ListSize - 1' for an empirical (arbitrary) distribution. Probabilities of each of the p different outcomes. These should sum to 1 (however, the last element is always assumed to account for the remaining probability, as long as sum(pvals[:-1]) <= 1).
Output:
A list of random integers of length 'ListSize' whose sum is 'ListSumValue'.
"""
if type(Distribution) == list:
DistributionSize = len(Distribution)
if ListSize == DistributionSize or (ListSize-1) == DistributionSize:
Values = multinomial(ListSumValue,Distribution,size=1)
OutputValue = Values[0]
elif Distribution.lower() == 'uniform': #I do not recommend this!!!! I see that it is not as random (at least on my computer) as I had hoped
UniformDistro = [1/ListSize for i in range(ListSize)]
Values = multinomial(ListSumValue,UniformDistro,size=1)
OutputValue = Values[0]
elif Distribution.lower() == 'normal':
"""
Normal Distribution Construction....It's very flexible and hideous
Assume a +-3 sigma range. Warning, this may or may not be a suitable range for your implementation!
If one wishes to explore a different range, then changes the LowSigma and HighSigma values
"""
LowSigma = -3#-3 sigma
HighSigma = 3#+3 sigma
StepSize = 1/(float(ListSize) - 1)
ZValues = [(LowSigma * (1-i*StepSize) +(i*StepSize)*HighSigma) for i in range(int(ListSize))]
#Construction parameters for N(Mean,Variance) - Default is N(0,1)
Mean = 0
Var = 1
#NormalDistro= [self.NormalDistributionFunction(Mean, Var, x) for x in ZValues]
NormalDistro= list()
for i in range(len(ZValues)):
if i==0:
ERFCVAL = 0.5 * math.erfc(-ZValues[i]/math.sqrt(2))
NormalDistro.append(ERFCVAL)
elif i == len(ZValues) - 1:
ERFCVAL = NormalDistro[0]
NormalDistro.append(ERFCVAL)
else:
ERFCVAL1 = 0.5 * math.erfc(-ZValues[i]/math.sqrt(2))
ERFCVAL2 = 0.5 * math.erfc(-ZValues[i-1]/math.sqrt(2))
ERFCVAL = ERFCVAL1 - ERFCVAL2
NormalDistro.append(ERFCVAL)
#print "Normal Distribution sum = %f"%sum(NormalDistro)
Values = multinomial(ListSumValue,NormalDistro,size=1)
OutputValue = Values[0]
else:
raise ValueError ('Cannot create desired vector')
return OutputValue
else:
raise ValueError ('Cannot create desired vector')
return OutputValue
#Some Examples
ListSize = 4
ListSumValue = 12
for i in range(100):
print RandIntVec(ListSize, ListSumValue,Distribution=RandFloats(ListSize))
上面的代码可以在github上找到。这是我为学校建造的课程的一部分。user1149913,也发布了一个很好的问题解释。