python - Finding the average of a variable across objects in Python

Question

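How can I iterate over a collection of objects to find their averages in the most efficient way? The code below uses only one loop (apart, possibly, from the loops inside NumPy), but I wonder whether there is a better way. Currently, I am doing this: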

import numpy

scores = []
ratings = []
negative_scores = []
positive_scores = []

for t in text_collection:
    scores.append(t.score)
    ratings.append(t.rating)
    if t.score < 0:
        negative_scores.append(t.score)
    elif t.score > 0:
        positive_scores.append(t.score)

print("average score:", numpy.mean(scores))
print("average rating:", numpy.mean(ratings))
print("average negative score:", numpy.mean(negative_scores))
print("average positive score:", numpy.mean(positive_scores))

Is there a better way to do this?

score 6 · Accepted Answer

import numpy as np
scores, ratings = np.array([(t.score, t.rating) for t in text_collection]).T

print('average score:', np.mean(scores))
print('average rating:', np.mean(ratings))
print('average positive score:', np.mean(scores[scores > 0]))
print('average negative score:', np.mean(scores[scores < 0]))

Edit:

To check whether there actually are any negative scores, you can do this:

if np.count_nonzero(scores < 0):
    print('average negative score:', np.mean(scores[scores < 0]))
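
As a hedged illustration of the boolean-mask approach above, the sketch below wraps the emptiness check in a small helper. The `Text` namedtuple is a hypothetical stand-in for the objects in text_collection (the real class is not shown in the question), and `masked_mean` is an illustrative name, not a NumPy function.

```python
from collections import namedtuple

import numpy as np

# Hypothetical stand-in for the objects in text_collection.
Text = namedtuple("Text", ["score", "rating"])


def masked_mean(values, mask):
    """Return the mean of values[mask], or None when the mask selects nothing."""
    selected = values[mask]
    if selected.size == 0:
        return None
    return float(np.mean(selected))


text_collection = [Text(-2.0, 1.0), Text(4.0, 5.0), Text(0.0, 3.0), Text(-1.0, 2.0)]

# One pass over the objects, then split the two columns into separate arrays.
scores, ratings = np.array([(t.score, t.rating) for t in text_collection]).T

avg_negative = masked_mean(scores, scores < 0)    # mean of -2.0 and -1.0
avg_positive = masked_mean(scores, scores > 0)    # mean of 4.0 alone
avg_above_ten = masked_mean(scores, scores > 10)  # mask selects nothing
```

Returning None for an empty selection is a design choice made for this sketch; calling np.mean on an empty array would instead yield nan together with a RuntimeWarning.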

score 1 · Answer

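Do you mind looping once for each value you want to get from the collection? It is slightly less efficient, but clearer:

import numpy

avg_score = numpy.mean([t.score for t in text_collection])
avg_rating = numpy.mean([t.rating for t in text_collection])
# The negative and positive averages are taken over scores, not ratings.
avg_neg_score = numpy.mean([t.score for t in text_collection if t.score < 0])
avg_pos_score = numpy.mean([t.score for t in text_collection if t.score > 0])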

score 0 · Answer

You can obtain avg_score from avg_neg_score and avg_pos_score with a simple operation:

nneg = len(negative_scores)
npos = len(positive_scores)
avg_score = (avg_neg_score * nneg + avg_pos_score * npos) / (nneg + npos)
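
To make the identity concrete, here is a small sketch with made-up numbers. It also shows one caveat: a score of exactly 0 falls into neither list, so the recombined value matches the overall mean only when the weighted sum is divided by the full count rather than by nneg + npos.

```python
scores = [-2.0, -1.0, 0.0, 4.0, 5.0]  # made-up data, including one zero

negative_scores = [s for s in scores if s < 0]
positive_scores = [s for s in scores if s > 0]

nneg, npos = len(negative_scores), len(positive_scores)
avg_neg_score = sum(negative_scores) / nneg
avg_pos_score = sum(positive_scores) / npos

# Weighted recombination. Zeros add nothing to the sum, so dividing by
# the full count recovers the overall mean.
recombined = (avg_neg_score * nneg + avg_pos_score * npos) / len(scores)
direct = sum(scores) / len(scores)

# Dividing by nneg + npos instead gives the mean of the nonzero scores only.
nonzero_only = (avg_neg_score * nneg + avg_pos_score * npos) / (nneg + npos)
```

With no zero scores in the data, nneg + npos equals the full count and the two divisions agree.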

Edit: If you are building the arrays by iterating over text_collection, this will be more efficient (assuming you only want the means):

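n = len(text_collection)
npos, sumpos = 0, 0.0
nneg, sumneg = 0, 0.0
sumrating = 0.0  # float sums keep the divisions below from truncating
for t in text_collection:
    sumrating += t.rating
    if t.score < 0:
        sumneg += t.score
        nneg += 1
    elif t.score > 0:  # zero scores count toward neither group
        sumpos += t.score
        npos += 1
avg_score = (sumneg + sumpos) / n
avg_neg_score = sumneg / nneg  # ZeroDivisionError if there are no negative scores
avg_pos_score = sumpos / npos  # likewise if there are no positive scores
avg_rating = sumrating / n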

Edit 2: Fixed: avg_neg_rating changed to avg_neg_score ...

score 0 · Answer

If you have NumPy available, I think numpy.mean is your best choice. It does exactly what you want, and its name documents what you are doing.

If you want a pure-Python solution:

def mean(seq):
    count = 0
    total = 0.0  # named total to avoid shadowing the built-in sum()
    for x in seq:
        total += x
        count += 1
    if count == 0:
        raise ValueError("cannot take mean of zero-length sequence")
    return total / count

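I wrote it to handle any sequence, including things like generator expressions that compute values on the fly. So it runs through the sequence only once, and it keeps its own counter so it knows how many items there were. If you are sure you only want to take the mean of a list: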

def list_mean(lst):
    if len(lst) == 0:
        raise ValueError("cannot take mean of zero-length list")
    return float(sum(lst)) / len(lst)

If you call list_mean on an iterator or generator expression, len() will not work and you will get a TypeError exception.
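
For readers on Python 3.4 or later, the standard library's statistics.mean covers similar ground to the hand-written mean above: it accepts iterators and generator expressions, and it raises statistics.StatisticsError on empty input. The sketch below assumes a hypothetical `Text` namedtuple in place of the real objects, which the question does not show.

```python
from collections import namedtuple
from statistics import StatisticsError, mean

# Hypothetical stand-in for the objects in text_collection.
Text = namedtuple("Text", ["score", "rating"])
text_collection = [Text(-2.0, 1.0), Text(4.0, 5.0), Text(1.0, 3.0)]

# Generator expressions work; no intermediate list is needed.
avg_score = mean(t.score for t in text_collection)
avg_rating = mean(t.rating for t in text_collection)

# An empty selection raises StatisticsError instead of returning nan.
try:
    avg_very_negative = mean(t.score for t in text_collection if t.score < -10)
except StatisticsError:
    avg_very_negative = None
```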
