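The code snippet below comes from one of my functions. The function is passed a list of numbers and is supposed to remove the outliers (i.e. very large or very small numbers) from that list.
As the output confirms, the code does not appear to work as intended: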
import math

EXTREMA_CUTOFF_THRESHOLD = 3.0

if list_values:
    # Mean of the values before filtering
    avg_val = sum(list_values) / float(len(list_values))
    print 'DEBUG: BEFORE:', min(list_values), max(list_values), avg_val
    # Keep only values whose deviation from the mean, relative to the mean, is below the threshold
    list_values = [x for x in list_values
                   if math.fabs(x - avg_val) / float(avg_val) < EXTREMA_CUTOFF_THRESHOLD]
    list_values_len = len(list_values)
    if (list_values_len > 0) and (min_sample_size > 0) and (list_values_len < min_sample_size):
        print 'DEBUG: Insufficient data for stats calculation for row'
    elif (list_values_len > 0):
        # avg_val is still the mean computed before filtering
        print 'DEBUG: AFTER:', min(list_values), max(list_values), avg_val
Output:
DEBUG: BEFORE: 11.0 302.0 113.897260274
DEBUG: AFTER: 11.0 302.0 113.897260274
DEBUG: BEFORE: 12.5 273.0 108.382352941
DEBUG: AFTER: 12.5 273.0 108.382352941
DEBUG: BEFORE: 2.5 245.5 69.9166666667
DEBUG: AFTER: 2.5 245.5 69.9166666667
DEBUG: BEFORE: 136.5 499.5 363.775
DEBUG: AFTER: 136.5 499.5 363.775
DEBUG: BEFORE: 39.5 422.5 166.035759097
DEBUG: AFTER: 39.5 422.5 166.035759097
DEBUG: BEFORE: 39.5 422.0 152.305007587
DEBUG: AFTER: 39.5 422.0 152.305007587
DEBUG: BEFORE: 20.5 331.0 84.41015625
DEBUG: AFTER: 20.5 331.0 84.41015625
DEBUG: BEFORE: 7.0 267.5 155.810126582
DEBUG: AFTER: 7.0 267.5 155.810126582
Why are the extreme values not being filtered out?
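
One way to investigate is to evaluate the filter expression by hand for the extremes printed above. The sketch below is only a diagnostic aid, not a fix: `rows`, `relative_deviation` and `largest_deviation` are names made up for this check, and the numbers are copied from the `DEBUG: BEFORE` lines. It applies the same `math.fabs(x - avg_val)/float(avg_val)` expression to each minimum and maximum and compares the largest result with the threshold.

```python
import math

EXTREMA_CUTOFF_THRESHOLD = 3.0

# (min, max, mean) triples copied from the DEBUG: BEFORE lines in the output.
rows = [
    (11.0, 302.0, 113.897260274),
    (12.5, 273.0, 108.382352941),
    (2.5, 245.5, 69.9166666667),
    (136.5, 499.5, 363.775),
    (39.5, 422.5, 166.035759097),
    (39.5, 422.0, 152.305007587),
    (20.5, 331.0, 84.41015625),
    (7.0, 267.5, 155.810126582),
]


def relative_deviation(x, avg):
    """Same expression as the condition in the list comprehension."""
    return math.fabs(x - avg) / float(avg)


def largest_deviation(rows):
    """Largest relative deviation over every min and max in rows."""
    return max(relative_deviation(x, avg)
               for lo, hi, avg in rows
               for x in (lo, hi))


worst = largest_deviation(rows)
print("largest relative deviation: %.3f" % worst)
print("anything removed: %s" % (worst >= EXTREMA_CUTOFF_THRESHOLD))
```

Every ratio comes out below 3.0, so the condition holds for all of the values shown and nothing is dropped. With this formula a value above the mean is removed only once it reaches four times the mean, and for non-negative data a value below the mean can never be removed, because its ratio cannot exceed 1. If the intent was "three standard deviations from the mean", the divisor would presumably need to be the standard deviation rather than `avg_val`, but that is an assumption about the intent.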