python - Averaging between selected values in an RLE list, trouble with the last element

Question

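I created a compressed list using run-length encoding. Now I'm trying to average the values that fall between certain marker values in the list (e.g. 450 and 180). The code should work like this: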

 a=[[180,3],[140,1],[160,1],[150,2],[450,1],[180,4]]
 print mean(a)
 >[[180,3],[150,4],[450,1],[180,4]]

I'm new to this, otherwise I would have done the averaging during compression.

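I'm stuck on two things: the result list, once uncompressed, is not the same length as the original list, and I'm not sure how to append the last element when the loop finishes without writing it out. I could use an index check in my for loop, something like elif i[0].index==len(lst), but that would be computationally expensive (the dataset is quite large). What I came up with is a final if statement outside the for loop, but the resulting list still differs in length from the original.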

def mean(lst):
    sm=0
    count=0
    new=[]
    for i in lst:
        if i[0] is None:
            new.append([0,1])
        elif i[0]!=180.0 and i[0]!=450.0:
            sm+=(i[0]*i[1])
            count+=i[1]
        elif count==0:
            new.append(i)
        else:
            new.append([sm/count,count])
            new.append(i)
            count=0
            sm=0
    if count>0:
        new.append([sm/count,count])
    return new
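
A cheap way to test the length concern without decoding anything is to compare the sum of the run counts before and after averaging. This is only a sketch: total_length is a hypothetical helper, not part of the code above, and it assumes every entry is a [value, count] pair. One place worth checking with it is the None branch above, which appends [0,1] whatever the run's count is.

```python
def total_length(rle):
    """Number of elements the RLE list expands to (sum of the run counts)."""
    return sum(count for _value, count in rle)

before = [[180, 3], [140, 1], [160, 1], [150, 2], [450, 1], [180, 4]]
after = [[180, 3], [150, 4], [450, 1], [180, 4]]

# Averaging should merge runs without changing the decoded length.
assert total_length(before) == total_length(after)
```

If the assertion fails on real data, the averaging step has dropped or duplicated counts somewhere.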

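For anyone who looks into this later, I've added my solution, which combines the compression and the averaging. To clarify the purpose: I'm compressing the angles between road segments in a GIS program to create a smaller dataset. 450 can be treated as a Null value.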

with arcpy.da.SearchCursor("test_loop",["angle3"]) as cursor:
    count1 = 0
    count2=0
    count3=0
    add=0
    lst=[]
    for row in cursor:
        if row[0]<180 and row[0] is not None:
            if count1>0:
                lst.append([180,count1+count3])
                count1=0
                count3=0
                pass    
            count2+=1
            add+=row[0]
        elif row[0]==180:
            if count2>0:
                lst.append([add/count2,count2+count3])
                count2=0
                count3=0
                add=0
                pass    
            count1+=1    
        elif row[0]==450 or row[0] is None:
            count3+=1
        else:
            print "Error"
            break   
    if count1>0:
        lst.append([180,count1+count3])
        count1=0
        count3=0
    elif count2>0:
        lst.append([add/count2,count2+count3])
        count2=0
        count3=0
        add=0  
    else:
        lst.append([None,count3])                      
    print lst
    del cursor
    del row

def decode(lst):
   q = []
   for i in lst:
       for x in range(i[1]):
           q.append (i[0])
   return q   

final=decode(lst)
print final               

with arcpy.da.UpdateCursor("test_loop",["curve_level"]) as cursor:
    i=0
    for row in cursor:
        row[0]=final[i]
        i+=1
        cursor.updateRow(row)
del cursor
del row

score 0 · Accepted Answer

Assuming your output should not contain duplicate 180 entries, and that your expected output is:

[[180,7],[150,4],[450,1]]

I think this would do it:

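from collections import defaultdict

def mean(lst):
    d = defaultdict(int)
    sm, count = 0.0, 0
    for k, v in lst:
        if float(k) in (180.0, 450.0):
            d[k] += v          # merge every run of a kept value into one entry
        else:
            sm += k * v        # weighted sum of the values to be averaged
            count += v
    # Test count rather than sm, so values that sum to zero are still emitted.
    if count:
        d[sm / count] = count
    # Note: entry order follows the dict, not the original list.
    return [list(itm) for itm in d.items()]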
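
If the separate 180 runs should stay where they are, as in the output shown in the question, an order-preserving variant is possible. The following is a minimal sketch, not the approach above: mean_runs and KEEP are hypothetical names, and it assumes None values have already been mapped to 450 and that every count is at least 1. It uses itertools.groupby to split the list into consecutive stretches of kept and non-kept values, so a stretch that ends the list is handled the same way as any other.

```python
from itertools import groupby

KEEP = (180, 450)  # values left untouched, taken from the question

def mean_runs(rle, keep=KEEP):
    """Average each consecutive stretch of non-kept values, preserving order."""
    out = []
    # groupby yields consecutive stretches that are either all kept values
    # or all values to be averaged.
    for is_kept, group in groupby(rle, key=lambda pair: pair[0] in keep):
        group = list(group)
        if is_kept:
            out.extend(group)  # copy kept runs through unchanged
        else:
            count = sum(c for _, c in group)
            total = sum(v * c for v, c in group)
            # float() keeps true division on Python 2 as well
            out.append([float(total) / count, count])
    return out

a = [[180, 3], [140, 1], [160, 1], [150, 2], [450, 1], [180, 4]]
print(mean_runs(a))
```

Because each stretch is replaced by a single entry carrying the summed count, the decoded length stays the same as the input.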
