python - 列表与元组排序性能

Question

首先，我意识到这已经被覆盖了。其次，我的编程经验非常有限。然而，由于我很惊讶地看到这样的性能提升，并且因为我也有相对良好的基准测试，所以我想我会分享结果。

基本上我正在做的是使用 Python (IronPython) 中的脚本在 Rhinoceros 上的网格上进行 k-means 聚类。这是结果的屏幕截图。下面是我脚本中所有功能的基准。

CYCLE  4
GetProxy =  20.0
GetProxySeed =  10.0
BuildQueue =  0.0
AssignToRegion =  610.0009
SplitRegion =  10.0
RenumberSplitRegions =  30.0
FindAdjacentRegions =  10.0
FindRegionsToCombine =  130.0002
ReCombineRegions =  0.0
RenumberRegions =  0.0
ReGrowSeeds =  110.0002
CYCLE  4 TOOK =  940.0013
AssignToRegionTime Average over all cycles (20)=  613.8351

可以清楚地看出，AssignToRegion 是瓶颈。这是将网格中的每个面分配给一个簇的函数。波纹管是所涉及步骤的“伪代码”：

It starts with a queue of structure [ [ error value, region index tag, face index] , ...] 
equal in length to the number of regions. Then:

1. Pop item of least error. 
2. Check if item has already been assigned to a region. 
3. If not, append to the region of same tag. 
4. Get adjacent faces minus faces already assigned to a region. 
5. Update queue with new faces.
5.1. Get error of new faces. This basically compared the face's normal vector to the region's normal vector. The greater the
difference, the greater the error. 
5.2. Add tag of last popped face. 
5.3. Sort queue.

Repeat until queue is empty.

仅通过将队列中的每个列表[ [ 错误值，区域索引标签，面部索引] ，...]更改为元组[（错误值，区域索引标签，面部索引），，我就获得了超过2 倍的性能提升。 ..] ，虽然我真的不明白为什么变化如此戏剧性。

CYCLE  4
GetProxy =  21.0012
GetProxySeed =  8.0004
BuildQueue =  1.0001
AssignToRegion =  245.014
SplitRegion =  5.0003
RenumberSplitRegions =  32.0019
FindAdjacentRegions =  6.0004
FindRegionsToCombine =  114.0065
ReCombineRegions =  0.0
RenumberRegions =  0.0
ReGrowSeeds =  99.0057
CYCLE  4 TOOK =  553.0317
AssignToRegionTime Average over all cycles (20)=  275.21574

对于那些好奇的人，下面是整个函数的代码。前面提到的变化是在最后。

其他可能加快速度的建议非常受欢迎！

def AssignToRegion(faceNormals, vertices, faceVertexIndexes, areaFaces, adjacentFaces, regions, queue, assignedIndexes):
    ## queue = [ [error, regionIndex, index], ... ]
    ## OR queue= [ (error, regionIndex, index), ... ]

    ## Container list for the items popped from the priority list.
    assignedIndexes = set(assignedIndexes)
    ## Until the priority queue is not empty, keep popping 
    ## the item with least priority from the priority queue. 

    while queue :
        mostPriority = queue.pop()
        faceIndex =  mostPriority[2]

        ## If the index of the popped face has already 
        ## been assigned skip to the next one. 
        if faceIndex not in assignedIndexes:
            regionIndex = mostPriority[1] ## regionIndex is Int
            regions[regionIndex][1].append(faceIndex)
            assignedIndexes.add(faceIndex)
            ## Get the adjacent faces of the popped face
            ## and append them to the priority queue. 
            newAdjacentFaces = set(adjacentFaces[faceIndex])
            ## If an adjacent face has already been assigned
            ## to a region, skip it. 
            newAdjacentFaces -= assignedIndexes
            ## Append faces to priority queue.
            queue = UpdateQueue( regions[ regionIndex] , faceNormals, areaFaces, queue, newAdjacentFaces)

    return (regions)


def UpdateQueue( region, faceNormals, areaFaces, queue, newFaces ):
    regionIndex = region[0]
    proxyNormal = region[4]

    newFacesErrors = MetricError(regionIndex, newFaces, faceNormals, areaFaces, proxyNormal)
    queue.extend( newFacesErrors )
    ## This appears faster thatn queue.sort
    queue = sorted(queue, reverse = True)
    ## This returns a list queue = [ [error, regionIndex, index] , [...] ]
    return queue

def MetricError(regionIndex, faceIndexes, faceNormals, areaFaces, proxyNormal):
    errors = []

    for index in faceIndexes:
            area = areaFaces[index]
            normal = faceNormals[index]
            normalError = normal - proxyNormal

            ## This can be obtained as an attribute of the vector
            ## and it's much faster.
            # moduleNormalError = (normalError[0] ** 2 + normalError[1] ** 2 + normalError[2] ** 2)
            moduleNormalError = normalError.SquareLength
            error = moduleNormalError * area
            errors.append( [error, regionIndex, index] )
            ## OR
            errors.append( (error, regionIndex, index) )

    ## This returns a list errors = [ [error, regionIndex, index], ... ]
    ## OR errors = [ (error, regionIndex, index), ... ]
    return errors

正如@user2357112 所建议的，使用 heapq 我得到：

AssignToRegionTime average of 20 cycles =  135.80836

python - 列表与元组排序性能

0 回答 0

Related

Reference