python - Generating a set of API timings from a CSV

Question

I have a list of APIs and their execution times in the following format:

findByNameWithProduct, 108
findProductByPartNumber, 140
findProductById, 178
findByName, 99
findProductsByCategory, 260
findByNameWithCategory, 103
findByNameWithCategory, 108
findByNameWithCategory, 99
findByNameWithProduct, 20
findProductById, 134
findTopCategories, 54
findByName, 48
findSubCategories, 44
findProductByPartNumber, 70
findProductByPartNumber, 63

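I'm trying to store the minimum, maximum, average, and 90th-percentile execution time for each unique API, but I'm not sure how to go about it. I considered using a dictionary, which would let me detect whether an API has already been entered, but as far as I know a dictionary holds a single name-value pair per key, not multiple entries. I've been playing with something like the following, but I know it isn't efficient (and it doesn't work). I'm not very familiar with data structures in Python. Does anyone know a clean way to do this?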

counter = 0
uniqueAPINames = set(apiNames)
for uniqueAPIName in uniqueAPINames :
    for line in lines:
        if uniqueAPIName in line:
            print(line)
            # Somehow add all these up...
    counter = counter + 1

Edit:

With the help of the accepted answer, the solution looks like this:

import os

# `folder` and `apiData` (test number -> API name) are defined elsewhere.
tests = []
for f in os.listdir(folder):
    if '-data.log' in f:
        # Read each log once; accumulating lines across files and looping
        # over the total inside this loop would count earlier files twice.
        with open(os.path.join(folder, f), 'r') as fi:
            for line in fi:
                if 'Thread' not in line:
                    lineSplit = line.strip().split(',')
                    testNumber = lineSplit[2].strip()
                    testName = apiData[testNumber]
                    testTime = lineSplit[4].strip()
                    tests.append([testName, testTime])

d = {}
for test in tests:
    if test[0] in d:
        d[test[0]].append(test[1])
    else:
        d[test[0]] = [test[1]]

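for key in d:
    print('API Name: ' + str(key))
    d[key] = sorted(int(i) for i in d[key])
    n = len(d[key])
    print('Min: ' + str(d[key][0]))
    print('Max: ' + str(d[key][-1]))
    # float() avoids truncating integer division on Python 2.
    print('Average: ' + str(sum(d[key]) / float(n)))
    # Nearest-rank 90th percentile: the value at 1-based rank ceil(0.9 * n),
    # not 90% of the average.
    print('90th Percentile: ' + str(d[key][(9 * n + 9) // 10 - 1]))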

score 1 · Accepted Answer

You're on the right track with a dictionary. The values can be anything, and in this case a list makes sense:

d = {}
for api_name, runtime in whatever:
    if api_name in d:  # we've seen it before
        d[api_name].append(runtime)
    else:  # first time
        d[api_name] = [runtime]  # list with one entry
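
As a rough sketch of the same idea applied to the data in the question, assuming the input is plain `name, time` text as shown there: `collections.defaultdict` removes the need for the `if`/`else`. The `SAMPLE` string and the `group_runtimes` name are made up for illustration.

```python
import csv
from collections import defaultdict

# Hypothetical sample input, copied from a few lines of the question.
SAMPLE = """\
findByNameWithProduct, 108
findProductByPartNumber, 140
findProductById, 178
findByName, 99
findByNameWithProduct, 20
findByName, 48
"""

def group_runtimes(lines):
    """Map each API name to the list of its runtimes (as ints)."""
    d = defaultdict(list)  # missing keys start out as an empty list
    for row in csv.reader(lines, skipinitialspace=True):
        if len(row) != 2:
            continue  # skip blank or malformed lines
        api_name, runtime = row
        d[api_name.strip()].append(int(runtime))
    return d

d = group_runtimes(SAMPLE.splitlines())
print(d["findByName"])  # [99, 48]
```

Converting to `int` while grouping means the lists sort numerically later, rather than as strings.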

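Now you have a dictionary that maps each API name to a list of all its runtimes. Is the rest clear? I would sort each list; after that, finding the minimum, maximum, and percentiles is easy.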

for runtimes in d.values():
    runtimes.sort()

That's all it takes to sort every runtime list in the dict in place.
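
Once a list is sorted, the statistics can be read off roughly as follows. This is only a sketch: `summarize` is a hypothetical helper, and it assumes the nearest-rank definition of a percentile, which is one of several conventions and may not be the one you want.

```python
def summarize(runtimes):
    """Return (min, max, mean, p90) for a non-empty list of numbers."""
    ordered = sorted(runtimes)
    n = len(ordered)
    # Nearest-rank percentile: the value at 1-based rank ceil(0.9 * n).
    rank = (9 * n + 9) // 10  # integer form of ceil(0.9 * n)
    mean = sum(ordered) / float(n)
    return ordered[0], ordered[-1], mean, ordered[rank - 1]

# Runtimes for findProductByPartNumber from the question's data.
print(summarize([140, 70, 63]))  # (63, 140, 91.0, 140)
```

With only three samples the 90th percentile coincides with the maximum; it becomes more informative as the lists grow.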
