3

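I have generated a long list of tuples (format shown below). Each tuple in the list has a time as its first element and an event as its third element. The second element is always the same and identifies this list among other similar lists that I have to process. There are many distinct third elements, and each one has multiple entries at different time values (the first element).

I am trying to filter the list to pull out the minimum and maximum time (the first item in the tuple) for each event (the third member of the tuple). I tried using a list comprehension but quickly got confused.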

('1.3433', 'VOLTAGE DEVIATION', 'DNLP2G23.575')
('1.3433', 'VOLTAGE DEVIATION', 'DNLP2G23.575')
('1.3433', 'VOLTAGE DEVIATION', 'DNLP2G23.575')
('1.3433', 'VOLTAGE DEVIATION', 'DNLP2G23.575')
('1.3433', 'VOLTAGE DEVIATION', 'DNLP2G23.575')
('1.3433', 'VOLTAGE DEVIATION', 'DNLP2G23.575')
('1.3433', 'VOLTAGE DEVIATION', 'DNLP2G23.575')
('1.3433', 'VOLTAGE DEVIATION', 'DNLP2G23.575')
('1.3433', 'VOLTAGE DEVIATION', 'DNLP2G23.575')
('1.3467', 'VOLTAGE DEVIATION', 'DNLP2G23.575')
('1.3467', 'VOLTAGE DEVIATION', 'DNLP2G23.575')
('1.3467', 'VOLTAGE DEVIATION', 'DNLP2G23.575')
('1.3467', 'VOLTAGE DEVIATION', 'DNLP1_G1.575')
('1.3467', 'VOLTAGE DEVIATION', 'DNLP1_G1.575')
('1.3467', 'VOLTAGE DEVIATION', 'DNLP1_G1.575')
('1.3467', 'VOLTAGE DEVIATION', 'DNLP1_G1.575')
('1.3533', 'VOLTAGE DEVIATION', 'DIFICULT 230')
('1.3533', 'VOLTAGE DEVIATION', 'DIFICULT 230')
('1.3533', 'VOLTAGE DEVIATION', 'DIFICULT 230')
('1.3533', 'VOLTAGE DEVIATION', 'DIFICULT 230')
('1.3533', 'VOLTAGE DEVIATION', 'DIFICULT 230')
('1.3533', 'VOLTAGE DEVIATION', 'DIFICULT 230')
('1.3533', 'VOLTAGE DEVIATION', 'DIFICULT 230')
('1.3533', 'VOLTAGE DEVIATION', 'DIFICULT 230')
('1.3533', 'VOLTAGE DEVIATION', 'DIFICULT 230')
('1.3567', 'VOLTAGE DEVIATION', 'DIFICULT 230')
('1.3600', 'VOLTAGE DEVIATION', 'DIFICULT 230')
('1.3600', 'VOLTAGE DEVIATION', '7MIHL G1.575')
('1.3600', 'VOLTAGE DEVIATION', '7MIHL G1.575')
('1.3600', 'VOLTAGE DEVIATION', '7MIHL G1.575')
('1.3600', 'VOLTAGE DEVIATION', '7MIHL G1.575')
('1.3600', 'VOLTAGE DEVIATION', '7MIHL G1.575')
('1.3600', 'VOLTAGE DEVIATION', '7MIHL G1.575')
('1.3600', 'VOLTAGE DEVIATION', '7MIHL G1.575')
('1.3600', 'VOLTAGE DEVIATION', '7MIHL G1.575')
('1.3600', 'VOLTAGE DEVIATION', '7MIHL G1.575')
('1.3600', 'VOLTAGE DEVIATION', '7MIHL G1.575')
('1.3800', 'VOLTAGE DEVIATION', '7MIHL G1.575')
('1.3800', 'VOLTAGE DEVIATION', '7MIHL G1.575')
('1.3800', 'VOLTAGE DEVIATION', 'HORIZ_G .575')
('1.3800', 'VOLTAGE DEVIATION', 'MEDBOWCO 115')
('1.3800', 'VOLTAGE DEVIATION', 'MEDBOWCO 115')
('1.3800', 'VOLTAGE DEVIATION', 'STNDPSVC 230')
('1.3800', 'VOLTAGE DEVIATION', 'STNDPSVC 230')
('1.3800', 'VOLTAGE DEVIATION', 'STNDPSVC 230')
('1.3800', 'VOLTAGE DEVIATION', 'STNDPSVC 230')
('1.3800', 'VOLTAGE DEVIATION', 'STNDPSVC 230')
('1.3800', 'VOLTAGE DEVIATION', 'STNDPSVC 230')
('1.3867', 'VOLTAGE DEVIATION', 'MINERS  34.5')
('1.3867', 'VOLTAGE DEVIATION', 'MINERS  34.5')
('1.3867', 'VOLTAGE DEVIATION', 'MINERS  34.5')
('1.3867', 'VOLTAGE DEVIATION', 'MINERS  34.5')
('1.3867', 'VOLTAGE DEVIATION', 'MINERS  34.5')
('1.3867', 'VOLTAGE DEVIATION', 'MINERS  34.5')
('1.3867', 'VOLTAGE DEVIATION', 'MINERS  34.5')
('1.3867', 'VOLTAGE DEVIATION', 'MINERS  34.5')
('1.3867', 'VOLTAGE DEVIATION', 'MINERS  34.5')
('1.3867', 'VOLTAGE DEVIATION', 'MINERS  34.5')
('1.3867', 'VOLTAGE DEVIATION', 'MINERS  34.5')
('1.3900', 'VOLTAGE DEVIATION', 'MINERS  34.5')
('1.3900', 'VOLTAGE DEVIATION', 'MINERS  34.5')
('1.3900', 'VOLTAGE DEVIATION', 'MINERS  34.5')
('1.3900', 'VOLTAGE DEVIATION', 'MINERS  34.5')
('1.3900', 'VOLTAGE DEVIATION', 'MINERS  34.5')
('1.3900', 'VOLTAGE DEVIATION', 'MINERS  34.5')
('1.3900', 'VOLTAGE DEVIATION', 'MINERS  34.5')
('1.3900', 'VOLTAGE DEVIATION', 'MINERS  34.5')
('1.3900', 'VOLTAGE DEVIATION', 'MINERS  34.5')
('1.4233', 'VOLTAGE DEVIATION', 'FT CRK2 34.5')
('1.4233', 'VOLTAGE DEVIATION', 'FT CRK2 34.5')
('1.4233', 'VOLTAGE DEVIATION', 'FT CRK2 34.5')
('1.4233', 'VOLTAGE DEVIATION', 'FT CRK2 34.5')
('1.4233', 'VOLTAGE DEVIATION', 'FT CRK2 34.5')
('1.4233', 'VOLTAGE DEVIATION', 'FT CRK2 34.5')
('1.4233', 'VOLTAGE DEVIATION', 'FT CRK2 34.5')
('1.4233', 'VOLTAGE DEVIATION', 'FT CRK2 34.5')
('1.4233', 'VOLTAGE DEVIATION', 'FT CRK2 34.5')
('1.4233', 'VOLTAGE DEVIATION', 'FT CRK2 34.5')
('1.4233', 'VOLTAGE DEVIATION', 'FT CRK2 34.5')
('1.4267', 'VOLTAGE DEVIATION', 'FT CRK2 34.5')
('1.4267', 'VOLTAGE DEVIATION', 'FT CRK2 34.5')
('1.4267', 'VOLTAGE DEVIATION', 'FT CRK2 34.5')
('1.4267', 'VOLTAGE DEVIATION', 'FT CRK2 34.5')
('1.4267', 'VOLTAGE DEVIATION', 'FT CRK2 34.5')
('1.4800', 'VOLTAGE DEVIATION', 'HIPLN_G .575')
('1.4800', 'VOLTAGE DEVIATION', 'HIPLN_G .575')
('1.4800', 'VOLTAGE DEVIATION', 'HIPLN_G .575')
('1.4800', 'VOLTAGE DEVIATION', 'HIPLN_G .575')
('1.4800', 'VOLTAGE DEVIATION', 'HIPLN_G .575')
('1.4800', 'VOLTAGE DEVIATION', 'HIPLN_G .575')
('1.4800', 'VOLTAGE DEVIATION', 'HIPLN_G .575')
('1.4800', 'VOLTAGE DEVIATION', 'HIPLN_G .575')
('1.4800', 'VOLTAGE DEVIATION', 'HIPLN_G .575')
('1.4833', 'VOLTAGE DEVIATION', 'HIPLN_G .575')
('1.4833', 'VOLTAGE DEVIATION', 'HIPLN_G .575')
('1.4833', 'VOLTAGE DEVIATION', 'HIPLN_G .575')
('1.4833', 'VOLTAGE DEVIATION', 'HIPLN_G .575')
('1.4833', 'VOLTAGE DEVIATION', 'HIPLN_G .575')

The filtered result should be:

('1.3433', 'VOLTAGE DEVIATION', 'DNLP2G23.575')
('1.3467', 'VOLTAGE DEVIATION', 'DNLP2G23.575')
('1.3467', 'VOLTAGE DEVIATION', 'DNLP1_G1.575')
('1.3467', 'VOLTAGE DEVIATION', 'DNLP1_G1.575')
('1.3533', 'VOLTAGE DEVIATION', 'DIFICULT 230')
('1.3600', 'VOLTAGE DEVIATION', 'DIFICULT 230')
('1.3600', 'VOLTAGE DEVIATION', '7MIHL G1.575')
('1.3800', 'VOLTAGE DEVIATION', '7MIHL G1.575')
('1.3800', 'VOLTAGE DEVIATION', 'HORIZ_G .575')
('1.3800', 'VOLTAGE DEVIATION', 'MEDBOWCO 115')
('1.3800', 'VOLTAGE DEVIATION', 'MEDBOWCO 115')
('1.3800', 'VOLTAGE DEVIATION', 'STNDPSVC 230')
('1.3800', 'VOLTAGE DEVIATION', 'STNDPSVC 230')
('1.3800', 'VOLTAGE DEVIATION', 'STNDPSVC 230')
('1.3867', 'VOLTAGE DEVIATION', 'MINERS  34.5')
('1.3900', 'VOLTAGE DEVIATION', 'MINERS  34.5')
('1.4233', 'VOLTAGE DEVIATION', 'FT CRK2 34.5')
('1.4267', 'VOLTAGE DEVIATION', 'FT CRK2 34.5')
('1.4800', 'VOLTAGE DEVIATION', 'HIPLN_G .575')
('1.4833', 'VOLTAGE DEVIATION', 'HIPLN_G .575')

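I am trying the code below, but I get an error. I am new to this, so please let me know if I am doing something wrong. m1 in the code is the list of tuples I generated from findall. I imported ast at the top of my code.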

       m1 = re.findall(pattern1,wholefile)
       m1=[ast.literal_eval(t) for t in m1] 
       m1=[(float(a),b,c) for a,b,c in m1] 
       keys=sorted({t[2] for t in m1}) 
       for key in keys: 
           group=filter(lambda t: t[2]==key,m1)
           print '{}:\n\tmax: {}\n\tmin: {}'.format(key,max(group),min(group))

4 Answers

4

Restructuring the tuples into a dict would make life easier.

from collections import defaultdict

d = defaultdict(list)
for t,_,v in your_tuple_list:
     d[v].append(t)

After this, d has one key per event, each mapped to the list of times recorded for that event.

It would look like this (sort of):

>>> d['DNLP2G23.575']
['1.3433'....]

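Now the problem becomes finding the minimum and maximum of each list, which is easy with min() and max().

Once that is done, you will have your dataset in the desired order; you can convert it back to tuples, lists, etc.

If you are keen, you can turn each list into a set, which eliminates duplicate times and saves some time by speeding up min/max, assuming you have a large number of tuples to process.

You should also convert the times to float; you can do this in the main loop: d[v].append(float(t)). This ensures that max and min compare the times numerically rather than as strings.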
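
Putting the steps above together, a minimal sketch might look like the following. It is written for Python 3 (3.7 or later, so that dicts keep insertion order); the helper name `min_max_by_event` and the shortened sample data are assumptions made for illustration and are not part of the question.

```python
from collections import defaultdict


def min_max_by_event(tuple_list):
    """Return a (min_time, label, event) and a (max_time, label, event) per event.

    Hypothetical helper: events are emitted in order of first appearance.
    """
    times = defaultdict(set)   # event -> set of float times (duplicates dropped)
    labels = {}                # event -> second tuple member
    for t, label, event in tuple_list:
        times[event].add(float(t))
        labels.setdefault(event, label)

    result = []
    for event, ts in times.items():
        result.append((min(ts), labels[event], event))
        result.append((max(ts), labels[event], event))
    return result


sample = [
    ('1.3433', 'VOLTAGE DEVIATION', 'DNLP2G23.575'),
    ('1.3467', 'VOLTAGE DEVIATION', 'DNLP2G23.575'),
    ('1.3467', 'VOLTAGE DEVIATION', 'DNLP1_G1.575'),
    ('1.3433', 'VOLTAGE DEVIATION', 'DNLP2G23.575'),
]
for row in min_max_by_event(sample):
    print(row)
```

An event with a single distinct time yields the same tuple twice, which matches the shape of the expected output in the question.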

answered 2012-09-06T16:47:10.477
3

Use itertools.groupby for this:

>>> import itertools
>>> import operator
>>> import pprint
>>> results = []
>>> for key, group in itertools.groupby(tuplelist, operator.itemgetter(2)):
...    group = list(group)
...    results.append(min(group))
...    results.append(max(group))
...
>>> pprint.pprint(results)
[('1.3433', 'VOLTAGE DEVIATION', 'DNLP2G23.575'),
 ('1.3467', 'VOLTAGE DEVIATION', 'DNLP2G23.575'),
 ('1.3467', 'VOLTAGE DEVIATION', 'DNLP1_G1.575'),
 ('1.3467', 'VOLTAGE DEVIATION', 'DNLP1_G1.575'),
 ('1.3533', 'VOLTAGE DEVIATION', 'DIFICULT 230'),
 ('1.3600', 'VOLTAGE DEVIATION', 'DIFICULT 230'),
 ('1.3600', 'VOLTAGE DEVIATION', '7MIHL G1.575'),
 ('1.3800', 'VOLTAGE DEVIATION', '7MIHL G1.575'),
 ('1.3800', 'VOLTAGE DEVIATION', 'HORIZ_G .575'),
 ('1.3800', 'VOLTAGE DEVIATION', 'HORIZ_G .575'),
 ('1.3800', 'VOLTAGE DEVIATION', 'MEDBOWCO 115'),
 ('1.3800', 'VOLTAGE DEVIATION', 'MEDBOWCO 115'),
 ('1.3800', 'VOLTAGE DEVIATION', 'STNDPSVC 230'),
 ('1.3800', 'VOLTAGE DEVIATION', 'STNDPSVC 230'),
 ('1.3867', 'VOLTAGE DEVIATION', 'MINERS  34.5'),
 ('1.3900', 'VOLTAGE DEVIATION', 'MINERS  34.5'),
 ('1.4233', 'VOLTAGE DEVIATION', 'FT CRK2 34.5'),
 ('1.4267', 'VOLTAGE DEVIATION', 'FT CRK2 34.5'),
 ('1.4800', 'VOLTAGE DEVIATION', 'HIPLN_G .575'),
 ('1.4833', 'VOLTAGE DEVIATION', 'HIPLN_G .575')]

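Notes:

  1. min/max compare the tuples element by element, in order. However, the first element is actually a string rather than a float, so you may want to pass a key argument to min and max to make them compare on a different value.
  2. This only works if all entries with the same grouping key are adjacent in the list. That is the case in your example data, but if it is not, you will have to sort the list first.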
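
Both notes can be addressed in a few lines. The sketch below (Python 3; the function name `min_max_sorted` and the sample data are assumptions for illustration) sorts by event first so that equal keys are adjacent, then compares times numerically through the `key` argument:

```python
import itertools
import operator


def min_max_sorted(tuplelist):
    """Hypothetical helper: min and max tuple per event, compared by float time."""
    by_event = operator.itemgetter(2)

    def by_time(t):
        return float(t[0])

    results = []
    # groupby only merges adjacent items, so sort by the grouping key first.
    for _, group in itertools.groupby(sorted(tuplelist, key=by_event), by_event):
        group = list(group)
        results.append(min(group, key=by_time))
        results.append(max(group, key=by_time))
    return results


sample = [
    ('9.5', 'VOLTAGE DEVIATION', 'B'),
    ('10.25', 'VOLTAGE DEVIATION', 'A'),
    ('10.0', 'VOLTAGE DEVIATION', 'B'),
    ('2.0', 'VOLTAGE DEVIATION', 'A'),
]
for row in min_max_sorted(sample):
    print(row)
```

The sample uses times of different widths on purpose: compared as strings, '10.0' sorts before '9.5', which is the pitfall note 1 describes. Sorting also means the output is ordered by event name rather than by first appearance.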
answered 2012-09-06T16:51:33.070
1

This works (as long as you really have a list of tuples whose first value is a float):

keys=sorted({t[2] for t in tups})
for key in keys:
    # a list comprehension (rather than filter) lets the group be scanned twice on Python 3
    group=[t for t in tups if t[2]==key]
    print('{}:\n\tmax: {}\n\tmin: {}'.format(key,max(group),min(group)))

Prints:

7MIHL G1.575:
    max: (1.38, 'VOLTAGE DEVIATION', '7MIHL G1.575')
    min: (1.36, 'VOLTAGE DEVIATION', '7MIHL G1.575')
DIFICULT 230:
    max: (1.36, 'VOLTAGE DEVIATION', 'DIFICULT 230')
    min: (1.3533, 'VOLTAGE DEVIATION', 'DIFICULT 230')
DNLP1_G1.575:
    max: (1.3467, 'VOLTAGE DEVIATION', 'DNLP1_G1.575')
    min: (1.3467, 'VOLTAGE DEVIATION', 'DNLP1_G1.575')
DNLP2G23.575:
    max: (1.3467, 'VOLTAGE DEVIATION', 'DNLP2G23.575')
    min: (1.3433, 'VOLTAGE DEVIATION', 'DNLP2G23.575')
FT CRK2 34.5:
    max: (1.4267, 'VOLTAGE DEVIATION', 'FT CRK2 34.5')
    min: (1.4233, 'VOLTAGE DEVIATION', 'FT CRK2 34.5')
HIPLN_G .575:
    max: (1.4833, 'VOLTAGE DEVIATION', 'HIPLN_G .575')
    min: (1.48, 'VOLTAGE DEVIATION', 'HIPLN_G .575')
HORIZ_G .575:
    max: (1.38, 'VOLTAGE DEVIATION', 'HORIZ_G .575')
    min: (1.38, 'VOLTAGE DEVIATION', 'HORIZ_G .575')
MEDBOWCO 115:
    max: (1.38, 'VOLTAGE DEVIATION', 'MEDBOWCO 115')
    min: (1.38, 'VOLTAGE DEVIATION', 'MEDBOWCO 115')
MINERS  34.5:
    max: (1.39, 'VOLTAGE DEVIATION', 'MINERS  34.5')
    min: (1.3867, 'VOLTAGE DEVIATION', 'MINERS  34.5')
STNDPSVC 230:
    max: (1.38, 'VOLTAGE DEVIATION', 'STNDPSVC 230')
    min: (1.38, 'VOLTAGE DEVIATION', 'STNDPSVC 230')

Based on your comment, it sounds like what you actually have is text that looks like tuples. So, to convert it into actual tuples:

import ast

tups=[ast.literal_eval(t) for t in tups]
tups=[(float(a),b,c) for a,b,c in tups]
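
As a small illustration of that conversion (Python 3; the `raw` strings below are made-up stand-ins for what `re.findall` might return, not the asker's actual matches):

```python
import ast

raw = [
    "('1.3433', 'VOLTAGE DEVIATION', 'DNLP2G23.575')",
    "('1.3467', 'VOLTAGE DEVIATION', 'DNLP2G23.575')",
]

# literal_eval safely parses each string into a real tuple of strings.
tups = [ast.literal_eval(t) for t in raw]
# Convert the time field to float so min/max compare numerically.
tups = [(float(a), b, c) for a, b, c in tups]

print(tups[0])
print(type(tups[0][0]).__name__)
```

If `findall` already returns tuples (which happens when the pattern has several capture groups), the `literal_eval` step is unnecessary and only the float conversion is needed.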
answered 2012-09-06T17:08:12.877
0

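This may be overkill if you only have a handful of tuples, but if you have a long list and are able to use external libraries, take a look at pandas. Assuming your tuples are in a variable named tuplelist, the following gives the output you want: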

import pandas
df = pandas.DataFrame.from_records(tuplelist)
df = pandas.concat([df.groupby([1, 2]).min(), 
                df.groupby([1, 2]).max() ])
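# sort_index() stands in for DataFrame.sort(), which newer pandas releases no longer provide
df = df.sort_index().reset_index().reindex(columns = [0,1,2])
print(list(tuple(x) for x in df.values))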
answered 2012-09-06T19:20:50.280