Could I check: by "before", do you mean immediately before (i.e. t1_a, t2_b, t2_c, t2_d should give just the pair (t1_a, t2_b)), or do you want all pairs where a type1 value occurs anywhere before a type2 one within the same block (i.e. (t1_a, t2_b), (t1_a, t2_c), (t1_a, t2_d) for the previous example)?
In either case, you should be able to do this with a single pass over your list (assuming it is sorted by id, then by start index).
Here's a solution assuming the second option (every pair):
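import itertools
import operator

# Type tags as they appear in the data; adjust these to match your own values.
TYPE1, TYPE2 = 'type1', 'type2'

def find_t1_t2(seq):
    """Find every pair of type1, type2 values where the type1 occurs
    before the type2 within a block with the same id.

    Assumes sequence is ordered by id, then start location.
    Generates a sequence of the combined type1, type2 entries.
    """
    # groupby yields one run of consecutive items per id (item[0]).
    for group, items in itertools.groupby(seq, operator.itemgetter(0)):
        type1s = []  # type1 entries seen so far in this block
        for item in items:
            if item[1] == TYPE1:
                type1s.append(item)
            elif item[1] == TYPE2:
                # Pair this type2 with every earlier type1 in the block.
                for t1 in type1s:
                    yield t1 + item[1:]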
If it's just immediately before, it's even simpler: just keep track of the previous item and yield the tuple whenever it is type1 and the current one is type2.
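A minimal sketch of that simpler variant, assuming the same list-based rows and the 'type1'/'type2' tag values seen in the output below. The name find_adjacent_t1_t2 is made up for illustration:

```python
import itertools
import operator

# Assumed tag values; adjust to match your data.
TYPE1, TYPE2 = 'type1', 'type2'

def find_adjacent_t1_t2(seq):
    """Yield a combined entry only where a type1 item comes
    immediately before a type2 item within the same id block.

    Assumes seq is ordered by id, then start location.
    """
    for _, items in itertools.groupby(seq, operator.itemgetter(0)):
        prev = None  # reset at the start of every block
        for item in items:
            if prev is not None and prev[1] == TYPE1 and item[1] == TYPE2:
                yield prev + item[1:]
            prev = item
```

Resetting prev for each group keeps a type1 at the end of one block from pairing with a type2 at the start of the next.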
Here's an example of using find_t1_t2, and the results returned:
l = [[1, TYPE1, 10, 15],
     [1, TYPE2, 20, 25],  # match with first
     [1, TYPE2, 30, 35],  # match with first (2 total matches)
     [2, TYPE2, 10, 15],  # No match
     [2, TYPE1, 20, 25],
     [2, TYPE1, 30, 35],
     [2, TYPE2, 40, 45],  # Match with previous 2 type1s.
     [2, TYPE1, 50, 55],
     [2, TYPE2, 60, 65],  # Match with 3 previous type1 entries (5 total)
     ]

for x in find_t1_t2(l):
    print(x)
This returns:
[1, 'type1', 10, 15, 'type2', 20, 25]
[1, 'type1', 10, 15, 'type2', 30, 35]
[2, 'type1', 20, 25, 'type2', 40, 45]
[2, 'type1', 30, 35, 'type2', 40, 45]
[2, 'type1', 20, 25, 'type2', 60, 65]
[2, 'type1', 30, 35, 'type2', 60, 65]
[2, 'type1', 50, 55, 'type2', 60, 65]