python - 如何在python中随机合并两个列表的两个元素并确保结果条目都是唯一的？

Question

我有两个列表，A 和 B，具有相同数量的元素，尽管每个列表中的元素不一定不同。

我想通过随机耦合来自 A 和 B 的元素来形成一个新列表（随机配对很重要）。

但是，我还需要确保结果列表中的每一对都是唯一的。

到目前为止，我一直在解决以下问题，它适用于小型列表，但显然不适合具有多种组合的大型列表。

from random import shuffle

# Create a list of actors and events for testing
events = ['P1','P1','P1','P2','P2','P2','P3','P3','P3','P4','P5','P6','P7','P7']
actors = ['IE','IE','ID','ID','IA','IA','IA','IC','IB','IF','IG','IH','IH','IA']

# Randomize the elements of each list
shuffle(events)
shuffle(actors)

# Merge the two lists into a new list of pairs
edgelist = zip(events,actors)

# If the new list of pairs has all unique elements, then it is a good solution, otherwise try again at random
x = set(edgelist)

if len(edgelist) == len(x):
  break
else:
  while True:
    shuffle(events)
    shuffle(actors)
    edgelist = zip(events,actors)
    x = set(edgelist)
    if len(edgelist) == len(x):
      break

# Display the solution
print 'Solution obtained: '
for item in edgelist:
  print item

任何人都可以提出一种修改或替代方法来扩展到更大的输入列表吗？

感谢您的帮助。

更新

事实证明，这是一个比最初想象的更具挑战性的问题。我想我现在有一个解决方案。它可能无法很好地扩展，但适用于中小型列表。它会在开始之前检查解决方案是否可行，因此不需要对输入列表的分布进行假设。我还包含了几行代码来显示结果列表的频率分布与原始列表匹配。

# Randomize the elements
shuffle(events)

# Make sure a solution is possible
combinations = len(set(events))*len(set(actors))
assert combinations >= len(events) and combinations >= len(actors) and len(events) == len(actors), 'No soluton possible!'

# Merge the two lists into a new list of pairs (this will contain duplicates)
edgelist = zip(events,actors)

# Search for duplicates
counts = collections.Counter(edgelist)
duplicates = [i for i in counts if counts[i] > 1]
duplicate_count = len(duplicates)

while duplicate_count > 0:

  # Get a single duplicate to address
  duplicate = duplicates[0]

  # Find the position of the duplicate in the in edgelist
  duplicate_pos = edgelist.index(duplicate)

  # Search for a replacement
  swap = choice(edgelist)
  swap_pos = edgelist.index(swap)

  if (swap[0],duplicate[1]) not in edgelist: 
    edgelist[duplicate_pos] = (swap[0],duplicate[1])
    edgelist[swap_pos] = (duplicate[0],swap[1])

  # Update duplicate count
  counts = collections.Counter(edgelist)
  duplicates = [i for i in counts if counts[i] > 1]
  duplicate_count = len(duplicates)


# Verify resulting edgelist and frequency distributions

print 'Event Frequencies: '
print sorted([y for (x,y) in list(collections.Counter(events).items())], reverse=True)

print 'Edgelist Event Frequencies: '        
print sorted([y for (x,y) in list(collections.Counter([x for (x,y) in edgelist]).items())], reverse=True)

print 'Actor Frequencies: '        
print sorted([y for (x,y) in list(collections.Counter(actors).items())], reverse=True)

print 'Edgelist Actor Frequencies: '        
print sorted([y for (x,y) in list(collections.Counter([y for (x,y) in edgelist]).items())], reverse=True)

assert len(set(edgelist)) == len(events) == len(actors)

score 1 · Accepted Answer

好吧，您没有理由对这两个列表进行洗牌。配对不会变得“更随机”。

更新：我发布了与原始解决方案不同的解决方案。它是递归的，并且保证总是返回一个有效的答案，或者如果不可能，则返回 None。

from random import shuffle

def merge(events, actors, index=None):
    """Returns list of unique pairs of events and actors, none of which may on index"""
    if len(events) == 0:
        return []
    if index is None:
        index = set()

    merged = None
    tried_pairs = set()
    for event in events:
        for i, actor in enumerate(actors):
            pair = (event, actor)
            if pair not in index and pair not in tried_pairs:
                new_index = index.union([pair])
                rest = merge(events[1:], actors[:i]+actors[i+1:], new_index)

                if rest is not None:
                    # Found! Done.
                    merged = [pair] + rest
                    break
                else:
                    tried_pairs.add(pair)
        if merged is not None:
            break

    return merged


if __name__ == '__main__':
    _events = ['P1','P1','P1','P2','P2','P2','P3','P3','P3','P4','P5','P6','P7','P7']
    _actors = ['IE','IE','ID','ID','IA','IA','IA','IC','IB','IF','IG','IH','IH','IA']
    shuffle(_events)
    edgelist = merge(_events, _actors)

    if edgelist is not None:
        print 'Solution obtained: '
        for item in edgelist:
            print item
    else:
        print 'No possible solution with these events and actors.'

注意：在 OP 的解决方案中，“index”变量类似于检查 edgelist。“tried_pairs”变量只是对每个特定递归步骤的优化，以避免一遍又一遍地重试相同的对（例如，如果参与者中有几个连续的相同项目）。

score 0 · Accepted Answer

这似乎工作正常，但是我不完全确定它的扩展性如何......

# Create tuples of lists of actors and events for testing
testcases = [
    (['P1','P1','P1','P2','P2','P2','P3','P3','P3','P4','P5','P6','P7','P7'],
     ['IE','IE','ID','ID','IA','IA','IA','IC','IB','IF','IG','IH','IH','IA']),

    (['P1','P1','P1','P1','P1','P1','P1','P2','P2','P2','P2','P2','P2','P2'],
     ['IE','IE','IE','IE','IE','IE','IE','ID','ID','ID','ID','ID','ID','ID']),

    (['P1','P2','P3','P4','P5','P6','P7','P8','P9','PA','PB','PC','PD','PE'],
     ['IE','IE','IE','IE','IE','IE','IE','ID','ID','ID','ID','ID','ID','ID']),
]

events, actors = testcases[0]

import random

def random_choice(items):
    """ yield list items in random order """
    items = items[:]  # preserve input argument
    random.shuffle(items)
    while items:
        yield items.pop()

pairs = set()
for event in random_choice(events):
    for index in random_choice(range(len(actors))):
        pair = (event, actors[index])
        if pair not in pairs:  # unique?
            pairs.add(pair)
            actors.pop(index)
            break

# Display the solution (in sorted order)
print 'Solution obtained has %d unique pairs: ' % len(pairs)
for item in sorted(list(pairs)):
    print item

我使用了问题的答案中的想法如何从列表中选择随机元素并将其删除？.

score -1 · Accepted Answer

# Create a list of actors and events for testing
events = ['P1','P1','P1','P2','P2','P2','P3','P3','P3','P4','P5','P6','P7','P7']
actors = ['IE','IE','ID','ID','IA','IA','IA','IC','IB','IF','IG','IH','IH','IA']

# Randomize the elements
shuffle(events)

# Merge the two lists into a new list of pairs
edgelist = zip(events,actors)

# remove duplicates
x = set(edgelist)

# If there were not enough unique elements : add new ones as needed.
while len(x)<len(edgelist):
    x.add((choice(events), choice(actor)))

# Display the solution
print 'Solution obtained: '
for item in x:
  print item

python - 如何在python中随机合并两个列表的两个元素并确保结果条目都是唯一的？

3 回答 3

Related

Reference