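Here is one possibility based on netaddr, a package that implements sets of IP addresses and networks.
First, note that if A = A1 ∪ A2 and B = B1 ∪ B2, then A ∩ B = (A1 ∩ B1) ∪ (A1 ∩ B2) ∪ (A2 ∩ B1) ∪ (A2 ∩ B2).
So you can break your lists into small sets and use the identity above to compute the intersection piece by piece. For example: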
from netaddr import IPSet
A1 = IPSet(['1.2.3.4','145.2.3.0/24'])
A2 = IPSet(['6.5.0.0/16','3.4.1.0/24'])
B1 = IPSet(['1.5.6.7','10.0.3.0/24'])
B2 = IPSet(['1.2.3.0/24','3.4.0.0/16'])
A1B1 = A1 & B1
A1B2 = A1 & B2
A2B1 = A2 & B1
A2B2 = A2 & B2
A1B1 | A1B2 | A2B1 | A2B2
-> IPSet(['1.2.3.4/32', '3.4.1.0/24'])
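The piecewise approach above generalizes to any number of chunks. Below is a minimal sketch, not part of netaddr: the helper name `chunked_intersection` is hypothetical, and plain `frozenset` objects stand in for the chunks so the example runs without any third-party package. It relies only on the `&` and `|` operators, which `IPSet` also supports as shown in the example above, and it assumes both chunk lists are non-empty.

```python
from functools import reduce
from itertools import product
from operator import or_


def chunked_intersection(a_chunks, b_chunks):
    """Return the union of all pairwise intersections Ai & Bj.

    Hypothetical helper: works for any objects that support & and |.
    Assumes both chunk lists are non-empty.
    """
    pieces = [a & b for a, b in product(a_chunks, b_chunks)]
    return reduce(or_, pieces)


# Stand-in data: small integer sets instead of IP sets.
a_chunks = [frozenset({1, 2, 3}), frozenset({10, 11})]
b_chunks = [frozenset({2, 10}), frozenset({3, 99})]

result = chunked_intersection(a_chunks, b_chunks)

# The piecewise result matches intersecting the full sets directly.
direct = frozenset.union(*a_chunks) & frozenset.union(*b_chunks)
print(sorted(result), result == direct)
```

The number of pairwise intersections grows as the product of the two chunk counts, so this is only worthwhile when the full sets cannot be built in one go.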
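Note, however, that because an IPSet does not need to enumerate every address, you can perform the intersection directly, without breaking the lists into small sets.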
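Update: on a laptop with 4 GB of RAM, intersecting two lists of 5,000 randomly defined networks (prefix lengths from 8 to 24 bits) takes only a few seconds:
Make the two lists of IP addresses: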
import random
# Write 5,000 random networks (prefix length 8 to 24) to each file.
for filename in ('iplist1.txt', 'iplist2.txt'):
    with open(filename, 'w') as f:
        for _ in range(5000):
            ip = '.'.join(str(random.randint(1, 254)) for _ in range(4))
            ip += '/' + str(random.randint(8, 24))
            f.write(ip + '\n')
Intersect them:
import time
import netaddr
# Read each file, stripping the trailing newline from every line.
with open('iplist1.txt') as f:
    ipset1 = netaddr.IPSet([line.strip() for line in f])
with open('iplist2.txt') as f:
    ipset2 = netaddr.IPSet([line.strip() for line in f])
print("Set 1:", len(ipset1), "IP addresses")
print("Set 2:", len(ipset2), "IP addresses")
# Time only the intersection itself, not the file loading.
start = time.time()
ipset = ipset1 & ipset2
print("Elapsed:", time.time() - start)
print("Intersection:", len(ipset), "IP addresses")