7

假设我有一个可能重叠也可能不重叠的 IP 范围列表(仅限最后一个学期):

('1.1.1.1-7', '2.2.2.2-10', '3.3.3.3-3.3.3.3', '1.1.1.4-25', '2.2.2.4-6')

我正在寻找一种方法来识别任何重叠范围并将它们合并到单个范围中。

('1.1.1.1-25', '2.2.2.2-10', '3.3.3.3-3')

当前算法的想法是将所有范围扩展为所有 IP 的列表,消除重复,排序和合并任何连续的 IP。

还有更多的python-esque算法建议吗?

4

6 回答 6

6

这是我的版本,作为一个模块。我的算法与 lunixbochs 在他的回答中提到的一个相同,并且从范围字符串到整数和返回的转换很好地模块化。

import socket, struct

def ip2long(ip):
    packed = socket.inet_aton(ip)
    return struct.unpack("!L", packed)[0]

def long2ip(n):
    unpacked = struct.pack('!L', n)
    return socket.inet_ntoa(unpacked)

def expandrange(rng):
    # expand '1.1.1.1-7' to ['1.1.1.1', '1.1.1.7']
    start, end = [ip.split('.') for ip in rng.split('-')]
    return map('.'.join, (start, start[:len(start) - len(end)] + end))

def compressrange((start, end)):
    # compress ['1.1.1.1', '1.1.1.7'] to '1.1.1.1-7'
    start, end = start.split('.'), end.split('.')
    return '-'.join(map('.'.join,
          (start, end[next((i for i in range(4) if start[i] != end[i]), 3):])))

def strings_to_ints(ranges):
    # turn range strings into list of lists of ints
    return [map(ip2long, rng) for rng in map(expandrange, ranges)]

def ints_to_strings(ranges):
    # turn lists of lists of ints into range strings
    return [compressrange(map(long2ip, rng)) for rng in ranges]

def consolodate(ranges):
    # join overlapping ranges in a sorted iterable
    iranges = iter(ranges)
    startmin, startmax = next(iranges)
    for endmin, endmax in iranges:
        # leave out the '+ 1' if you want to join overlapping ranges
        # but not consecutive ranges.
        if endmin <= (startmax + 1):
            startmax = max(startmax, endmax)
        else:
            yield startmin, startmax
            startmin, startmax = endmin, endmax
    yield startmin, startmax

def convert_consolodate(ranges):
    # convert a list of possibly overlapping ip range strings
    # to a sorted, consolodated list of non-overlapping ip range strings
    return list(ints_to_strings(consolodate(sorted(strings_to_ints(ranges)))))

if __name__ == '__main__':
    ranges = ('1.1.1.1-7',
              '2.2.2.2-10',
              '3.3.3.3-3.3.3.3',
              '1.1.1.4-25',
              '2.2.2.4-6')
    print convert_consolodate(ranges)
    # prints ['1.1.1.1-25', '2.2.2.2-10', '3.3.3.3-3']
于 2012-04-13T18:40:02.623 回答
1

一旦我遇到同样的问题。唯一的区别是我必须有效地将线段保留在列表中。这是为了进行蒙特卡罗模拟。并且必须将新随机生成的线段添加到现有的排序和合并线段中。

我使用lunixbochs的答案将算法调整为您的问题,以将 IP 转换为整数。

此解决方案允许将新 IP 范围添加到已合并范围的现有列表中(而其他解决方案依赖于对要合并的范围列表进行排序,并且不允许将新范围添加到已合并范围列表中)。它在add_range功能上通过使用bisect模块找到插入新IP范围的位置,然后删除冗余IP间隔并插入调整边界的新范围,以便新范围包含所有删除的范围。

import socket
import struct
import bisect


def ip2long(ip):
    '''IP to integer'''
    packed = socket.inet_aton(ip)
    return struct.unpack("!L", packed)[0]


def long2ip(n):
    '''integer to IP'''
    unpacked = struct.pack('!L', n)
    return socket.inet_ntoa(unpacked)


def get_ips(s):
    '''Convert string IP interval to tuple with integer representations of boundary IPs
'1.1.1.1-7' -> (a,b)'''
    s1,s2 = s.split('-')
    if s2.isdigit():
        s2 = s1[:-1] + s2
    return (ip2long(s1),ip2long(s2))


def add_range(iv,R):
    '''add new Range to already merged ranges inplace'''
    left,right = get_ips(R)
    #left,right are left and right boundaries of the Range respectively

    #If this is the very first Range just add it to the list
    if not iv:
        iv.append((left,right))
        return

    #Searching the first interval with left_boundary < left range side
    p = bisect.bisect_right(iv, (left,right)) #place after the needed interval
    p -= 1 #calculating the number of interval basing on the position where the insertion is needed


    #Interval: |----X----| (delete)    
    #Range:   <--<--|----------| (extend)
    #Detect if the left Range side is inside the found interval
    if p >=0: #if p==-1 then there was no interval found
        if iv[p][1]>= right:
            #Detect if the Range is completely inside the interval
            return #drop the Range; I think it will be a very common case

        if iv[p][1] >= left-1:
            left = iv[p][0] #extending the left Range interval
            del iv[p] #deleting the interval from the interval list
            p -= 1 #correcting index to keep the invariant


    #Intervals:   |----X----| |---X---| (delete)    
    #Range:  |-----------------------------|        
    #Deleting all the intervals which are inside the Range interval
    while True:
        p += 1
        if p >= len(iv) or iv[p][0] >= right or iv[p][1] > right:
            'Stopping searching for the intervals which is inside the Range interval'
            #there are no more intervals or
            #the interval is to the right of the right Range side
            # it's the next case (right Range side is inside the interval)
            break 
        del iv[p] #delete the now redundant interval from the interval list
        p -= 1 #correcting index to keep the invariant


    #Interval: |--------X--------| (delete)    
    #Range: |-----------|-->--> (extend)    
    #Working the case when the right Range side is inside the interval
    if p < len(iv) and iv[p][0] <= right-1:
        #there is no condition for right interval side since
        #this case would have already been worked in the previous block
        right = iv[p][1] #extending the right Range side
        del iv[p] #delete the now redundant interval from the interval list
        #No p -= 1, so that p is no pointing to the beginning of the next interval
        #which is the position of insertion


    #Inserting the new interval to the list
    iv.insert(p, (left,right))


def merge_ranges(ranges):
    '''Merge the ranges'''
    iv = []
    for R in ranges:
        add_range(iv,R)
    return ['-'.join((long2ip(left),long2ip(right))) for left,right in iv]

ranges = ('1.1.1.1-7', '2.2.2.2-10', '3.3.3.3-3.3.3.3', '1.1.1.4-25', '2.2.2.4-6')
print(merge_ranges(ranges))

输出:

['1.1.1.1-1.1.1.25', '2.2.2.2-2.2.2.10', '3.3.3.3-3.3.3.3']

这对我来说编码很有趣!谢谢你:)

于 2012-04-13T18:01:10.447 回答
1

将您的范围转换为成对的数字。这些函数将单个 IP 与整数值相互转换。

def ip2long(ip):
    packed = socket.inet_aton(ip)
    return struct.unpack("!L", packed)[0]

def long2ip(n):
    unpacked = struct.pack('!L', n)
    return socket.inet_ntoa(unpacked)

现在您可以将每个范围的边缘排序/合并为数字,然后转换回 IP 以获得良好的表示。这个关于合并时间范围的问题有一个很好的算法。


  1. 1.1.1.1-1.1.1.2将你的字符串解析1.1.1.1-2成一对数字。对于后一种格式,您可以执行以下操作:

    x = '1.1.1.1-2'
    first, add = x.split('-')
    second = first.rsplit('.', 1)[0] + '.' + add
    pair = ip2long(first), ip2long(second)
    
  2. 使用简单的数字比较合并重叠范围。

  3. 转换回字符串表示形式(仍采用后一种格式):

    first, second = pair
    first = long2ip(first) + '-' + long2ip(second).rsplit('.', 1)[1]
    
于 2012-04-13T18:14:43.040 回答
1

netaddr软件包可以满足您的要求。

请参阅使用 python netaddr cidr_merge 汇总相邻子网

https://netaddr.readthedocs.io/en/latest/tutorial_01.html#summarizing-list-of-addresses-and-subnets

于 2019-03-12T12:04:19.450 回答
0

统一你的ips格式,把范围变成一对整数。

现在任务要简单得多——“合并”整数范围。我相信有很多现有的有效算法可以做到这一点,下面只有我天真的尝试:

>>> orig_ranges = [(1,5), (7,12), (2,3), (13,13), (13,17)] # should result in (1,5), (7,12), (13,17)
>>> temp_ranges = {}    
>>> for r in orig_ranges:
        temp_ranges.setdefault(r[0], []).append('+')
        temp_ranges.setdefault(r[1], []).append('-')

>>> start_count = end_count = 0
>>> start = None
>>> for key in temp_ranges:
        if start is None:
            start = key
        start_count += temp_ranges[key].count('+')
        end_count += temp_ranges[key].count('-')
        if start_count == end_count:
            print start, key
            start = None
            start_count = end_count = 0

1 5
7 12
13 17

大体思路如下:在我们将范围一个接一个(在temp_rangesdict 中)之后,我们可以简单地通过计算原始范围的开始和结束来找到新的组合范围;一旦我们得到平等,我们就找到了一个统一的范围。

于 2012-04-13T18:10:11.020 回答
0

我有这些东西,以防你需要它们,但使用 socket/struct 可能是更好的方法

def ip_str_to_int(address):
    """Convert IP address in form X.X.X.X to an int.

    >>> ip_str_to_int('74.125.229.64')
    1249764672

    """
    parts = address.split('.')
    parts.reverse()
    return sum(int(v) * 256 ** i for i, v in enumerate(parts))


def ip_int_to_str(address):
    """Convert IP address int into the form X.X.X.X.

    >>> ip_int_to_str(1249764672)
    '74.125.229.64'

    """
    parts = [(address & 255 << 8 * i) >> 8 * i for i in range(4)]
    parts.reverse()
    return '.'.join(str(x) for x in parts)
于 2012-04-13T19:06:45.017 回答