
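I am analyzing data in which phone numbers need to be matched to a country and an operator. I received the mapping from phone-number prefixes to countries and destinations (city/operator) in the following form: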

Country, Destination, Country Code, Destination Code, Remarks
AAA, Some Mobile, 111, "12, 23, 34, 46",Some remarks
AAA, Some city A, 111, "55, 56, 57, 51", Some more remarks
BBB, Some city B, 222, "234, 345, 456", Other remarks

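This is dummy data, but the real data has the same form. The "Destination Code" column holds many values per row, so I want to convert this file into a form suitable for use in a database.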

What I came up with is converting it into the form shown below:

Country, Destination, Combined Code, Remarks
AAA, Some Mobile, 11112, Some remarks
AAA, Some Mobile, 11123, Some remarks
AAA, Some Mobile, 11134, Some remarks
AAA, Some Mobile, 11146, Some remarks
etc..

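This would let me build a simpler mapping table. What is the best way to process this kind of data? How would I write code for this conversion in a bash shell script or in Python?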


1 Answer

>>> data = [['Country', 'Destination', 'Country Code', 'Destination Code', 'Remarks'],
... ['AAA', 'Some Mobile', '111', '12, 23, 34, 46','Some remarks'],
... ['AAA', 'Some city A', '111', '55, 56, 57, 51', 'Some more remarks'],
... ['BBB', 'Some city B', '222', '234, 345, 456', 'Other remarks']]
>>> 
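>>> # Header for the reshaped table: the two code columns merge into one.
>>> op = [['Country', 'Destination', 'Combined Code', 'Remarks']]
>>> for row in data[1:]:
...     # Copy the row without the destination codes so `data` is not mutated.
...     rest = row[:3] + row[4:]
...     for code in row[3].split(','):
...         # Append each destination code to the country code (index 2).
...         op.append([v + code.strip() if n == 2 else v for n, v in enumerate(rest)])
...

>>> for row in op:
...     print(row)
...
['Country', 'Destination', 'Combined Code', 'Remarks']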
['AAA', 'Some Mobile', '11112', 'Some remarks']
['AAA', 'Some Mobile', '11123', 'Some remarks']
['AAA', 'Some Mobile', '11134', 'Some remarks']
['AAA', 'Some Mobile', '11146', 'Some remarks']
['AAA', 'Some city A', '11155', 'Some more remarks']
['AAA', 'Some city A', '11156', 'Some more remarks']
['AAA', 'Some city A', '11157', 'Some more remarks']
['AAA', 'Some city A', '11151', 'Some more remarks']
['BBB', 'Some city B', '222234', 'Other remarks']
['BBB', 'Some city B', '222345', 'Other remarks']
['BBB', 'Some city B', '222456', 'Other remarks']

A solution for your updated question:

>>> data = [['Country', 'Destination', 'Country Code', 'Destination Code', 'Remarks'],
...  ['AAA', 'Some Mobile', '111', '12, 23, 34, 46','Some remarks'],
...  ['AAA', 'Some city A', '111', '55, 56, 57, 51', 'Some more remarks'],
...  ['BBB', 'Some city B', '222', '234, 345, 456', 'Other remarks']]
>>>  
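>>> op = [data[0]]
>>> for row in data[1:]:
...     for code in row[3].split(','):
...         new_row = row[:]  # copy so the original row is left untouched
...         # Replace the code list with country code + one destination code.
...         new_row[3] = row[2] + code.strip()
...         op.append(new_row)
...
>>> for row in op:
...     print(row)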
... 
['Country', 'Destination', 'Country Code', 'Destination Code', 'Remarks']
['AAA', 'Some Mobile', '111', '11112', 'Some remarks']
['AAA', 'Some Mobile', '111', '11123', 'Some remarks']
['AAA', 'Some Mobile', '111', '11134', 'Some remarks']
['AAA', 'Some Mobile', '111', '11146', 'Some remarks']
['AAA', 'Some city A', '111', '11155', 'Some more remarks']
['AAA', 'Some city A', '111', '11156', 'Some more remarks']
['AAA', 'Some city A', '111', '11157', 'Some more remarks']
['AAA', 'Some city A', '111', '11151', 'Some more remarks']
['BBB', 'Some city B', '222', '222234', 'Other remarks']
['BBB', 'Some city B', '222', '222345', 'Other remarks']
['BBB', 'Some city B', '222', '222456', 'Other remarks']
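
Since the stated goal is to match phone numbers to a country and destination, here is a rough sketch of how rows shaped like the ones above could be used as a lookup table. It assumes a number should be matched against the longest combined code it starts with; the helper names `build_prefix_table` and `lookup` and the test numbers are made up for illustration.

```python
def build_prefix_table(rows):
    """Map each combined code (column 3) to (country, destination)."""
    return {row[3]: (row[0], row[1]) for row in rows}

def lookup(number, table):
    """Return the entry for the longest prefix of `number` in `table`, or None."""
    # Try the whole number first, then shorter and shorter prefixes.
    for end in range(len(number), 0, -1):
        match = table.get(number[:end])
        if match is not None:
            return match
    return None

# A few data rows in the same shape as the output above (header omitted).
rows = [
    ['AAA', 'Some Mobile', '111', '11112', 'Some remarks'],
    ['AAA', 'Some city A', '111', '11155', 'Some more remarks'],
    ['BBB', 'Some city B', '222', '222234', 'Other remarks'],
]
table = build_prefix_table(rows)
print(lookup('2222345550100', table))  # ('BBB', 'Some city B')
```

A dictionary keeps the sketch short; in a database the same idea would be a table keyed on the combined code.
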
answered 2014-11-09T10:25:15.093