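I have a CSV file that was created from the HTML export of a Check Point firewall policy. In some cases a single rule is spread across several lines; this happens when a rule has multiple sources, destinations, or services. I need output that describes each rule on exactly one line. It is easy to tell where each rule begins: the first column holds the rule ID, which is a number.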

Here is an example. The strings that should be moved are marked in green:

http://i.imgur.com/i785sDi.jpg

Here is a sample of the data as text:

NO.;NAME;SOURCE;DESTINATION;SERVICE;ACTION;
1;;fwgcluster;mcast_vrrp;vrrp;accept;
;;;;igmp;;
2;Testing;fwgcluster;fwgcluster;FireWall;accept;
;;fwmgmpe;fwmgmpe;ssh;;
;;fwmgm;fwmgm;;;

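What I need, explained in pseudocode, is this:

Read the first column of the next line. If it contains a number, go on to the first column of the line after it. If it does not contain a number, concatenate the strings in this line's columns onto the corresponding columns of the last rule line (with a separator such as a comma) and clear the text in the current line.

The output should look like this: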

NO.;NAME;SOURCE;DESTINATION;SERVICE;ACTION;
1;;fwgcluster;mcast_vrrp;vrrp-igmp;accept;
;;;;;;
2;Testing;fwgcluster-fwmgmpe-fwmgm;fwgcluster-fwmgmpe-fwmgm;FireWall-ssh;accept;
;;;;;;
The empty lines are there only for clarity; I don't actually need them.

Thanks!

2 Answers

This should get you started:

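import csv

# newline='' lets the csv module handle line endings itself (Python 3)
with open('data.txt', 'r', newline='') as f:
    reader = csv.DictReader(f, delimiter=';')
    for r in reader:
        # each r maps a column name from the header row to that row's value
        print(r)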
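To see what that loop yields for the data in the question, here is a small sketch that reads two of the sample rows from an in-memory string instead of `data.txt`. The `SAMPLE` name is only for illustration. Because every line ends with a `;`, the reader also produces an extra column whose key is the empty string.

```python
import csv
import io

# Header plus rule 1 and its continuation row, copied from the question.
SAMPLE = """NO.;NAME;SOURCE;DESTINATION;SERVICE;ACTION;
1;;fwgcluster;mcast_vrrp;vrrp;accept;
;;;;igmp;;
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE), delimiter=';'))
for row in rows:
    # A continuation row has an empty 'NO.' value.
    print(repr(row['NO.']), repr(row['SERVICE']))
```

The empty `'NO.'` value on the second row is what distinguishes a continuation row from the start of a new rule.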

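Edit: Given the output you need, this should get you close. It's a bit rough, but it covers most of what you need. It checks the 'NO.' key: if it has a value, a new record is started. If not, any other data in that row is joined onto the corresponding field of the current record. Finally, whenever a new record is started, the old one is appended to the result; this also happens once at the end to catch the last item.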

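import csv

result, record = [], None
with open('data2.txt', 'r', newline='') as f:
    reader = csv.DictReader(f, delimiter=';')
    for r in reader:
        if r['NO.']:
            # a value in 'NO.' starts a new rule; store the finished one
            if record:
                result.append(record)
            record = r
        elif record:
            # continuation row: merge its non-empty cells into the current rule
            for key, value in r.items():
                if value:
                    # filter(None, ...) avoids a leading '-' when the field was empty
                    record[key] = '-'.join(filter(None, [record[key], value]))
    # store the last rule, which no later row has flushed
    if record:
        result.append(record)

print(result)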
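The snippet above only prints the merged records. A minimal sketch of writing them back out as `;`-separated CSV with `csv.DictWriter` might look like the following. The function name `write_records` and the hand-written `result` list are made up for illustration; `extrasaction='ignore'` is used on the assumption that the empty-named column caused by the trailing `;` should simply be dropped.

```python
import csv
import io

def write_records(records, fieldnames, out_file):
    """Write merged rule records as ';'-delimited CSV (illustrative helper)."""
    writer = csv.DictWriter(out_file, fieldnames=fieldnames, delimiter=';',
                            lineterminator='\n', extrasaction='ignore')
    writer.writeheader()
    writer.writerows(records)

fieldnames = ['NO.', 'NAME', 'SOURCE', 'DESTINATION', 'SERVICE', 'ACTION']
# One record shaped like those collected in `result`, including the '' key.
result = [
    {'NO.': '1', 'NAME': '', 'SOURCE': 'fwgcluster',
     'DESTINATION': 'mcast_vrrp', 'SERVICE': 'vrrp-igmp',
     'ACTION': 'accept', '': ''},
]

buffer = io.StringIO()
write_records(result, fieldnames, buffer)
print(buffer.getvalue())
```

To write to disk instead, pass a file opened with `open('out.csv', 'w', newline='')` in place of the `StringIO` buffer.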
answered 2013-10-10T20:48:01.617

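Graeme, thanks again. Just before you made your edit, I solved it with the code below. But you pointed me in the right direction!

In case anyone needs it, here it is: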

import csv
# adjust these 3 lines
WRITE_EMPTIES = False
INFILE = "input.csv"
OUTFILE = "output.csv"
with open(INFILE, "r", newline="") as in_file:
  r = csv.reader(in_file, delimiter=";")
  # text mode with newline="" is what the csv module expects on Python 3
  with open(OUTFILE, "w", newline="") as out_file:
    previous = None
    empties_to_write = 0
    out_writer = csv.writer(out_file, delimiter=";")
    for row in r:
      if not row: # skip completely blank lines
        continue
      first_val = row[0].strip()
      if first_val: # a value in the first column starts a new rule
        if previous:
          out_writer.writerow(previous)
          if WRITE_EMPTIES and empties_to_write:
            out_writer.writerows(
              [["" for _ in previous]] * empties_to_write
              )
            empties_to_write = 0
        previous = row
      elif previous: # append sub-portions to each other
        previous = [
          "|".join(
            subitem
            for subitem in (existing, new)
            if subitem
            )
          for existing, new in zip(previous, row)
          ]
        empties_to_write += 1
    if previous: # take care of the last row
      out_writer.writerow(previous)
      if WRITE_EMPTIES and empties_to_write:
        out_writer.writerows(
          [["" for _ in previous]] * empties_to_write
          )
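For a quick check of the same merging idea without touching any files, here is a minimal sketch written as a function over rows held in memory. `merge_rules` and its `sep` parameter are names made up for this example, and it joins with `-` rather than `|` so that the result can be compared with the output requested in the question.

```python
import csv
import io

def merge_rules(rows, sep="-"):
    """Fold continuation rows (empty first column) into the preceding row."""
    merged = []
    for row in rows:
        if not row:
            continue  # skip completely blank lines
        if row[0].strip() or not merged:
            # A value in the first column starts a new output row.
            merged.append(list(row))
        else:
            # Join each non-empty cell onto the matching cell of the last row.
            merged[-1] = [
                sep.join(part for part in (old, new) if part)
                for old, new in zip(merged[-1], row)
            ]
    return merged

SAMPLE = """NO.;NAME;SOURCE;DESTINATION;SERVICE;ACTION;
1;;fwgcluster;mcast_vrrp;vrrp;accept;
;;;;igmp;;
2;Testing;fwgcluster;fwgcluster;FireWall;accept;
;;fwmgmpe;fwmgmpe;ssh;;
;;fwmgm;fwmgm;;;
"""

rows = csv.reader(io.StringIO(SAMPLE), delimiter=";")
for rule in merge_rules(rows):
    print(";".join(rule))
```

On the sample from the question this prints the header followed by one line per rule, matching the requested output minus the empty spacer lines.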
answered 2013-10-13T00:13:05.713