1

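I have some configuration data in the following format. What is the best way to parse this data in Python? I looked at the csv module and, briefly, at this module, but couldn't figure out how to use either. The existing parser is a hack written in Perl.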

|------------+------+--------|
| ColHead1 | Col_______Head2 | CH4 |
|------------+------+--------|
| abcdefg000 | * | somev1 |
| abcdefg001 | * | somev2 |
| abcdefg002 | * | |
| abcdefg003 | * | |
| abcdefg004 | * | |
| abcdefg005 | * | |
| abcdefg006 | * | |
| abcdefg007 | * | |
| abcdefg008 | * | |
| abcdefg009 | * | |
| abcdefg010 | * | |
|------------+------+--------|

4

2 Answers

2

You could try something like this:

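def parse(ascii_table):
    """Return (header, data) parsed from a pipe-delimited ASCII table."""
    header = []
    data = []
    for line in ascii_table.splitlines():
        line = line.strip()
        # Skip blank lines and separator rows such as |-----+-----|
        if not line or '-+-' in line:
            continue
        # Drop the outer pipes, then split on the inner ones, so that
        # empty cells and cells containing spaces keep their position.
        cells = [cell.strip() for cell in line.strip('|').split('|')]
        if not header:
            header = cells
            continue
        # Pad short rows so every row has one entry per header column.
        cells += [''] * (len(header) - len(cells))
        data.append(cells)
    return header, data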
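
Since the question mentions the csv module: it can read this format as well, once the separator rows are filtered out. The following is only a minimal sketch, assuming Python 3 and that no cell contains a literal '|'. The name parse_with_csv and the shortened sample table are made up for illustration.

```python
import csv

def parse_with_csv(ascii_table):
    """Parse a pipe-delimited ASCII table with csv.reader (hypothetical helper)."""
    # Keep only non-blank lines that are not separator rows like |----+----|
    lines = [ln.strip() for ln in ascii_table.splitlines()]
    lines = [ln for ln in lines if ln and '-+-' not in ln]
    rows = []
    # QUOTE_NONE: treat quote characters inside cells as ordinary text.
    for fields in csv.reader(lines, delimiter='|', quoting=csv.QUOTE_NONE):
        # The leading and trailing '|' yield an empty first and last field.
        rows.append([field.strip() for field in fields[1:-1]])
    if not rows:
        return [], []
    return rows[0], rows[1:]

# Shortened, illustrative version of the table from the question.
sample = """\
|------------+----------+----------|
| ColHead1   | ColHead2 | ColHead3 |
|------------+----------+----------|
| abcdefg000 | *        | somev1   |
| abcdefg001 | *        |          |
|------------+----------+----------|
"""

header, data = parse_with_csv(sample)
print(header)
for row in data:
    print(row)
```

Compared with splitting by hand, this mostly saves little here; it may be worth it if the same code already handles other delimited files through csv.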
answered 2013-10-01T23:47:17.997
1

If it's in a file, here is another (similar) approach:

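with open(filepath) as f:
    for line in f:
        # drop the trailing newline first, otherwise the final '|' is not stripped
        line = line.strip()
        if not line or '-+-' in line or 'Head' in line:
            continue
        # strip '|' off the ends, then split on '|'
        c1, c2, c3 = (cell.strip() for cell in line.strip('|').split('|'))
        print('Col1: {}\tCol2: {}\tCol3: {}'.format(c1, c2, c3))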

Or, for a string variable:

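for line in ascii_table.split('\n'):
    line = line.strip()
    # skip blank lines, separator rows and the header row
    if not line or '-+-' in line or 'Head' in line:
        continue
    c1, c2, c3 = (cell.strip() for cell in line.strip('|').split('|'))
    print('Col1: {}\tCol2: {}\tCol3: {}'.format(c1, c2, c3))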
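
Both loops assume exactly three columns and recognise the header by the substring 'Head'. If the number of columns or the header names can vary, one possible variation is a generator that takes the first non-separator row as the header and yields one dict per data row. This is a sketch under the assumption of Python 3; iter_records and the shortened sample table are hypothetical, and io.StringIO only stands in for an open file.

```python
import io

def iter_records(lines):
    """Yield one dict per data row from an iterable of table lines."""
    header = None
    for line in lines:
        line = line.strip()
        # Skip blank lines and separator rows such as |-----+-----|
        if not line or '-+-' in line:
            continue
        cells = [cell.strip() for cell in line.strip('|').split('|')]
        if header is None:
            # The first remaining row is taken to be the header.
            header = cells
            continue
        # Pad short rows so zip() does not silently drop columns.
        cells += [''] * (len(header) - len(cells))
        yield dict(zip(header, cells))

# Shortened, illustrative version of the table from the question.
sample = """\
|------------+----------+----------|
| ColHead1   | ColHead2 | ColHead3 |
|------------+----------+----------|
| abcdefg000 | *        | somev1   |
| abcdefg001 | *        |          |
|------------+----------+----------|
"""

# Works the same way with an open file object or with str.splitlines().
records = list(iter_records(io.StringIO(sample)))
for record in records:
    print(record)
```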
answered 2013-10-02T00:23:09.497