python - Extracting data from a text file using Python

Question

I have a large text file. It contains a lot of records in the following format:

|NAME|NUMBER(1)|AST|TYPE(0)|TYPE|NUMBER(2)||NUMBER(3)|NUMBER(4)|DESCRIPTION|

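Sorry for being vague. Every record uses the format above, with the delimiter "|" between the fields. I want to be able to search the file for "NAME" and print each field under its own label, as in this example: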

Name
Number(1):
AST:
TYPE(0):
etc....

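In case that is still unclear: I want to be able to search for a name and then print out each piece of information separated by "|".

Can anyone help?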

Edit: here is a sample of part of the text file:

|Trevor Jones|70|AST|White|Earth|3||500|1500|Old man living in a retirement home|

This is the code I have so far:

with open('LARGE.TXT') as fd:
    name='Trevor Jones'
    input=[x.split('|') for x in fd.readlines()]
    to_search={x[0]:x for x in input}
    print('\n'.join(to_search[name]))

score 2 · Accepted Answer

Something like

# Open the file so that it is closed automatically afterwards
with open('large_text_file') as fd:
    # Read the file and split each non-blank line into fields;
    # strip() drops the trailing newline so that strip('|') can
    # remove the outer pipes
    rows = [line.strip().strip('|').split('|') for line in fd if line.strip()]
    # Turn the rows into a searchable dictionary keyed by name
    to_search = {row[0]: row for row in rows}

Then search with

to_search[NAME]

Depending on the format you want the answer in, use either

print(' '.join(to_search[NAME]))

or

print('\n'.join(to_search[NAME]))
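
To get the labelled layout the question asks for, the fields can be paired with their labels. This is only a minimal sketch. It assumes the column order from the question's format line, and the `LABELS` list, the `format_record` helper, and the sample record are illustrative names and data, not part of the answer above:

```python
# Labels follow the column order in the question's format line (an assumption);
# the empty seventh column gets a placeholder label.
LABELS = ["NAME", "NUMBER(1)", "AST", "TYPE(0)", "TYPE",
          "NUMBER(2)", "(empty)", "NUMBER(3)", "NUMBER(4)", "DESCRIPTION"]


def format_record(fields):
    """Return one 'LABEL: value' line per field."""
    return "\n".join(f"{label}: {value}" for label, value in zip(LABELS, fields))


# Sample record modelled on the one in the question
line = "|Trevor Jones|70|AST|White|Earth|3||500|1500|Old man living in a retirement home|\n"
fields = line.strip().strip("|").split("|")
print(format_record(fields))
```

Because `zip` stops at the shorter of its inputs, a record with fewer fields than labels is simply printed with fewer lines rather than raising an error.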

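A word of caution: this solution assumes the names are unique. If they are not, later records overwrite earlier ones in the dictionary, and a more complex solution may be needed.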
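
If names can repeat, one possible approach is to keep a list of records per name instead of a single record. This is a sketch only; `build_index` and the sample lines are hypothetical and not taken from the question's file:

```python
from collections import defaultdict


def build_index(lines):
    """Map each name to a list of all records that carry that name."""
    index = defaultdict(list)
    for line in lines:
        fields = line.strip().strip("|").split("|")
        if fields[0]:  # skip blank lines
            index[fields[0]].append(fields)
    return index


# Hypothetical sample data with a repeated name
sample = [
    "|Trevor Jones|70|AST|\n",
    "|Ann Lee|34|AST|\n",
    "|Trevor Jones|25|AST|\n",
]
index = build_index(sample)
for record in index["Trevor Jones"]:
    print(" ".join(record))
```

Note that indexing a `defaultdict` with a missing name creates an empty entry; `index.get(name)` avoids that and returns `None` instead.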

score 2 · Accepted Answer

First, you need to break the file up in some way. I think a dictionary is the best option here. Then you can get whatever you need from it.

d = {}
# `fl` is our file object
with open('LARGE.TXT') as fl:
    for line in fl:
        # Drop the trailing newline and the outer pipes, then split
        detached = line.strip().strip('|').split('|')
        # May wish to process here
        d[detached[0]] = detached[1:]
# Can do whatever with this information now
print(d.get('string_to_search'))
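
If only a single lookup is needed, building a dictionary for the whole file may be unnecessary. As a rough sketch, the lines can be scanned one at a time, stopping at the first match. The `find_record` helper and the sample data are hypothetical; it accepts any iterable of lines, so an open file object can be passed in directly:

```python
def find_record(lines, name):
    """Return the fields of the first record whose name matches, else None."""
    for line in lines:
        fields = line.strip().strip("|").split("|")
        if fields[0] == name:
            return fields
    return None


# Hypothetical in-memory sample; with a real file this could be
#   with open('LARGE.TXT') as fl:
#       record = find_record(fl, 'Trevor Jones')
sample = ["|Ann Lee|34|AST|\n", "|Trevor Jones|70|AST|\n"]
print(find_record(sample, "Trevor Jones"))
print(find_record(sample, "Nobody"))
```

This trades repeated-lookup speed for lower memory use, so the dictionary approach is still preferable when many names are searched.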
