
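I want to read a file in a more advanced way.

First:

The file contains a list of steps that my code has to follow. How can I read those steps up to the point where the string [data] appears?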

[Steps]
step1 = WebAddress
step2 = Tab
step3 = SecurityType
step4 = Criteria
step5 = Date
step6 = Click1
step7 = Results
step8 = Download
[data]
......

Second: How can I read everything that comes after [data]?

[data]
WebAddress___________________________ Destination___________ Tab_____________ SecurityType___________________________________________________ Criteria___ Date_______ Click1_ Results_ Download    
https://mbsdisclosure.fanniemae.com/  q:\\%s\\raw\\fnmapool  Advanced Search  Interim MBS: Single-Family                                      Issue Date  09/01/2012  Search  100      CSV XML
https://mbsdisclosure.fanniemae.com/  q:\\%s\\raw\\fnmapool  Advanced Search  Preliminary Mega: Fannie Mae/Ginnie Mae backed Adjustable Rate  Issue Date  09/01/2012  Search  100      CSV XML
https://mbsdisclosure.fanniemae.com/  q:\\%s\\raw\\fnmapool  Advanced Search  Preliminary Mega: Fannie Mae/Ginnie Mae backed Fixed Rate       Issue Date  09/01/2012  Search  100      CSV XML

I want to look columns up by step, where each step names the header of a column (for example, WebAddress). For example, if step1 = WebAddress, how do I read everything under WebAddress__________________________, and so on? Thanks!


2 Answers


First:

with open(file_name) as f:
    # Everything before the [data] marker is the steps section.
    steps_text = f.read().split("[data]", 1)[0]
print(steps_text)
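If you would rather stop reading as soon as the marker shows up instead of loading the whole file, a line-by-line approach is an alternative. This is only a minimal sketch, assuming the layout from the question (`name = value` lines between `[Steps]` and `[data]`); `read_steps` and the inline `SAMPLE` text are hypothetical names used for illustration, and in practice you would pass an open file object instead of `io.StringIO`.

```python
import io
from itertools import takewhile

# Hypothetical, shortened stand-in for the file in the question.
SAMPLE = """[Steps]
step1 = WebAddress
step2 = Tab
[data]
WebAddress___ Tab___
"""

def read_steps(lines):
    """Collect 'name = value' pairs until the [data] marker is reached."""
    steps = {}
    # takewhile stops consuming lines at the first one equal to "[data]".
    for line in takewhile(lambda l: l.strip() != "[data]", lines):
        if "=" in line:
            name, value = line.split("=", 1)
            steps[name.strip()] = value.strip()
    return steps

steps = read_steps(io.StringIO(SAMPLE))
print(steps)
```

Lines without an `=` (such as the `[Steps]` header itself) are simply skipped.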

Second:

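with open(file_name) as f:
    # Split once on the marker: text before [data], text after it.
    pre_data, post_data = [s.strip() for s in f.read().split("[data]", 1)]
post_data_lines = post_data.splitlines()
headers = post_data_lines[0].split()
print(headers)
for line in post_data_lines[1:]:
    # Note: split() breaks on every run of whitespace, so values that
    # contain spaces (e.g. "Advanced Search") end up in several pieces.
    print(line.split())
    print(dict(zip(headers, line.split())))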

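I'm also not sure how your [data] section is delimited; use line.split('\t') if it is tab-separated.

This is untested... but it should work. It doesn't get you all the way to where you want to be, but it at least covers most of it (the "hard" part).

To split by header width, use: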

file_name = "testdata.txt"
with open(file_name) as f:
    pre_data, post_data = [s.strip() for s in f.read().split("[data]", 1)]
post_data_lines = post_data.splitlines()
headers = post_data_lines[0].split()

for line in post_data_lines[1:]:
    tmpline = []
    pos = 0
    for itm in headers:
        # Each column is as wide as its header, plus one separating space.
        tmpline.append(line[pos:pos + len(itm)])
        pos += len(itm) + 1

    print(dict(zip(headers, tmpline)))
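The width-based loop assumes exactly one space between header columns. A slightly more forgiving variant, sketched here under the assumption that every value starts in the same character position as its header, takes the column boundaries from where each header token begins. `column_spans`, `parse_row`, and the `HEADER`/`ROW` strings are made up for illustration and are not part of the original data.

```python
import re

# Hypothetical fixed-width header and row, aligned like the [data] section.
HEADER = "Name____ City______ Code"
ROW = "Ann Lee  New York   42"

def column_spans(header):
    """Return (name, start, end) per header token; end is the next token's start."""
    matches = list(re.finditer(r"\S+", header))
    starts = [m.start() for m in matches] + [None]  # None = slice to end of line
    return [(m.group().rstrip("_"), starts[i], starts[i + 1])
            for i, m in enumerate(matches)]

def parse_row(line, spans):
    """Slice one data line by the header spans and trim the padding."""
    return {name: line[start:end].strip() for name, start, end in spans}

spans = column_spans(HEADER)
record = parse_row(ROW, spans)
print(record)
```

Because the slices come from the header positions, values that contain spaces stay in one piece.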

If you want the actual header names without the __, use:

file_name = "testdata.txt"
with open(file_name) as f:
    pre_data, post_data = [s.strip() for s in f.read().split("[data]", 1)]
post_data_lines = post_data.splitlines()
headers = post_data_lines[0].split()
# Cleaned names for the dict keys; the raw headers still give the widths.
headers2 = [s.replace("_", " ").strip() for s in headers]
for line in post_data_lines[1:]:
    tmpline = []
    pos = 0
    for itm in headers:
        tmpline.append(line[pos:pos + len(itm)])
        pos += len(itm) + 1

    print(dict(zip(headers2, tmpline)))
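The last part of the question asks how to get from a step such as step1 = WebAddress to everything under the WebAddress column. One possible sketch is below. It assumes the step values match the header names once the trailing underscores are removed, and that the columns are aligned with their headers. The `SAMPLE` text is a shortened, hypothetical stand-in for the real file, and the example.com/example.org URLs are placeholders.

```python
import re

# Hypothetical, shortened stand-in for the file in the question.
SAMPLE = """[Steps]
step1 = WebAddress
step2 = Tab
[data]
WebAddress_________ Tab_____________
https://example.com Advanced Search
https://example.org Basic Search
"""

pre, post = SAMPLE.split("[data]", 1)

# Step name -> column name, e.g. {"step1": "WebAddress"}.
steps = {}
for line in pre.splitlines():
    if "=" in line:
        name, column = line.split("=", 1)
        steps[name.strip()] = column.strip()

lines = post.strip().splitlines()
header, rows = lines[0], lines[1:]

# Column boundaries come from where each header token starts.
matches = list(re.finditer(r"\S+", header))
starts = [m.start() for m in matches] + [None]

# Column name -> list of values, one per data row.
columns = {}
for i, m in enumerate(matches):
    name = m.group().rstrip("_")
    columns[name] = [row[starts[i]:starts[i + 1]].strip() for row in rows]

# Walk the steps in numeric order and show the column each one refers to.
for step in sorted(steps, key=lambda s: int(s[len("step"):])):
    print(step, steps[step], columns[steps[step]])
```

With this layout, `columns[steps["step1"]]` gives every value under the WebAddress header.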
answered 2012-08-30T17:21:46.613

Step one:

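>>> import configparser
>>> with open('sample.cfg') as f:
...     # Parse only the part before [data]; the table rows are not valid INI lines.
...     steps_text = f.read().split('[data]', 1)[0]
... 
>>> cfg = configparser.RawConfigParser()
>>> cfg.read_string(steps_text)
>>> cfg.get('Steps', 'step1')
'WebAddress'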

Step two:

>>> import csv, io
>>> with open('sample.cfg') as f:
...     data_section = f.read()
... 
>>> # The +1 skips the newline that follows the [data] marker.
>>> data = data_section[data_section.index('[data]') + len('[data]') + 1:]
>>> # Assumes the columns are tab-separated.
>>> reader = csv.reader(io.StringIO(data), delimiter='\t')
>>> header = next(reader)  # skip the header row
>>> results = [row for row in reader]

Now results is a list of lists; each inner list holds the items from one row of the data section.

[['https://mbsdisclosure.fanniemae.com/','q:\\\\%s\\\\raw\\\\fnmapool','Advanced Search', 'Interim MBS: Single-Family', 'Issue Date','09/01/2012','Search','100', 'CSV XML']...]
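If you want each row keyed by column name rather than by position, the header row can be kept and zipped with every row. This is only a sketch and, like the session above, it assumes the data section is tab-separated (the sample in the question may be space-padded instead). `SAMPLE` is a hypothetical one-row stand-in and the example.com URL is a placeholder.

```python
import csv
import io

# Hypothetical tab-separated stand-in for the file in the question.
SAMPLE = (
    "[Steps]\n"
    "step1 = WebAddress\n"
    "[data]\n"
    "WebAddress___\tTab___\n"
    "https://example.com\tAdvanced Search\n"
)

# Keep only the text after the marker and drop the newline that follows it.
data = SAMPLE.split("[data]", 1)[1].lstrip("\n")

reader = csv.reader(io.StringIO(data), delimiter="\t")
# Use the header row for the keys, minus the trailing underscores.
header = [name.rstrip("_") for name in next(reader)]
records = [dict(zip(header, row)) for row in reader]
print(records)
```

A step value such as WebAddress can then be used directly as a key, for example `records[0]["WebAddress"]`.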
answered 2012-08-30T21:05:51.227