I have a text file in which I list instructions (e.g. go to a website, click on a link). What I want to do is carry out each action in [data] according to the steps in [Steps]. I already have a mechanism to parse the file, but I can't work out how to perform each action based on each step.

File parsing:

file_name = "testdata.txt"
with open(file_name) as f:
    pre_data, post_data = [s.strip() for s in f.read().split("[data]")]
post_data_lines = post_data.splitlines()
headers = post_data_lines[0].split()
headers2 = [s.replace("_"," ").strip() for s in headers]
for line in post_data_lines[1:]:
    tmpline  = []
    pos = 0
    for itm in headers:
        tmpline.append(line[pos:pos+len(itm)])
        pos += len(itm)+1

    print dict(zip(headers2,tmpline))

This is what the text file looks like:

[Steps]
step1 = WebAddress
step2 = Tab
step3 = SecurityType
step4 = Criteria
step5 = Date
Step6 = Click1
step7 = Results
step8 = Download
[data]
WebAddress___________________________ Destination___________ Tab_____________ SecurityType___________________________________________________ Criteria___ Date_______ Click1_ Results_ Download    
https://mbsdisclosure.fanniemae.com/  q:\\%s\\raw\\fnmapool  Advanced Search  Interim MBS: Single-Family                                      Issue Date  09/01/2012  Search  100      CSV XML
https://mbsdisclosure.fanniemae.com/  q:\\%s\\raw\\fnmapool  Advanced Search  Preliminary Mega: Fannie Mae/Ginnie Mae backed Adjustable Rate  Issue Date  09/01/2012  Search  100      CSV XML
https://mbsdisclosure.fanniemae.com/  q:\\%s\\raw\\fnmapool  Advanced Search  Preliminary Mega: Fannie Mae/Ginnie Mae backed Fixed Rate       Issue Date  09/01/2012  Search  100      CSV XML

2 Answers

Everything below is pseudocode...

Is this what you want to do?

class StairMaster(object):
    def step1(self,data):
        pass
    def step2(self,data):
        pass

for line in testdata_dicts:
    dispatcher = StairMaster()
    for i in range(1,8):
        step = "step%s" % i
        data = line[step]
        result = getattr( dispatcher , step )( data )

Or is this what you want to do:

instructions = dict( 'what the steps are' )
for line in testdata_dicts:
    for i in range(1,MAX_STEPS):
        step = "step%s" % i 
        if instructions[step] == 'DOWNLOAD':
            download()
        elif instructions[step] == 'UPLOAD':
            upload()
        ...
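
A runnable take on the second idea might look like the sketch below. It is written for Python 3 and rests on several assumptions: the [Steps] section is parsed into an ordered list of column names, each data row is a dict keyed by those column names, and the handlers (`open_page`, `click`, `enter_value`) are hypothetical placeholders that only record what they would do instead of driving a browser.

```python
# Hypothetical handlers: each one only records the action it would perform.
def open_page(value, log):
    log.append("open %s" % value)

def click(value, log):
    log.append("click %s" % value)

def enter_value(value, log):
    log.append("enter %s" % value)

# Assumed mapping from a column name in [Steps] to the handler for it.
HANDLERS = {"WebAddress": open_page, "Tab": click, "Click1": click}

def parse_steps(text):
    """Return the names from 'stepN = Name' lines, ordered by N."""
    steps = []
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        key = key.strip().lower()  # the sample file mixes 'step' and 'Step'
        if sep and key.startswith("step"):
            steps.append((int(key[4:]), value.strip()))
    return [name for _, name in sorted(steps)]

def run_row(step_names, row):
    """Apply the handler for each step, in order, to one data row."""
    log = []
    for name in step_names:
        handler = HANDLERS.get(name, enter_value)  # default: type the value
        handler(row[name], log)
    return log

steps_text = "[Steps]\nstep1 = WebAddress\nstep2 = Tab\nStep6 = Click1"
row = {
    "WebAddress": "https://mbsdisclosure.fanniemae.com/",
    "Tab": "Advanced Search",
    "Click1": "Search",
}
actions = run_row(parse_steps(steps_text), row)
print(actions)
```

Keeping the mapping in a dict means a new step type only needs a new entry rather than another `elif` branch.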
answered 2012-08-31T14:29:31.020

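You didn't say how your code breaks; that would help.

I tried to see what your code does and noticed that the columns are read 3 positions off. This is caused by the three \\ sequences in each line: in the text I parsed, each one is a single \ (see the output below), so every data line is three characters shorter than the headers expect. However, I don't know whether the input text you posted is accurate and whether that is really the text you end up with. Let me know.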

...
header_names = [s.replace("_"," ").strip() for s in headers]
for line in post_data_lines[1:]:
    columns  = []
    column_start = 0
    print 'line %r.' % line
    for header in headers:
        column = line[column_start:column_start+len(header)]
        print 'read column %r.' % column
        print 'used header %r.' % header
        columns.append('%s' % column)
        column_start += len(header) + 1
    break

This outputs:

https://mbsdisclosure.fanniemae.com/  q:\%s\raw\fnmapool  Advanced Search  Interim MBS: Single-Family                                      Issue Date  09/01/2012  Search  100      CSV XML
read column 'https://mbsdisclosure.fanniemae.com/ '.
used header 'WebAddress___________________________'.
read column 'q:\\%s\\raw\\fnmapool  Ad'.
used header 'Destination___________'.
read column 'anced Search  In'.
used header 'Tab_____________'.
read column 'erim MBS: Single-Family                                      Is'.
used header 'SecurityType___________________________________________________'.
read column 'ue Date  09'.
used header 'Criteria___'.
read column '01/2012  Se'.
used header 'Date_______'.
read column 'rch  10'.
used header 'Click1_'.
read column '      CS'.
used header 'Results_'.
read column ' XML'.
used header 'Download'.

You can see that from the second column onward, every read is off by 3 characters.
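
The three-character shift can be checked with a quick length comparison. This is only a sketch in Python 3; it assumes the Destination value has doubled backslashes in the posted file and single backslashes in the text that was actually parsed, and the variable names are made up for illustration.

```python
# Destination value as posted in the question (two backslashes per separator).
doubled = r"q:\\%s\\raw\\fnmapool"
# The same value as it appears in the debug output (one backslash per separator).
single = r"q:\%s\raw\fnmapool"

# Three separators per row, so the row shrinks by three characters.
shift = len(doubled) - len(single)
print(len(doubled), len(single), shift)  # prints: 21 18 3
```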

The fix is to shorten the Destination header in the file you write by three underscores:

Destination___________
Destination________
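
To see the effect of the shorter header, here is a cut-down sketch in Python 3 using only the first three columns. It assumes the row contains single backslashes, as in the debug output above, and that the column widths (37, 22 or 19, 16) match the question's headers; `parse_fixed_width` and `make_header` are hypothetical helper names.

```python
def parse_fixed_width(header_line, row_line):
    """Slice row_line into columns whose widths come from the headers."""
    headers = header_line.split()
    names = [h.replace("_", " ").strip() for h in headers]
    values = []
    start = 0
    for header in headers:
        values.append(row_line[start:start + len(header)].strip())
        start += len(header) + 1  # skip the single space between headers
    return dict(zip(names, values))


def make_header(destination_width):
    # Pad each name with underscores, as in the question's header line.
    cols = [("WebAddress", 37), ("Destination", destination_width), ("Tab", 16)]
    return " ".join(name.ljust(width, "_") for name, width in cols)


# Row with single backslashes; each value is followed by two spaces.
row = (
    "https://mbsdisclosure.fanniemae.com/".ljust(38)
    + r"q:\%s\raw\fnmapool".ljust(20)
    + "Advanced Search".ljust(17)
)

before = parse_fixed_width(make_header(22), row)  # original header width
after = parse_fixed_width(make_header(19), row)   # three underscores shorter

print(before["Tab"])  # anced Search
print(after["Tab"])   # Advanced Search
```

If the file really does contain doubled backslashes, the original 22-character header is already the right width and no change is needed.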
answered 2012-09-01T00:27:13.550