I have two files.

File 1:

IdName1 Info1 Info2 Info3   #Info: from program1 for name1  #Info: from program2 for name1
IdName2 Info1 Info2 Info3   #Info: from program1 for name2  #Info: from program2 for name2
IdName4 Info1 Info2 Info3   #Info: from program1 for name4  
IdName3 Info1 Info2 Info3   #Info: from program1 for name3  #Info: from program2 for name3

File 2:

# ProgramInfo
# Query: IdName1 Info1 Info2 Info3
# DatabaseInfo
# FiledInfo
line1
line2
# ProgramInfo
# Query: IdName2 Info1 Info2 Info3
# DatabaseInfo
# FiledInfo
# ProgramInfo
# Query: IdName4 Info1 Info2 Info3
# DatabaseInfo
# FiledInfo
line1
line2
line3
line4

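Now I need to check whether "IdName1 Info1 Info2 Info3" from File1 appears after "# Query:" in File2. If it does, I need to split the information on that line of File1 and insert it into File2 before the corresponding "# ProgramInfo" line. The output file should look like this:

Output file: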

# IdName1 Info1 Info2 Info3
# Info: from program1 for name1 
# Info: from program2 for name1
# ProgramInfo
# Query: IdName1 Info1 Info2 Info3
# DatabaseInfo
# FiledInfo
line1
line2
# IdName2 Info1 Info2 Info3 
# Info: from program1 for name2 
# Info: from program2 for name2
# ProgramInfo
# Query: IdName2 Info1 Info2 Info3
# DatabaseInfo
# FiledInfo
# IdName4 Info1 Info2 Info3 
# Info: from program1 for name4 
# ProgramInfo
# Query: IdName4 Info1 Info2 Info3
# DatabaseInfo
# FiledInfo
line1
line2
line3
line4

My problem now is how to add the corresponding three lines to File2. I have been trying something like this:

import sys
def programs_info_comb(fileName1, fileName2):
    my_file1 = open(fileName1, "r")
    my_line1=my_file1.readlines()
    my_file2 = open(fileName2, "r")
    my_line2=my_file2.readlines() 
    for line1 in my_line1: 
        (name1, info1, info2)= line1.strip().split("\t")
        for line2 in my_line2:
            if line2.startswith("# Q"):
                name2 = line2[9:-1]
                if name1 == name2:
                    #### here Im lost how to tell where I want those next two lines to be printed
                    print "#"+" "+name1
                    print info1
                    print info2
    my_file1.close
    my_file2.close
if __name__== "__main__":
    programs_info_comb(sys.argv[1], sys.argv[2])

There may be a better, simpler way. Any help would be greatly appreciated. Thanks for your time, Daeja

2 Answers

There may be better ways to do this, but here is one that works:

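def programs_info_comb(f1, f2):
    with open(f1, "r") as tmp:
        file1 = tmp.readlines()

    with open(f2, "r") as tmp:
        file2 = tmp.readlines()

    for entry in file1:
        # Text before the first "#" is the query; the rest are the "Info: ..." fields.
        parts = [p.strip() for p in entry.split("#")]
        content = parts[0]
        if not content:
            continue  # skip blank lines in file1
        new_lines = ["# %s\n" % content] + ["# %s\n" % p for p in parts[1:] if p]
        for i, line in enumerate(file2):
            if line.strip() == "# Query: %s" % content:
                # "# ProgramInfo" sits directly above the query line, so insert there.
                pos = max(i - 1, 0)
                file2[pos:pos] = new_lines
                break  # stop right after modifying the list being iterated

    with open("output.txt", "w") as tmp:
        tmp.writelines(file2)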

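You should also look into the databases mentioned in the comments section.

In some cases the two nested for loops can also be avoided. Still, this is a perfectly valid approach and it does what you asked.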
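As a rough illustration of avoiding the nested loops, here is a minimal single-pass sketch. It assumes that the query text in File1 never contains a "#" and that each "# Query:" line follows its "# ProgramInfo" line, as in the sample. The names merge_info, info_for and pending are made up for this example and do not come from the question.

```python
def merge_info(file1_lines, file2_lines):
    """Return File2's lines with File1's info inserted before each matching block."""
    # Map "IdName1 Info1 Info2 Info3" -> ["Info: from program1 ...", ...]
    # (assumption: "#" only ever starts an info field in File1)
    info_for = {}
    for entry in file1_lines:
        parts = [p.strip() for p in entry.split("#")]
        if parts[0]:
            info_for[parts[0]] = [p for p in parts[1:] if p]

    out = []
    pending = None  # a "# ProgramInfo" line held back until its query line is seen
    for line in file2_lines:
        if line.startswith("# ProgramInfo"):
            if pending is not None:
                out.append(pending)
            pending = line
            continue
        if line.startswith("# Query:"):
            query = line[len("# Query:"):].strip()
            if query in info_for:
                # Header and info lines go in front of the held-back "# ProgramInfo".
                out.append("# %s\n" % query)
                out.extend("# %s\n" % info for info in info_for[query])
        if pending is not None:
            out.append(pending)
            pending = None
        out.append(line)
    if pending is not None:
        out.append(pending)
    return out
```

Each file is read once and every lookup is a dictionary access, so the work grows with the combined size of the two files rather than with their product. The returned list can be passed to writelines() as in the code above.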

answered 2012-12-12T12:56:22.077
import sys
def programs_info_comb(fileName1, fileName2):
    my_file1 = open(fileName1, "r")
    my_line1 = my_file1.readlines()
    my_file1.close()

    my_file2 = open(fileName2, "r")
    my_line2 = my_file2.readlines()
    my_file2.close()

    # load file1 into a dict for lookup later
    infoFor = dict()
    for line1 in my_line1:
        parts = line1.strip().split("\t")
        infoFor[parts[0]] = parts[1:]

    # iterate over line numbers to be able to refer to the previous line
    for line2 in range(len(my_line2)):
        if my_line2[line2].startswith("# Q"):
            name2 = my_line2[line2][9:-1]
            # lookup
            if name2 in infoFor:
                print('# ' + name2)
                for info in infoFor[name2]:
                    print(info)
            # print programinfo and query lines, whether or not there was a match
            print(my_line2[line2-1], end="")
            print(my_line2[line2], end="")
        # skip program info here; it is printed together with its query line
        elif my_line2[line2].startswith("# ProgramInfo"):
            pass
        # otherwise just print as is
        else:
            print(my_line2[line2], end="")

if __name__== "__main__":
    programs_info_comb(sys.argv[1], sys.argv[2])

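I load file1 into a dictionary for later lookup and send the output to standard output. Before printing each line, I check what kind of line I am on and print accordingly.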
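To make the two steps concrete, here is a small sketch of what the lookup entry and the line check could look like in isolation. It assumes File1 is tab-separated, as the code above does. The helper names parse_file1_line and line_kind are hypothetical and only serve this illustration.

```python
def parse_file1_line(line):
    """Split one tab-separated File1 line into (query, list of info fields)."""
    parts = line.rstrip("\n").split("\t")
    # Drop empty fields left behind by trailing tabs.
    return parts[0], [p for p in parts[1:] if p.strip()]


def line_kind(line):
    """Classify a File2 line so the caller can decide how to print it."""
    if line.startswith("# Query:"):
        return "query"
    if line.startswith("# ProgramInfo"):
        return "program_info"
    return "other"
```

Building the dictionary then amounts to storing each (query, fields) pair, and the main loop can branch on the value returned by line_kind.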

Here is the output of running st.py on the two files:

C:\>python st.py f1.txt f2.txt
# IdName1 Info1 Info2 Info3
#Info: from program1 for name1
#Info: from program2 for name1
# ProgramInfo
# Query: IdName1 Info1 Info2 Info3
# DatabaseInfo
# FiledInfo
line1
line2
# IdName2 Info1 Info2 Info3
#Info: from program1 for name2
#Info: from program2 for name2
# ProgramInfo
# Query: IdName2 Info1 Info2 Info3
# DatabaseInfo
# FiledInfo
# IdName4 Info1 Info2 Info3
#Info: from program1 for name4
# ProgramInfo
# Query: IdName4 Info1 Info2 Info3
# DatabaseInfo
# FiledInfo
line1
line2
line3
line4
answered 2012-12-12T13:11:26.150