I have 5 files, as follows:

F1:

ABL1
ABR
AHSG
AIRE
CCKBR
LRBA
CDC27
CENPA

F2:

AKT1
APC
APP
AR
CCND1
C11ORF2
CCNE1
CST6
CTNNB1
DBI
DEFA1
DNMT1
EEF1A1
EEF1G

F3:

ACTG1
AMPH
ANK3
APBA2
APOA1
ARHGDIA
ATP5J
DST
CA1
CA12
DDR1
CALR
CASP6

F4:

ACVR1
ARL4D
RHOA
RHOG
RHOH
BMPR1B
BMPR2
CDC20
CDK4
CDK6
CHN1

F5:

A1BG
A2M
AAMP
ACTB
ADD1
ALAS1
ALB
APLP1
ASNA1
ATP5B

I tried the following code:

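# Open the five input files for reading.
file1 = open("F1.txt", "r")
file2 = open("F2.txt", "r")
file3 = open("F3.txt", "r")
file4 = open("F4.txt", "r")
file5 = open("F5.txt", "r")
# Hypothetical output file name; file3 is opened read-only, so it cannot be written to.
out = open("common.txt", "w")

list1 = file1.readlines()
list2 = file2.readlines()
list3 = file3.readlines()
list4 = file4.readlines()
list5 = file5.readlines()

# Brute force: try every combination of one line from each file.
for line1 in list1:
    for line2 in list2:
        for line3 in list3:
            for line4 in list4:
                for line5 in list5:
                    # Chained == is true only when all five stripped lines are equal.
                    if line1.strip() == line2.strip() == line3.strip() == line4.strip() == line5.strip():
                        print(line1.strip())
                        out.write(line1)

out.close()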

Now I want to compare all of these files and find the words they have in common using a Python script. Can anyone help? Can I use sets?

1 Answer

It's easier with sets (although I haven't done any error checking when opening the files):

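filenames = ["F1.txt", "F2.txt", "F3.txt", "F4.txt", "F5.txt"]
# Open every file (no error checking).
files = [open(name) for name in filenames]
# Build one set of stripped lines per file.
sets = [set(line.strip() for line in file)
            for file in files]
# Unpack the list so that all five sets are intersected at once.
common = set.intersection(*sets)
for file in files:
    file.close()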
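
For anyone who wants the files closed automatically, the same idea can be sketched with `with` blocks. This is only a minimal Python 3 sketch, not a definitive version: the helper name `common_words` and the throwaway demo files are assumptions made for illustration, and skipping blank lines is an extra choice that the snippet above does not make.

```python
import os
import tempfile


def common_words(filenames):
    """Return the set of words that appear in every one of the given files."""
    sets = []
    for name in filenames:
        # open() raises OSError (for example FileNotFoundError) if a file
        # cannot be read; catch it at the call site if that matters.
        with open(name) as handle:
            # Strip surrounding whitespace and skip blank lines.
            sets.append({line.strip() for line in handle if line.strip()})
    # set.intersection() needs at least one set, so guard the empty case.
    return set.intersection(*sets) if sets else set()


# Demo on temporary files with made-up contents (not the F1..F5 data).
demo_dir = tempfile.mkdtemp()
demo_contents = [["ABL1", "ABR", "CDK4"], ["CDK4", "ABR"], ["ABR", "APP", "CDK4"]]
demo_files = []
for index, words in enumerate(demo_contents):
    path = os.path.join(demo_dir, "demo%d.txt" % index)
    with open(path, "w") as handle:
        handle.write("\n".join(words) + "\n")
    demo_files.append(path)

result = common_words(demo_files)
print(sorted(result))  # ['ABR', 'CDK4']
```

Calling `common_words(["F1.txt", "F2.txt", "F3.txt", "F4.txt", "F5.txt"])` would give the same result as `common` above, apart from the blank-line handling. Note that in the excerpts shown in the question, F1 and F2 share no names at all, so for that data the intersection would be an empty set.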
Answered 2013-02-03T14:10:32.507