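I'm new to this forum, to programming, and to Python. I'm trying to develop my first program, but I keep getting stuck on one particular problem. I'd be grateful if some kind soul could put me out of my misery and show me how to do what I want correctly. I'm sure it's simple if you know what you're doing, but right now I'm clueless and don't know what I'm doing :-)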

Example:

I have 2 files to work with, A & B.

File A contains the following text:

This is a test

File B contains the text:

h
t
s
i
a

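I need to create a program that takes 1 character at a time from file A and then searches file B for the same character. Once the program finds a match, I want it to print the number of the line where the match was found, then move on to the next character from file A and repeat the process until EOF.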

3 Answers

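OK, let's take this one step at a time. First, I'd read file B into a structure that is well suited to fast lookups, since we'll be doing a lot of them: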

chars = {}
with open("B") as lookupfile:
    for number,line in enumerate(lookupfile):
        chars[line.strip()] = number

Now we have a dictionary chars that contains the letters as keys and their (0-based) line numbers as values:

>>> chars
{'t': 1, 'a': 4, 'i': 3, 'h': 0, 's': 2}
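
Note that enumerate counts from 0, so the h on the first line of B is stored as row 0. Here is a minimal sketch of a variation, assuming you would rather have 1-based line numbers and keep the first line on which a character appears. The helper name build_lookup is made up for illustration, and io.StringIO only stands in for open("B") so the snippet runs without files on disk:

```python
import io

def build_lookup(lines):
    """Map each stripped line to its 1-based line number (first occurrence wins)."""
    lookup = {}
    for number, line in enumerate(lines, start=1):
        # setdefault leaves an existing entry untouched, so later duplicates are ignored
        lookup.setdefault(line.strip(), number)
    return lookup

# io.StringIO is a stand-in for a real file object (assumption for this example)
sample_b = io.StringIO("h\nt\ns\ni\na\n")
chars_1based = build_lookup(sample_b)
print(chars_1based)
```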

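Now we can iterate over the first file. Python's standard file iterator consumes one line per iteration, not one character, so it's probably best to read the whole file into a string and then iterate over that (because iterating over a string goes character by character):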

with open("A") as textfile:
    text = textfile.read()
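
To see the difference between the two kinds of iteration, here is a small sketch. It uses io.StringIO as a stand-in for a real file, which is an assumption made only so the snippet runs on its own:

```python
import io

# Stand-in for an open text file (assumption for this example)
fake_file = io.StringIO("This is a test\nsecond line\n")

# Iterating over a file object yields whole lines, newline included
lines = [line for line in fake_file]

# Iterating over a string yields single characters
first_chars = [char for char in "This"]

print(lines)
print(first_chars)
```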

Now we iterate over the string and print the matching values:

for char in text:
    if char in chars:
        print("Character {0} found in row {1}".format(char, chars[char]))

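If you don't like accessing the dictionary twice, you can also use

for char in text:
    found = chars.get(char)    # returns None if char isn't a key in chars
    if found is not None:      # row 0 is falsy, so test against None explicitly
        print("Character {0} found in row {1}".format(char, found))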

Or, using exceptions:

for char in text:
    try:
        print("Character {0} found in row {1}".format(char, chars[char]))
    except KeyError:
        pass
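
Putting the steps together, a self-contained sketch might look like the following. The function name find_rows and the in-memory sample data are assumptions for illustration; with real files you would pass the open file B and the string read from file A instead.

```python
import io

def find_rows(text, lookup_lines):
    """Return (char, row) pairs for each character of text that is a line in lookup_lines."""
    chars = {}
    for number, line in enumerate(lookup_lines):
        chars[line.strip()] = number  # 0-based row number

    matches = []
    for char in text:
        row = chars.get(char)
        if row is not None:  # row 0 is a valid match, so compare against None
            matches.append((char, row))
    return matches

# io.StringIO stands in for open("B"); the text stands in for open("A").read()
for char, row in find_rows("This is a test", io.StringIO("h\nt\ns\ni\na\n")):
    print("Character {0} found in row {1}".format(char, row))
```

The lookup is case-sensitive, so the capital T in file A is not matched against the t in file B.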

answered 2013-06-16T14:09:56.763

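# The paths are the answerer's; adjust them to wherever your files live.
with open('C:\\Desktop\\fileB.txt', 'r') as fB:
    # One entry per line of file B; the list index is the 0-based line number.
    fileb_content = fB.read().splitlines()

with open('C:\\Desktop\\fileA.txt', 'r') as fA:
    # Only the first line of file A is processed.
    rA = fA.readline().rstrip('\n')

for c in rA:
    if c.strip():  # skip whitespace characters
        if c.lower() in fileb_content:
            print(fileb_content.index(c.lower()))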

Here I check that the character is not whitespace before looking it up.

answered 2013-06-16T14:23:25.480

First read file A and store its contents in a variable (using file.read).

with open('A.txt') as f:
    data = f.read()  # now data is: "This is a test"
    # data is now a string that contains the entire contents of file A.
    # Searching for a character in a string is an O(N) operation,
    # so we convert the string to a better data structure.
    # We need each character as well as its first index, so we
    # create a dict with the character as the key and its first
    # appearance (first index) as the value.
    # Dicts provide O(1) lookup.

    dic = {}
    for ind, char in enumerate(data):
        # store the character only if it is a letter
        # and it is not already present in the dict
        if char.isalpha() and char not in dic:
            dic[char] = ind
    # dic is {'a': 8, 'e': 11, 'i': 2, 'h': 1, 's': 3, 'T': 0, 't': 10}

Now open file B and iterate over it with a for loop; a for loop over a file iterator returns one line at a time (a memory-efficient approach).

with open('B.txt') as f:
    for char in f:            # iterate one line at a time
        char = char.strip()   # str.strip strips off whitespace such as '\n'
        if char in dic:
            print(dic[char])  # if the character is found in the dict,
                              # print its value, i.e. its index in file A

Output:

1
10
3
2
8

answered 2013-06-16T14:15:12.430