
Check to make sure that:

a) each line is 4 columns long

b) the program does not fail when there is a newline ('\n') at the end of the file

def ask_for_filename():
    filename=raw_input("Please enter file name: ")
    return filename

def read_data(filename):
        with open(filename) as f:
           data = f.readlines()

        i = 0
        for line in data:
            lineContains = line.split('\t')
            lineLength = len(lineContains)  #calculate elements


            i = i+1

            if lineLength < 3 and i < len(data):        
                print "File is invalid format."

        f.close()
        return data

Could you point out what I am doing wrong? This part of the code doesn't work:

        i = 0
        for line in data:
            lineContains = line.split('\t')
            lineLength = len(lineContains)  #calculate elements


            i = i+1

            if lineLength < 3 and i < len(data):        
                print "File is invalid format."

Example file contents:

Complete file:

AUTHOR(S)   YEAR    TITLE   JOURNAL/CONFERENCE

Accot;Zhai  2001    Scale effects in steering law tasks Proc. ACM CHI

Acredolo    1977    Developmental Changes in the Ability to Coordinate Perspectives of a Large-Scale Space  Developmental Psychology

Aginsky;Harris;Rensink;Beusmans 1997    Two strategies for learning a route in a driving simulator  Journal of Environmental Psychology

Incomplete file (the code above is meant for this kind of file):

AUTHOR(S)   YEAR    TITLE   JOURNAL/CONFERENCE

Accot;Zhai  2001    Scale effects in steering law tasks Proc. ACM CHI

Acredolo    Developmental Changes in the Ability to Coordinate Perspectives of a Large-Scale Space  Developmental Psychology

Aginsky;Harris;Rensink;Beusmans 1997    Two strategies for learning a route in a driving simulator  Journal of Environmental Psychology

Agrawala;Beers;Frohlich;Hanrahan;McDowall;Bolas 1997    The two-user responsive workbench: Support for collaboration through individual views of a shared space Proc. ACM SIGGRAPH

Ahmadabadi;Eiji 1996    Cooperation strategy for a group of object lifting robots   Proc. of IROS

2 Answers


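You complained that your code "doesn't affect the rest of the program in any way".

Nothing in the code in question modifies any data or changes any control flow, so of course it doesn't affect the rest of the program. read_data therefore always returns every line in the file, valid or not.

Since you haven't explained how you want it to affect the rest of the program, it's hard to show you how to do what you want... but I can show you how to do something.

For example, instead of returning all the lines, we can return only the valid ones: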

i = 0
result = []
for line in data:
    lineContains = line.split('\t')
    lineLength = len(lineContains)  #calculate elements

    i = i+1

    if lineLength < 3 and i < len(data):
        print "File is invalid format."
    else:
        result.append(line)

return result

Or, raise an exception instead of returning anything:

i = 0
for line in data:
    lineContains = line.split('\t')
    lineLength = len(lineContains)  #calculate elements

    i = i+1

    if lineLength < 3 and i < len(data):
        raise ValueError("File is invalid format.")

return data
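
A caller then decides what to do about a bad file. The following is only a sketch: check_lines is a hypothetical stand-in for the loop above, and the sample lists are made up.

```python
def check_lines(lines):
    """Hypothetical helper: raise ValueError on the first line with fewer than 3 fields."""
    for line in lines:
        if len(line.split("\t")) < 3:
            raise ValueError("File is invalid format.")
    return lines

good = ["a\tb\tc\td\n"]                      # made-up valid data
bad = ["a\tb\tc\td\n", "only-one-field\n"]   # made-up invalid data

try:
    check_lines(bad)
    outcome = "accepted"
except ValueError as exc:
    # The caller chooses how to react: report, retry, or abort.
    outcome = "rejected: %s" % exc

print(outcome)
```

With an exception, the invalid case can no longer be silently ignored by the rest of the program.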

Meanwhile, there are a few other problems with your code.

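You shouldn't call f.close() after using f in a with block. The with statement already closes the file when the block exits, so the extra call is at best harmless, and "usually harmless and never useful" is not the kind of code you want.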
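
A small sketch of why the extra call is redundant: the file is already closed by the time the with block has exited. The temporary file and variable names are made up for illustration.

```python
import os
import tempfile

# Create a throwaway file so the example is self-contained (hypothetical data).
fd, path = tempfile.mkstemp(suffix=".txt")
with os.fdopen(fd, "w") as out:
    out.write("a\tb\tc\td\n")

with open(path) as f:
    first_line = f.readline()
    closed_inside = f.closed   # False: the file is still open inside the block

closed_after = f.closed        # True: the with statement has already closed it

os.remove(path)
print(closed_after)
```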

If you want to count the lines in something, don't add an explicit i = i+1 to the loop; just use enumerate.
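
As a quick illustration (the list stands in for a file's lines), enumerate hands back the counter together with each item, and its optional start argument gives 1-based line numbers:

```python
lines = ["first\n", "second\n", "third\n"]  # made-up stand-in for a file's lines

# enumerate yields (index, item) pairs; a start of 1 gives human-style line numbers.
numbered = [(number, line.rstrip("\n")) for number, line in enumerate(lines, 1)]

for number, text in numbered:
    print("%d: %s" % (number, text))
```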

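Also, I'm not sure what i < len(data) is supposed to do. Because i is incremented before the test, the condition is true for every line except the last one, so its only effect is to exempt the final line from the check. I've left it out of the versions that follow. (That means I could drop i entirely, since that's the only place you use it... but I'll keep it so I can show you enumerate.)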

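There is almost never a good reason to call readlines(). A file is already an iterable of lines, just like the list that readlines returns. All you gain is slower code that uses more memory, because the whole file is read at once instead of on demand.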
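
A minimal sketch of iterating over a file object directly. io.StringIO stands in for a real file here, and the sample rows are made up:

```python
import io

# io.StringIO behaves like an open text file (hypothetical sample data).
sample = io.StringIO(
    u"Accot;Zhai\t2001\tTitle A\tProc. ACM CHI\n"
    u"Acredolo\t1977\tTitle B\tDevelopmental Psychology\n"
)

column_counts = []
for line in sample:  # lines are produced on demand, one at a time
    column_counts.append(len(line.split("\t")))

print(column_counts)
```

The loop body is the same as it would be with readlines(); only the intermediate list of every line is gone.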

So, here is a version that skips the bad lines:

def read_data(filename):
    result = []
    with open(filename) as f:
        for i, line in enumerate(f):
            lineContains = line.split('\t')
            lineLength = len(lineContains)  #calculate elements
            if lineLength < 3:        
                print "File is invalid format."
            else:
                result.append(line)
    return result

Meanwhile, if there could be 100000 invalid lines, do you really want to print a warning for each one? If not, you can make this even simpler:

def read_data(filename):
    def bad_line(line):
        lineContains = line.split('\t')
        lineLength = len(lineContains)  #calculate elements
        return lineLength < 3
    with open(filename) as f:
        return [line for line in f if not bad_line(line)]
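
The question asks for exactly 4 columns and for tolerance of a newline at the end of the file, whereas the versions above only reject lines with fewer than 3 fields. Here is a minimal sketch of that stricter check, assuming blank lines should simply be ignored. EXPECTED_COLUMNS and find_bad_lines are names invented for this example, and io.StringIO stands in for a real file.

```python
import io

EXPECTED_COLUMNS = 4  # assumption taken from requirement (a) in the question

def find_bad_lines(lines):
    """Return 1-based numbers of non-blank lines without exactly EXPECTED_COLUMNS fields."""
    bad = []
    for number, line in enumerate(lines, 1):
        if not line.strip():
            continue  # tolerate blank lines, including one at the end of the file
        if len(line.rstrip("\n").split("\t")) != EXPECTED_COLUMNS:
            bad.append(number)
    return bad

sample = io.StringIO(
    u"AUTHOR(S)\tYEAR\tTITLE\tJOURNAL/CONFERENCE\n"
    u"Acredolo\tDevelopmental Changes\tDevelopmental Psychology\n"
    u"Accot;Zhai\t2001\tScale effects in steering law tasks\tProc. ACM CHI\n"
    u"\n"
)
bad_lines = find_bad_lines(sample)
print(bad_lines)
```

Returning the line numbers rather than printing inside the loop leaves it to the caller to warn once, list every offender, or raise.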
answered 2013-05-03T18:57:48.963
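def is_data_valid(filename):
    # Use a with block so the file is closed as soon as it has been read.
    with open(filename) as f:
        lines = [x.split('\t') for x in f]
    # A blank line splits into a single element, so this drops blank lines.
    no_newlines = [line for line in lines if len(line) > 1]
    return all(len(line) == 4 for line in no_newlines)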
answered 2013-05-03T18:25:21.223