python - What is this JSON decoder code doing?

Question

I have been using this code:

def read_text_files(filename):
    # Creates JSON Decoder
    decoder = json.JSONDecoder()
    with open(filename, 'r') as inputfile:
        # Returns next item in input file, removes whitespace from it and saves it in line
        line = next(inputfile).strip()
        while line:
            try:
                # Returns 2-tuple of Python representation of data and index where data ended
                obj, index = decoder.raw_decode(line)
                # Yield the decoded object to the caller
                yield obj
                # Remove already scanned part of line from rest of file
                line = line[index:]
            except ValueError:
                line += next(inputfile).strip()                    
            if not line:
                line += next(inputfile).strip()
            global count
            count+=1
            print str(count)

all_files = glob.glob('Documents/*')
for filename in all_files:
    for data in read_text_files(filename):      
        rawTweet = data['text']
        print 'Here'

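It reads in a JSON file and decodes it. However, I realized that when I put the count and print statements inside the ValueError handler, I lose almost half of the documents being scanned: they never make it back to the main loop.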

Can someone explain what the try statement is doing, and why I am losing documents in the except part? Is it because the JSON is bad?

Edit: included more code

Currently, with the code as posted, the machine prints:

"Here"
2
3 etc...
199
Here
200 
Here (alternating like this until)...
803
804
805 etc...
1200

Is this happening because some of the JSON is corrupted? Or because some of the files are duplicates (some definitely are)?

Edit 2:

Interestingly, removing:

line=next(inputfile).strip()
while line:

and replacing it with:

for line in inputfile:

seems to have fixed the problem. Is there a reason for that?

score 0 · Accepted Answer

The try statement specifies a block of statements whose exceptions are handled by the except blocks that follow it (only one in your case).
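As a minimal sketch of that pattern applied to raw_decode: the helper name decode_prefix is hypothetical and not part of the question's code. It assumes only that json.JSONDecoder.raw_decode returns the decoded object together with the index where it ended, and raises ValueError when the text does not start with a complete JSON value.

```python
import json

decoder = json.JSONDecoder()


def decode_prefix(text):
    """Hypothetical helper: decode the first JSON value at the start of `text`.

    Returns (obj, rest) on success, or (None, text) if no complete value
    can be decoded yet.
    """
    try:
        # raw_decode returns the object and the index where it ended
        obj, index = decoder.raw_decode(text)
    except ValueError:
        # Incomplete or invalid JSON: the except block runs instead
        return None, text
    return obj, text[index:]


complete = decode_prefix('{"a": 1}{"b": 2}')
incomplete = decode_prefix('{"a": ')
```

In the question's code the except block does not give up but appends the next line of the file and retries, which is how an object spread over several lines gets assembled.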

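My impression is that, with your modification, you are triggering a second exception inside the exception handler itself. That passes control to a higher-level exception handler, possibly one outside the function read_text_files. If no exception occurs inside the exception handler, the loop can continue.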
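A small sketch of how a second exception can escape the handler. The function decode_all and its arguments are hypothetical; it is written as a plain function rather than a generator so that the escaping exception is visible to the caller. It assumes the second exception is the StopIteration that next() raises once the input is exhausted.

```python
import json


def decode_all(lines, out):
    """Hypothetical helper: append every JSON object found in `lines` to `out`."""
    decoder = json.JSONDecoder()
    source = iter(lines)
    buf = next(source).strip()
    while buf:
        try:
            obj, index = decoder.raw_decode(buf)
            out.append(obj)
            buf = buf[index:].lstrip()
        except ValueError:
            # A second exception can be raised right here: next() raises
            # StopIteration when there are no more lines, and this handler
            # only covers ValueError from the try block above.
            buf += next(source).strip()


decoded = []
try:
    # The second object is truncated and there is no further line to append
    decode_all(['{"a": 1} {"b": '], decoded)
    outcome = "finished normally"
except StopIteration:
    outcome = "StopIteration escaped the handler"
print(decoded)
print(outcome)
```

Inside a generator such as read_text_files the effect is quieter: on Python 2, and on Python 3 before 3.7, a StopIteration raised in the generator body simply ends the iteration, so the caller's for loop stops without any error message. That may be why documents seem to disappear. From Python 3.7 on, the same situation raises RuntimeError instead.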

Please check that count exists and has been initialized with an integer value (e.g. 0).
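To illustrate why the initialization matters, here is a short sketch; the names bump, bump_missing and missing_counter are made up for the example. Incrementing a global that was never assigned raises NameError, which an except ValueError clause would not catch.

```python
count = 0  # module-level counter, initialized before any function uses it


def bump():
    global count
    count += 1
    return count


def bump_missing():
    # Hypothetical counter that is never initialized at module level
    global missing_counter
    missing_counter += 1


first = bump()

try:
    bump_missing()
    error = None
except NameError as exc:
    # Reading the undefined global fails before the increment happens
    error = type(exc).__name__
```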
