While researching options for an oft-asked question (how to find the number of rows in a CSV reader before iterating), I came across the approach of using len(list(myCSVReader)). Yes, I know how clunky and potentially expensive this is, and I won't be using it. But while playing with it, I ran into a really puzzling inconsistency:

import csv

# myFile.txt is a 3-line CSV file
# ('rb' is the Python 2 idiom; Python 3 wants open('myFile.txt', newline=''))
myCSV = csv.reader(open('myFile.txt','rb'),dialect='excel')
print(len(list(myCSV)))
print(list(myCSV))

When I run this, I get:

> 3
> []

I can understand why you couldn't just use list() to convert the reader to a list of lists, but if that's the case, then how/why does a len() on this produce the correct result?

-- JDM

1 Answer

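This is what happens when you build a list from an iterator that can only be consumed once, such as a generator.

  • A generator is an iterator that yields items one by one. At some point it is exhausted, and it does not rewind.

  • csv.reader returns a reader object, which is not strictly a generator but behaves the same way here: it is a one-pass iterator over the lines of the underlying file.

  • When you do list(myCSV), the list constructor consumes all items from the reader. That is why len(list(myCSV)) prints 3.

  • The next time you try to get something from the reader (e.g. by calling list(myCSV) again), it is already exhausted, so you get [].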
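The exhaustion is easy to reproduce without a file on disk. The following is only a minimal sketch, assuming Python 3 and using an in-memory io.StringIO with made-up sample rows in place of myFile.txt; the names sample, first_pass and second_pass are illustrative and not from the question.

```python
import csv
import io

# In-memory stand-in for a 3-line CSV file (hypothetical sample data)
sample = io.StringIO("a,b\nc,d\ne,f\n")
reader = csv.reader(sample, dialect='excel')

first_pass = list(reader)   # consumes every row from the reader
second_pass = list(reader)  # the reader is exhausted, so nothing is left

print(len(first_pass))  # 3
print(second_pass)      # []
```

The reader itself never "knows" its length; the 3 comes from the list that the first call built.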

What you can do is create a list once and use it where needed:

import csv

# Read all rows into a list once, then reuse the list as often as needed
myCSV = list(csv.reader(open('myFile.txt','rb'),dialect='excel'))
print(len(myCSV))
print(myCSV)
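If holding every row in memory is the concern, one possible alternative is to count the rows in a first pass, rewind the underlying file, and create a fresh reader for the second pass. This is only a sketch under assumptions: Python 3, a seekable file object, and an in-memory io.StringIO with made-up rows standing in for myFile.txt. The names sample, row_count and rows are illustrative.

```python
import csv
import io

# In-memory stand-in for a 3-line CSV file (hypothetical sample data)
sample = io.StringIO("a,b\nc,d\ne,f\n")

# First pass: count rows without building a list of them
row_count = sum(1 for _ in csv.reader(sample))

# Rewind the underlying file and start over with a new reader
sample.seek(0)
rows = [row for row in csv.reader(sample)]

print(row_count)
print(len(rows))
```

This still reads the data twice, so it trades memory for I/O, and it only works when the source can be rewound.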
answered 2012-12-17T21:28:58.277