python - Find number of columns in csv file

Question

My program needs to read CSV files that may have 1, 2, or 3 columns, and it needs to modify its behaviour accordingly. Is there a simple way to check the number of columns without "consuming" a row before the iterator runs? The following code is the most elegant I could manage, but I would prefer to run the check before the for loop starts:

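import csv

class CSVError(Exception):
    """Raised when the input file has an unexpected layout."""

filename = 'testfile.csv'
d = '\t'

# csv.reader needs an open file (or another iterable of lines), not a filename
with open(filename, newline='') as f:
    reader = csv.reader(f, delimiter=d)
    for row in reader:
        if reader.line_num == 1:
            fields = len(row)  # the first row sets the expected width
        if len(row) != fields:
            raise CSVError("Number of fields should be %s: %s" % (fields, str(row)))
        if fields == 1:
            pass
        elif fields == 2:
            pass
        elif fields == 3:
            pass
        else:
            raise CSVError("Too many columns in input file.")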

Edit: I should have included more information about my data. If there is only one field, it must contain a name in scientific notation. If there are two fields, the first must contain a name, and the second a linking code. If there are three fields, the additional field contains a flag which specifies whether the name is currently valid. Therefore, whether the first row has 1, 2 or 3 columns, every other row must have the same number.

score 30 · Accepted Answer

You can use itertools.tee:

itertools.tee(iterable[, n=2])
Return n independent iterators from a single iterable.

For example:

reader1, reader2 = itertools.tee(csv.reader(f, delimiter=d))
columns = len(next(reader1))
del reader1
for row in reader2:
    ...

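Note that it is important to delete the reference to reader1 once you are done with it; otherwise tee has to keep every row in memory in case you call next(reader1) again.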
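
To make the tee approach easier to try out, here is a minimal self-contained sketch. The helper name read_with_column_count and the in-memory sample data are my own assumptions for illustration; in real code you would pass an open file instead of the io.StringIO object.

```python
import csv
import io
import itertools


def read_with_column_count(lines, delimiter="\t"):
    """Return (column_count, row_iterator) without losing the first row.

    Hypothetical helper: `lines` is any iterable of text lines,
    such as an open file.
    """
    peek, rows = itertools.tee(csv.reader(lines, delimiter=delimiter))
    first = next(peek, None)  # None when the input is empty
    del peek  # drop the look-ahead copy so tee does not buffer rows for it
    return (0 if first is None else len(first)), rows


# In-memory sample data standing in for an open file.
sample = io.StringIO("Homo sapiens\tH1\tyes\nPan troglodytes\tP7\tno\n")
ncol, rows = read_with_column_count(sample)
all_rows = list(rows)  # still starts at the first row
```

The column count is known before the loop starts, and the iterator that is returned still yields the first row.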

score 20 · Accepted Answer

This also seems to work:

import csv

datafilename = 'testfile.csv'
d = '\t'

# The with block closes the file once the loop is done
with open(datafilename, 'r', newline='') as f:
    reader = csv.reader(f, delimiter=d)
    ncol = len(next(reader))  # Read first line and count columns
    f.seek(0)                 # Go back to beginning of file
    for row in reader:
        pass  # do stuff
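
A possible variation on the rewind idea, sketched under a few assumptions: the input must be seekable (a regular file is, a pipe is not), and the helper name count_columns is hypothetical. It builds a fresh reader after rewinding, so reader.line_num starts counting from the first row again instead of carrying over the peeked line.

```python
import csv
import io


def count_columns(f, delimiter="\t"):
    """Peek at the first row of a seekable file, then rewind to the start."""
    first = next(csv.reader(f, delimiter=delimiter), None)
    f.seek(0)  # rewind so the caller reads the first row again
    return 0 if first is None else len(first)


# In-memory sample data standing in for an open file.
sample = io.StringIO("Homo sapiens\tH1\nPan troglodytes\tP7\n")
ncol = count_columns(sample)

# A fresh reader created after the rewind sees every row.
reader = csv.reader(sample, delimiter="\t")
rows = list(reader)
```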

score 4 · Accepted Answer

What should happen if the user gives you a CSV file with fewer columns? Are default values used instead?

If so, why not pad the rows with null values?

reader = csv.reader(f,delimiter=d)
for row in reader:
    row += [None] * (3 - len(row))
    try:
        foo, bar, baz = row
    except ValueError:
        # Too many values to unpack: too many columns in the CSV
        raise CSVError("Too many columns in input file.")

Now bar and baz will at least be None, and the exception handler takes care of any row with more than 3 items.
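
The padding idea can be wrapped in a small generator. This is only a sketch: the name padded_rows, the width parameter, and the sample data are assumptions, and it raises a plain ValueError rather than the question's CSVError so that it runs on its own.

```python
import csv
import io


def padded_rows(lines, width=3, delimiter="\t"):
    """Yield each row padded with None up to `width` fields."""
    for row in csv.reader(lines, delimiter=delimiter):
        if len(row) > width:
            raise ValueError(f"Too many columns: {row!r}")
        yield row + [None] * (width - len(row))


# In-memory sample data standing in for an open file.
sample = io.StringIO("Homo sapiens\nPan troglodytes\tP7\n")
records = list(padded_rows(sample))
```

Keep in mind that padding hides files whose rows have different widths, so it only fits if missing fields really can be treated as defaults.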

score 3 · Accepted Answer

I would suggest a simple approach like this:

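with open('./testfile.csv', 'r') as f:  # named f so the csv module is not shadowed
    first_line = f.readline()
    your_data = f.readlines()

# Assumes comma-delimited data with no delimiters inside quoted fields
ncol = first_line.count(',') + 1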
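
Counting delimiter characters miscounts when a quoted field contains the delimiter. One way around that, shown here as a sketch with a hypothetical helper name count_fields, is to let csv.reader parse just the first line:

```python
import csv


def count_fields(line, delimiter=","):
    """Count the fields in one line of text using the csv parser."""
    return len(next(csv.reader([line], delimiter=delimiter)))


line = 'a,"b,c",d'
naive = line.count(",") + 1  # also counts the comma inside the quotes
parsed = count_fields(line)  # treats "b,c" as a single field
```

For the tab-separated files in the question you would pass delimiter="\t".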

score -1 · Accepted Answer

I would restructure it as follows (provided the file is not too large):

import csv
filename = 'testfile.csv'
d = '\t'

# csv.reader needs an open file, not a filename
with open(filename, newline='') as f:
    reader = list(csv.reader(f, delimiter=d))  # read every row into memory
fields = len(reader[0])
for row in reader:
    if fields == 1:
        pass
    elif fields == 2:
        pass
    elif fields == 3:
        pass
    else:
        raise CSVError("Too many columns in input file.")
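
Once all rows are in a list, the consistency check from the question can also run before the loop. A rough sketch, assuming the file fits in memory; load_rows and max_fields are names I made up, and ValueError stands in for the question's CSVError:

```python
import csv
import io


def load_rows(lines, delimiter="\t", max_fields=3):
    """Read all rows and check that they share one field count."""
    rows = list(csv.reader(lines, delimiter=delimiter))
    widths = {len(row) for row in rows}  # distinct row lengths
    if len(widths) > 1:
        raise ValueError(f"Inconsistent field counts: {sorted(widths)}")
    fields = widths.pop() if widths else 0  # 0 for an empty file
    if fields > max_fields:
        raise ValueError("Too many columns in input file.")
    return fields, rows


# In-memory sample data standing in for an open file.
fields, rows = load_rows(io.StringIO("Homo sapiens\tH1\nPan troglodytes\tP7\n"))
```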
