python - Find the row number of the header row in a CSV file / Pandas DataFrame

Question

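I am trying to get the index or row number of the row that contains the headers in a CSV file. The problem is that the header row can move up or down depending on the output of the report from our system (I have no control over changes to it).

Code: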

ht = pd.read_csv(file.csv)
test = ht.get_loc('Code') #Code being header im using to locate the header row
csv1 = read_csv(file.csv, header=test)
df1 = df1.append(csv1) #Appending as have many files

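If I printed test, I would expect a number around 4 or 5, and that is what I pass to the second read_csv call.

The error I get is that it expects 1 header column, but I have 26 columns. I am just trying to use the first header string to get the row number.

Thanks :-)

Edit:

CSV format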

This file contains the data around the volume of items blablalbla
the deadlines for delivery of items a - z is 5 days
the deadlines for delivery of items aa through zz are 3 days
the deadlines for delivery of items aaa through zzz are 1 days
code,type,arrived_date,est_del_date
a/wrwgwr12/001,kids,12-dec-18,17-dec-18
aa/gjghgj35/030,pet,15-dec-18,18-dec-18

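As you can see, the "deadlines" lines all follow the same pattern; depending on the code IDs there can be 3 or 5 of them, so the header row can move up or down.

I also have not written out all 26 column headers; I am not sure whether that matters.

Desired DataFrame format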

index |    code         |   type   | arrived_date | est_del_date
1     | a/wrwgwr12/001  |   kids   |   12-dec-18  | 17-dec-18
2     | aa/gjghgj35/030 |  Pet     |  15-dec-18   | 18-dec-18

Hope this makes sense..

Thanks,

score 3 · Accepted Answer

You can use the csv module to find the first row that contains the delimiter, then pass that row's index as the skiprows argument to pd.read_csv:

from io import StringIO
import csv
import pandas as pd

x = """This file contains the data around the volume of items blablalbla
the deadlines for delivery of items a - z is 5 days
the deadlines for delivery of items aa through zz are 3 days
the deadlines for delivery of items aaa through zzz are 1 days
code,type,arrived_date,est_del_date
a/wrwgwr12/001,kids,12-dec-18,17-dec-18
aa/gjghgj35/030,pet,15-dec-18,18-dec-18"""

# replace StringIO(x) with open('file.csv', 'r')
with StringIO(x) as fin:
    reader = csv.reader(fin)
    idx = next(idx for idx, row in enumerate(reader) if len(row) > 1)  # 4

# replace StringIO(x) with 'file.csv'
df = pd.read_csv(StringIO(x), skiprows=idx)

print(df)

              code  type arrived_date est_del_date
0   a/wrwgwr12/001  kids    12-dec-18    17-dec-18
1  aa/gjghgj35/030   pet    15-dec-18    18-dec-18
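
The len(row) > 1 check above picks the first row that contains a comma, so it would stop too early if a free-text preamble line happened to include one. A minimal sketch of an alternative, closer to what the question asks for, matches on the known first header name instead. The helper name find_header_row and its first_header parameter are hypothetical, and the sketch assumes the preamble has no quoted fields spanning several lines, so that the row index equals the line index:

```python
import csv
from io import StringIO


def find_header_row(handle, first_header="code"):
    """Return the 0-based index of the first row whose first field is first_header.

    Hypothetical helper: the comparison ignores case and surrounding whitespace.
    """
    wanted = first_header.strip().lower()
    for idx, row in enumerate(csv.reader(handle)):
        # csv.reader yields an empty list for blank lines, so guard before indexing.
        if row and row[0].strip().lower() == wanted:
            return idx
    raise ValueError(f"no row starts with {first_header!r}")


# Made-up sample: note the comma inside the second preamble line.
sample = """free-text preamble line
another preamble line, with a comma
code,type,arrived_date,est_del_date
a/wrwgwr12/001,kids,12-dec-18,17-dec-18
"""

with StringIO(sample) as fin:
    header_idx = find_header_row(fin)

print(header_idx)
```

The returned index can be passed as skiprows to pd.read_csv in the same way as idx in the answer above. For many files, collecting each DataFrame in a list and calling pd.concat once is the usual replacement for DataFrame.append, which was removed in pandas 2.0.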
