I'm playing with the well-known Titanic dataset. I have the data as a comma-separated CSV file. It looks like this:

passengerId,survived,pclass,name,sex,age,sibSp,parch,ticket,fare,cabin,embarked
1,0,3,"Braund, Mr. Owen Harris",male,22,1,0,A/5 21171,7.25,,S
2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Thayer)",female,38,1,0,PC 17599,71.2833,C85,C

I'm trying to read it with pandas.read_csv, but it doesn't work.

My code:

import pandas as pd

titanic = pd.read_csv('titanic.csv')
print(titanic.head(10))

I tried several combinations of arguments to read_csv: sep=',', decimal=',', delimiter=','. I still get the same output:

                                         passengerId  survived  ...  cabin  embarked
0  1,0,3,"Braund, Mr. Owen Harris",male,22,1,0,A/...       NaN  ...    NaN       NaN
1  2,1,1,"Cumings, Mrs. John Bradley (Florence Br...       NaN  ...    NaN       NaN
2  3,1,3,"Heikkinen, Miss. Laina",female,26,0,0,S...       NaN  ...    NaN       NaN

I searched other Stack Overflow questions but couldn't find an answer. Thanks for your help.

1 Answer

The problem seems to be that some of your fields contain commas.
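
To check how the commas and quotes actually appear in the file, it can help to look at a few raw lines before pandas parses them. The sketch below is only a diagnostic aid: the helper name peek_lines is hypothetical, and the UTF-8 encoding is an assumption. If a data line turns out to start and end with a double quote, the whole row is being read as one quoted field, which would be consistent with the output shown in the question. That is one possible explanation, not something the question confirms.

```python
from itertools import islice


def peek_lines(path, n=3, encoding="utf-8"):
    """Return the first n raw lines of a text file, without line endings.

    Hypothetical helper for illustration; the encoding is an assumption.
    """
    # newline="" keeps the original line endings so they can be stripped explicitly.
    with open(path, encoding=encoding, newline="") as f:
        return [line.rstrip("\r\n") for line in islice(f, n)]


# Example usage (assumes the file from the question exists):
# for line in peek_lines("titanic.csv"):
#     print(repr(line))
```

Printing with repr makes stray quotes or unusual delimiters visible, since they are shown exactly as stored.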

The quotechar parameter may help: it tells pandas to ignore commas that appear between the specified character (").

titanic = pd.read_csv('titanic.csv', quotechar='"', sep=",")
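
A small self-contained check of this behaviour, as a sketch: it feeds the header and the two sample rows from the question through io.StringIO instead of the real file (an assumption, so that the snippet runs on its own). As far as I know, '"' is already the default value of quotechar, so passing it explicitly mainly documents the intent.

```python
import io

import pandas as pd

# Header and two rows copied from the question. The name field contains a
# comma but is wrapped in double quotes.
sample = (
    "passengerId,survived,pclass,name,sex,age,sibSp,parch,ticket,fare,cabin,embarked\n"
    '1,0,3,"Braund, Mr. Owen Harris",male,22,1,0,A/5 21171,7.25,,S\n'
    '2,1,1,"Cumings, Mrs. John Bradley (Florence Briggs Thayer)",female,38,1,0,PC 17599,71.2833,C85,C\n'
)

# StringIO stands in for 'titanic.csv' here (assumption for illustration).
titanic = pd.read_csv(io.StringIO(sample), quotechar='"', sep=",")

print(titanic.shape)           # number of rows and columns parsed
print(titanic.loc[0, "name"])  # the quoted name, comma included
```

With this sample, the quoted name stays in a single column and the remaining fields line up with the header. If the real file still ends up in one column, its raw contents probably differ from the sample shown in the question.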
Answered 2019-12-19T12:55:48.830