csv 的第一行有标题。这是我的 csv 的示例行:
2013-07-31 00:00:00,,1.0,2013.0,7.0,Q3,21160742,32HHBS1307170203,KL0602130731,AIRFRANCE
KLM,KLM,KLM,KLM,KL,KLM ROYAL DUTCH AIRLINES,,0602,,KL0602,KL,KLM ROYAL DUTCH
AIRLINES,,,,KL,0602,,,LAX,AMS,,31-7-2013 0:00:00,2013-07-31,2013-07-31,2013-07-31,2013-07-31,
13:55:00,14:39:00,20:55:00,21:39:00,2013-08-01,2013-08-01,2013-08-01,2013-08-01,
09:05:00,09:45:00,07:05:00,07:45:00,2.0,,2,,,LAX,LOS ANGELES INTERNATIONAL AIRPORT,
LAX,LAX,5.0,LAX,LOS ANGELES,US,UNITED STATES OF AMERICA,US,USA,NA8,NORTHERN AMERICA,
AMERICAS,,,,AMS,SCHIPHOL I,F,OFFLINE,I,INDIRECT OFFLINE,14.0,3.0,FRONT,Business,2.0,nan,
PLANNED,3.0,,2.0,2.0,34.0,4.0,400254887nan,1.0,2.0,2.0,2.0,1.0,2.0,6.0,3.0,1.0,3.0,1.0,1.0,
nan,nan,nan,nan,nan,nan,nan,3.0,3.0,3.0,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,
nan,2.0,2.0,2.0,2.0,2.0,7.0,nan,2.0,3.0,3.0,3.0,3.0,nan,nan,nan,nan,nan,nan,nan,nan,nan,
nan,nan,nan,nan,6.0,1.0,nan,nan,nan,nan,nan,2.0,nan,nan,nan,nan,nan,nan,nan,nan,nan,2.0,2.0,
nan,2.0,nan,3.0,nan,,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,nan,13.7885862654653,
0.2, 34273499844164,nan,37.0,Booked,35.0,10.0,2.0,2.0,6.0,35.0,10.0,42.0,nan,nan,LAX,LAX,N
如果我使用input_file = csv.DictReader(open("file.csv")
or input_file = csv.reader(open('file.csv'))
,我所有的对象都会变成字符串。
用python打印的一行:
'2013-08-31 00:00:00', '', '1.0', '2013.0', '8.0', 'Q3','C', '03J', '', '',
'', '', 'nan', 'nan', '', 'NON-AIRPORT', 'SELF-SERVICE', 'ICI', '', '19.0', '20130819',
'1.0', '19.0', '9.0', '20130901', '2.0', '1.0', '1.0', '1.0', '10.0', '5.0', '5.0', '3.0',
'4.0', '4.0', '2.0', '2.0', '', 'nan', '2.0', '', '24854524', 'nan', 'nan', 'nan', 'nan',
'1.0', 'nan', '5.0', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan',
'nan', '4.0', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan',
'nan', 'nan', 'nan', '2.0', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan', 'nan',
'nan', '3.0', '5.0', '5.0'
如您所见,所有日期、字符串、浮点数和整数都已转换为字符串。如何正确导入它们?假设我们有 400 列数据,我无法手动定义每列的类型。