该csv
模块似乎建议使用csv 嗅探器来解决这个问题。
他们给出了以下示例,我已针对您的情况进行了调整。
with open('example.csv', 'rb') as csvfile: # python 3: 'r',newline=""
dialect = csv.Sniffer().sniff(csvfile.read(1024), delimiters=";,")
csvfile.seek(0)
reader = csv.reader(csvfile, dialect)
# ... process CSV file contents here ...
让我们试试看。
[9:13am][wlynch@watermelon /tmp] cat example
#!/usr/bin/env python
import csv
def parse(filename):
with open(filename, 'rb') as csvfile:
dialect = csv.Sniffer().sniff(csvfile.read(), delimiters=';,')
csvfile.seek(0)
reader = csv.reader(csvfile, dialect)
for line in reader:
print line
def main():
print 'Comma Version:'
parse('comma_separated.csv')
print
print 'Semicolon Version:'
parse('semicolon_separated.csv')
print
print 'An example from the question (kingdom.csv)'
parse('kingdom.csv')
if __name__ == '__main__':
main()
我们的样本输入
[9:13am][wlynch@watermelon /tmp] cat comma_separated.csv
test,box,foo
round,the,bend
[9:13am][wlynch@watermelon /tmp] cat semicolon_separated.csv
round;the;bend
who;are;you
[9:22am][wlynch@watermelon /tmp] cat kingdom.csv
ReleveAnnee;ReleveMois;NoOrdre;TitreRMC;AdopCSRegleVote;AdopCSAbs;AdoptCSContre;NoCELEX;ProposAnnee;ProposChrono;ProposOrigine;NoUniqueAnnee;NoUniqueType;NoUniqueChrono;PropoSplittee;Suite2LecturePE;Council PATH;Notes
1999;1;1;1999/83/EC: Council Decision of 18 January 1999 authorising the Kingdom of Denmark to apply or to continue to apply reductions in, or exemptions from, excise duties on certain mineral oils used for specific purposes, in accordance with the procedure provided for in Article 8(4) of Directive 92/81/EEC;U;;;31999D0083;1998;577;COM;NULL;CS;NULL;;;;Propos* are missing on Celex document
1999;1;2;1999/81/EC: Council Decision of 18 January 1999 authorising the Kingdom of Spain to apply a measure derogating from Articles 2 and 28a(1) of the Sixth Directive (77/388/EEC) on the harmonisation of the laws of the Member States relating to turnover taxes;U;;;31999D0081;1998;184;COM;NULL;CS;NULL;;;;Propos* are missing on Celex document
如果我们执行示例程序:
[9:14am][wlynch@watermelon /tmp] ./example
Comma Version:
['test', 'box', 'foo']
['round', 'the', 'bend']
Semicolon Version:
['round', 'the', 'bend']
['who', 'are', 'you']
An example from the question (kingdom.csv)
['ReleveAnnee', 'ReleveMois', 'NoOrdre', 'TitreRMC', 'AdopCSRegleVote', 'AdopCSAbs', 'AdoptCSContre', 'NoCELEX', 'ProposAnnee', 'ProposChrono', 'ProposOrigine', 'NoUniqueAnnee', 'NoUniqueType', 'NoUniqueChrono', 'PropoSplittee', 'Suite2LecturePE', 'Council PATH', 'Notes']
['1999', '1', '1', '1999/83/EC: Council Decision of 18 January 1999 authorising the Kingdom of Denmark to apply or to continue to apply reductions in, or exemptions from, excise duties on certain mineral oils used for specific purposes, in accordance with the procedure provided for in Article 8(4) of Directive 92/81/EEC', 'U', '', '', '31999D0083', '1998', '577', 'COM', 'NULL', 'CS', 'NULL', '', '', '', 'Propos* are missing on Celex document']
['1999', '1', '2', '1999/81/EC: Council Decision of 18 January 1999 authorising the Kingdom of Spain to apply a measure derogating from Articles 2 and 28a(1) of the Sixth Directive (77/388/EEC) on the harmonisation of the laws of the Member States relating to turnover taxes', 'U', '', '', '31999D0081', '1998', '184', 'COM', 'NULL', 'CS', 'NULL', '', '', '', 'Propos* are missing on Celex document']
还可能值得注意的是我使用的是什么版本的 python。
[9:20am][wlynch@watermelon /tmp] python -V
Python 2.7.2