0

我有这个 csv 文件,我想在第 20 和第 21 字段进行排序。例如,这些字段中的数据是 P1,PK5。我的挑战是,当我对这些字段进行排序时,它们并没有像我希望的那样排列。似乎我必须将这些字段填充到该字段数据中的最长值。

OrderNum,MerrillRecipientID,CustomerClass,MerrillItemNum,PODTemplateID,GridCode,AetnaDocID,MemberID,FirstName,MI,LastName,Address1,Address2,Address3,City,State,Zip,Country,OEL,PalletNum,PckgNum,IMBCode,ProcDate
"M394993","M39499300010000001","0GH","3GH000503","PDP","BO","1011250","MEBB04CB","Name","","Name","address","","","City","SC","29170-2043","","*******AUTO**SCH 5-DIGIT 29033","P1","PK5","2031100094470495539729170204309","3GH000503","August 26, 2013"
"M394993","M39499300010000002","0GH","3GH000503","PDP","BO","1011572","MEBB07GB","Name","G","Name","address","","","City","SC","29020-2912","","*********AUTO**SCH 3-DIGIT 290","P1","PK1","3031100094470495580529020291210","3GH000503","August 26, 2013"
"M394993","M39499300010000003","0GH","3GH000503","PDP","BO","1011693","MEBB08MP","Name","B","Name","address","","","City","SC","29061-9447","","*********AUTO**SCH 3-DIGIT 290","P1","PK2","3031100094470495583729061944757","3GH000503","August 26, 2013"
"M394993","M39499300010000004","0GH","3GH000503","PDP","BO","1011751","MEBB097M","Name","A","Name","address","","","City","SC","29645-0433","","*************AUTO**3-DIGIT 296","P1","PK31","3031100094470495629629645043333","3GH000503","August 26, 2013"
"M394993","M39499300010000005","0GH","3GH000503","PDP","BO","1012075","MEBB0K4L","Name","E","Name","address","","","City","SC","29682-9634","","*************AUTO**3-DIGIT 296","P1","PK33","3031100094470495637929682963428","3GH000503","August 26, 2013"
"M394993","M39499300010000006","0GH","3GH000503","PDP","BO","1012437","MEBB0TWQ","Name","R","Name","address","","","City","SC","29505-3030","","*******AUTO**SCH 5-DIGIT 29501","P1","PK24","2031100094470495556429505303050","3GH000503","August 26, 2013"
"M394993","M39499300010000007","0GH","3GH000503","PDP","BO","1012750","MEBB0YJY","Name","L","Name","address","","","City","SC","29642-3006","","***********AUTO**5-DIGIT 29642","P1","PK38","2031100094470495567529642300601","3GH000503","August 26, 2013"

所以从上面的数据我需要让文件看起来像这样:

OrderNum,MerrillRecipientID,CustomerClass,MerrillItemNum,PODTemplateID,GridCode,AetnaDocID,MemberID,FirstName,MI,LastName,Address1,Address2,Address3,City,State,Zip,Country,OEL,PalletNum,PckgNum,IMBCode,ProcDate
"M394993","M39499300010000001","0GH","3GH000503","PDP","BO","1011250","MEBB04CB","Name","","Name","address","","","City","SC","29170-2043","","*******AUTO**SCH 5-DIGIT 29033","P1","PK05","2031100094470495539729170204309","3GH000503","August 26, 2013"
"M394993","M39499300010000002","0GH","3GH000503","PDP","BO","1011572","MEBB07GB","Name","G","Name","address","","","City","SC","29020-2912","","*********AUTO**SCH 3-DIGIT 290","P1","PK01","3031100094470495580529020291210","3GH000503","August 26, 2013"
"M394993","M39499300010000003","0GH","3GH000503","PDP","BO","1011693","MEBB08MP","Name","B","Name","address","","","City","SC","29061-9447","","*********AUTO**SCH 3-DIGIT 290","P1","PK02","3031100094470495583729061944757","3GH000503","August 26, 2013"
"M394993","M39499300010000004","0GH","3GH000503","PDP","BO","1011751","MEBB097M","Name","A","Name","address","","","City","SC","29645-0433","","*************AUTO**3-DIGIT 296","P1","PK31","3031100094470495629629645043333","3GH000503","August 26, 2013"
"M394993","M39499300010000005","0GH","3GH000503","PDP","BO","1012075","MEBB0K4L","Name","E","Name","address","","","City","SC","29682-9634","","*************AUTO**3-DIGIT 296","P1","PK33","3031100094470495637929682963428","3GH000503","August 26, 2013"
"M394993","M39499300010000006","0GH","3GH000503","PDP","BO","1012437","MEBB0TWQ","Name","R","Name","address","","","City","SC","29505-3030","","*******AUTO**SCH 5-DIGIT 29501","P1","PK24","2031100094470495556429505303050","3GH000503","August 26, 2013"
"M394993","M39499300010000007","0GH","3GH000503","PDP","BO","1012750","MEBB0YJY","Name","L","Name","address","","","City","SC","29642-3006","","***********AUTO**5-DIGIT 29642","P1","PK38","2031100094470495567529642300601","3GH000503","August 26, 2013"

P1 字段可能是 P100,所以我需要将 P1 填充到 P001。但实际上它只需要是最大长度。我可以在两个字段上对文件进行排序,但不确定如何填充它们。

在此先感谢您的帮助。

4

1 回答 1

1

好的,因为没有其他内容即将发布,这里有一个快速的 python(2.x 或 3.x)脚本,可以满足您的需要:

import sys
import csv

reader = csv.reader(sys.stdin)
writer = csv.writer(sys.stdout, quoting=csv.QUOTE_ALL)

rows = [row for row in reader]
max_len = max([len(row[20]) for row in rows[1:]])

writer.writerow(rows[0])
for row in rows[1:]:
    while len(row[20]) < max_len:
        row[20] = 'PK0' + row[20][2:]
    writer.writerow(row)

如果您将其另存为 ,pad.py那么您可以像这样使用它:

$ cat /path/to/my_csv_file.csv | python /path/to/pad.py > /path/to/my_new_csv_file.csv

并将my_new_csv_file.csv以您需要的格式创建。由于脚本作用于stdin并输出到stdout,因此您可以以多种不同的方式使用它来满足您的目的。

希望这可以帮助。

于 2013-08-29T01:18:01.597 回答