I have a text file from which I parse one column of data, which gives me a large list (50 elements):

CLB, HNRG, LPI, MTDR, MVO, NRGY, PSE, PVR, RRC, WES, ACMP, ATLS, ATW, BP, BWP, COG, DGAS, DNR, EPB, EPL, EXLP, NOV, OIS, PNRG, SEP, APL, ARP, CVX, DMLP, DRQ, DWSN, EC, ECA, FTI, GLOG, IMO, LINE, NFX, OILT, PNG, QRE, RGP, RRMS, SDRL, SNP, TLP, VNR, XOM, XTXI, AHGP

Now I want a new line after every 10 elements in that list. My idea was to break the list onto a new line after every 10th comma. Here is my attempt:

import csv
import re

filename = input("Please enter file name to extract data from: ")
with open(filename) as f:
    next(f)
    data = f.readlines()

my_list2 = []
ticker_list = []
for line in data:
    my_list = line.split()
    my_list2.append(my_list[1])

for item in my_list2:
    ticker_list = ', '.join(my_list2)

count = 0
for item in ticker_list:
    if item == ",":
        count += 1
    if count == 10:
        ticker_list = [i.split('\n')[0] for i in ticker_list]

print (ticker_list)

##with open("ticker_data.txt", "w") as file:
##    file.write(', '.join(ticker_list))

But this doesn't seem to work. Can anyone suggest a solution that gives me this result in a txt file:

CLB, HNRG, LPI, MTDR, MVO, NRGY, PSE, PVR, RRC, WES, 
ACMP, ATLS, ATW, BP, BWP, COG, DGAS, DNR, EPB, EPL, 
EXLP, NOV, OIS, PNRG, SEP, APL, ARP, CVX, DMLP, DRQ, 
DWSN, EC, ECA, FTI, GLOG, IMO, LINE, NFX, OILT, PNG, 
QRE, RGP, RRMS, SDRL, SNP, TLP, VNR, XOM, XTXI, AHGP

Thanks. By the way, I'm using Python 3.

3 Answers

OK, using a file called rawdata.txt that looks like this:

CLB, HNRG, LPI, MTDR, MVO, NRGY, PSE, PVR, RRC, WES, ACMP, ATLS, ATW, BP, BWP, COG, DGAS, DNR, EPB, EPL, EXLP, NOV, OIS, PNRG, SEP, APL, ARP, CVX, DMLP, DRQ, DWSN, EC, ECA, FTI, GLOG, IMO, LINE, NFX, OILT, PNG, QRE, RGP, RRMS, SDRL, SNP, TLP, VNR, XOM, XTXI, AHGP

Here is a script that reads each line and splits it into rows with no more than 10 symbols per row:

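import csv

# Python 3: open the output in text mode; newline='' lets csv control line endings
with open('rawdata.txt') as f, open('ticker_data.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile)
    for line in f:
        # strip the trailing newline so it does not stick to the last symbol
        data = line.strip().split(', ')
        # slice the list into chunks of at most 10 symbols
        chunks = [data[x:x+10] for x in range(0, len(data), 10)]
        for chunk in chunks:
            writer.writerow(chunk)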

Which produces a file with this in it:

CLB,HNRG,LPI,MTDR,MVO,NRGY,PSE,PVR,RRC,WES
ACMP,ATLS,ATW,BP,BWP,COG,DGAS,DNR,EPB,EPL
EXLP,NOV,OIS,PNRG,SEP,APL,ARP,CVX,DMLP,DRQ
DWSN,EC,ECA,FTI,GLOG,IMO,LINE,NFX,OILT,PNG
QRE,RGP,RRMS,SDRL,SNP,TLP,VNR,XOM,XTXI,AHGP
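
The rows above have no space after each comma, whereas the question asks for ", " between symbols and at the end of every row but the last. A minimal sketch of that variant, using plain string joins instead of csv.writer (the function name format_rows and the short sample list are made up for illustration):

```python
def format_rows(symbols, n=10):
    """Join symbols with ', ' and break the line after every n symbols."""
    # slice the list into chunks of at most n symbols
    chunks = [symbols[i:i + n] for i in range(0, len(symbols), n)]
    # every row except the last ends with ', ', as in the requested output
    return ", \n".join(", ".join(chunk) for chunk in chunks)


symbols = ["CLB", "HNRG", "LPI", "MTDR", "MVO"]  # shortened sample data
text = format_rows(symbols, n=2)
print(text)
```

The resulting string can be passed to file.write, as in the commented-out lines at the end of the question's script.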
answered 2013-06-28T23:41:26.647

You can do it like this (Python 2):

import csv
from itertools import izip_longest

with open('/tmp/line.csv','r') as fin:
    cr=csv.reader(fin)
    n=10
    data=izip_longest(*[iter(list(cr)[0])]*n,fillvalue='')
    print '\n'.join(', '.join(t) for t in data)

With your data, this prints:

CLB, HNRG, LPI, MTDR, MVO, NRGY, PSE, PVR, RRC, WES
ACMP, ATLS, ATW, BP, BWP, COG, DGAS, DNR, EPB, EPL
EXLP, NOV, OIS, PNRG, SEP, APL, ARP, CVX, DMLP, DRQ
DWSN, EC, ECA, FTI, GLOG, IMO, LINE, NFX, OILT, PNG
QRE, RGP, RRMS, SDRL, SNP, TLP, VNR, XOM, XTXI, AHGP

Edit

Given the clarification (Py 3)

I would write your program like this:

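import csv
from itertools import zip_longest

n=10
# newline='' is what the csv docs recommend for files handed to csv.reader/csv.writer
with open('/tmp/rawdata.txt','r',newline='') as fin, open('/tmp/out.csv','w',newline='') as fout:
    reader=csv.reader(fin)
    writer=csv.writer(fout)
    # flatten every input row into a single stream of elements
    source=(e for line in reader for e in line)
    # the same generator repeated n times yields groups of n, padded with None at the end
    for t in zip_longest(*[source]*n):
        # drop the None padding (and any empty fields) before writing
        writer.writerow([e for e in t if e])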

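Changes:

  1. Output goes to a file;
  2. The source of the elements is a generator;
  3. The source is processed item by item, no matter how many lines there are or how many comma-separated elements each line has (subject to what csv considers an element);
  4. Whatever n is, each output row is n elements long, until the last row, which may have fewer than n.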
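
The grouping trick may be easier to follow on a small input. A minimal sketch, with a made-up seven-item list and n = 3, showing what zip_longest returns when it is handed the same iterator n times and why the padding is filtered out before writing:

```python
from itertools import zip_longest

items = ["A", "B", "C", "D", "E", "F", "G"]  # made-up sample data
n = 3

# one iterator, referenced n times: each output tuple pulls n items in turn
source = iter(items)
groups = list(zip_longest(*[source] * n))
print(groups)

# dropping the None padding leaves a short final row
rows = [[e for e in group if e is not None] for group in groups]
print(rows)
```

The last tuple is padded with None up to length n, which is why the program above filters each tuple before calling writerow.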
answered 2013-06-29T00:39:48.090

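Another option is to use slicing and range (xrange in Python 2):

import csv

# ticker_list must be the list of symbols here, not the joined string
with open("output.txt", "w", newline="") as f:
    writer = csv.writer(f)
    for x in range(0, len(ticker_list), 10):
        writer.writerow(ticker_list[x:x+10])

range gives us the numbers from 0 up to the length of the list in steps of 10. For each of those indices we write a slice of length 10, starting at that index, to the CSV file. csv.writer takes care of adding the comma separators and so on.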
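
To make the index arithmetic concrete, here is a small sketch with a hypothetical 25-element list standing in for the symbols. It shows the start indices that the stepped range produces and that the last slice is simply shorter:

```python
items = list(range(25))  # hypothetical stand-in for the list of symbols

# start index of every row
starts = list(range(0, len(items), 10))
print(starts)

# a slice that runs past the end is cut short, so no bounds check is needed
sizes = [len(items[x:x + 10]) for x in starts]
print(sizes)
```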

answered 2013-06-28T22:41:02.713