It might be easier to explain if I show you the .csv file I am trying to manipulate:

https://www.dropbox.com/s/4kms4hm28y7sv8w/Test1.csv

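I have hundreds of rows of data like this, but we have decided that we want it in a different format, with each fossil genus and species (columns W, X and Y) on its own row.

My knowledge of Python is very limited, but I wanted to try using it to split these cells and insert each value into a row beneath. I was planning to then move them into the correct columns by hand and fill down the other details in Excel.

Code: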

#nektonic=[row[22].split(',') for row in data]
#infaunal=[row[23].split(',') for row in data]  
#epifaunal=[row[24].split(',') for row in data] 

f=0
r=0

def splitfossils(f, r): 
    #f=0 #fossil index: counter that moves the selection along the fossils in a cell that are being split by commas
    for row in data:
        r=(data.index(row)+1) #row index: counter so that split fossils can be inserted beneath the row that is being processed; the +1 is to ensure that the counter starts on 1, not 0.
        if row[22] == '':
            continue #if no fossils are found, move onto the next row
        else:
            nektonic=[row[22].split(',')] #nektonic fossils are found to be in the 23rd column of the spreadsheet
            if len(nektonic) == 1:
                data.insert(r,(nektonic[f])) #if only one fossil is present in the nektonic list, insert only that fossil and do not increase counter number 
            else:
                while f < len(nektonic): #the while loop will loop until the split fossils have been processed
                    data.insert(r,(nektonic[f])) #each split fossil will be inserted into a row below                                   
                    f=f+1 #the fossil index moves on to the next fossil
                    r=r+1 #the next fossil will be inserted into the row below the previous fossil
                    return f
                    return r


splitfossils(f, r)

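The current error message is "list index out of range" (highlighting lines 19 and 34).

I have tried passing various variables through the function to see whether that makes a difference, but the error I had before was that the 'for' loop would not iterate. The length of the 'data' list is 29, but the only thing printed when I print nektonic[f] is 'Stomohamites Simplex', which is the only value in cell W1 of the spreadsheet.

I'm not sure whether all these loops within loops will work; as I said, my knowledge is very basic. Can anyone tell me what is wrong with the code, and what a simpler way of solving this would be?

Thanks

Edit: I ended up changing my approach. It works now, thank you very much for all your help.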

import csv

out=open("Test1.csv", "rb")
data=csv.reader(out)
data=[row for row in data]
out.close() 

nektonic=[]

def splitfossils(): 
    for row in data:        
        nektonic=row[22].split(',')
        if len(nektonic)>1:
            for fossil in nektonic:
                newrow=[0 for i in range(22)]
                newrow.append(fossil)
                output.writerow(newrow)

        else:
            output.writerow(row)
    return data

out=open("new_test2.csv", "wb")
output=csv.writer(out)
splitfossils()

2 Answers

In Python, indentation matters. So the code

while f < len(nektonic): #the while loop will loop until the split fossils have been processed
    data.insert(r,(nektonic[f])) #each split fossil will be inserted into a row below                                   
    f=f+1 #the fossil index moves on to the next fossil
    r=r+1 #the next fossil will be inserted into the row below the previous fossil
    return f
    return r

returns after a single iteration, because return f is hit immediately. You probably meant to indent it less (both returns, actually).
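To illustrate the effect, here is a minimal sketch (written for Python 3, with a made-up `fossils` list rather than your data; the function names are hypothetical) comparing a `return` inside the loop body with one placed after the loop:

```python
def count_inside(fossils):
    """The return sits inside the loop body, so the loop ends after one pass."""
    f = 0
    while f < len(fossils):
        f = f + 1
        return f  # reached on the very first iteration


def count_after(fossils):
    """The return sits after the loop, so every item is processed first."""
    f = 0
    while f < len(fossils):
        f = f + 1
    return f


fossils = ["A", "B", "C"]  # hypothetical sample values
print(count_inside(fossils))
print(count_after(fossils))
```

Note also that a function stops at the first return it reaches, so a second return placed directly after it never runs. To hand back both counters you would return them together, for example `return f, r`.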

That said, in Python you don't need indices to iterate over a list; you can simply do:

for fossil in nektonic:
    data.insert(r, fossil)

The same goes for the outer loop that iterates over the rows.
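As a rough sketch of both loops written that way (Python 3; the `data` rows and the column index 1 are made up for illustration, whereas your file uses column 22), `enumerate` can supply the row number if you still need it, without calling `data.index(row)`:

```python
# Hypothetical sample rows: a location name and a comma-separated fossil cell.
data = [
    ["loc1", "Alpha one,Beta two"],
    ["loc2", "Gamma three"],
]

FOSSIL_COL = 1  # assumption for this sample; the real spreadsheet uses 22

found = []
# enumerate yields (row number, row) pairs; start=1 makes the count begin at 1.
for r, row in enumerate(data, start=1):
    for fossil in row[FOSSIL_COL].split(","):
        found.append((r, fossil))

for item in found:
    print(item)
```

Unlike `data.index(row)`, which returns the position of the first matching row, `enumerate` still gives the right number when two rows happen to be identical.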

Answered 2013-08-02T13:43:38.637

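The problem is that you are trying to modify the list you are iterating over. I don't think that is a good approach in Python. Try copying your data into a new list instead (this is memory-efficient, because the objects are referenced rather than copied). Something like this: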

import csv

out=open("Test1.csv", "rb")
data=csv.reader(out)
data=[row for row in data]
out.close()    

#nektonic=[row[22].split(',') for row in data]
#infaunal=[row[23].split(',') for row in data] 
#epifaunal=[row[24].split(',') for row in data]

def splitfossils():
    result = []
    for row in data:
        if row[22] == '':
            continue #if no fossils are found, move onto the next row
        else:
            nektonic=row[22].split(',') #split() already returns a list, so no extra brackets are needed
            result.append(row)
            result.append(nektonic)
    return result


print splitfossils()

I'm not sure whether the code above is a direct answer to your question, but try approaching it this way...
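For what it's worth, here is a slightly fuller sketch of the same build-a-new-list idea, written for Python 3 rather than the Python 2 used above. The sample text, the function name `split_fossil_rows` and the column index 1 are all made up for illustration (your file would use column 22), and copying the other columns into each new row is my assumption about what you want, not something the question states:

```python
import csv
import io


def split_fossil_rows(rows, col):
    """Return a new list with one row per fossil name found in column `col`."""
    result = []
    for row in rows:
        # Split the cell on commas and drop empty or whitespace-only pieces.
        fossils = [name.strip() for name in row[col].split(",") if name.strip()]
        if len(fossils) <= 1:
            result.append(list(row))  # nothing to split, keep the row as it is
            continue
        for fossil in fossils:
            new_row = list(row)       # copy so the input row is left untouched
            new_row[col] = fossil
            result.append(new_row)
    return result


# Hypothetical sample standing in for Test1.csv.
sample = '''Site A,"Genus one,Genus two"
Site B,Genus three
Site C,
'''

rows = list(csv.reader(io.StringIO(sample)))
new_rows = split_fossil_rows(rows, col=1)

# Write to an in-memory buffer here; a real script would open an output file.
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\n").writerows(new_rows)
print(buffer.getvalue())
```

Because the input list is never modified while it is being looped over, there are no insert positions to keep track of. This version also keeps rows that have no fossils instead of skipping them, which may or may not be what you need.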

Answered 2013-08-02T13:50:00.010