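I have two scripts that each create a new column in a CSV: each one opens the CSV and appends a new column. Ideally, instead of saving the CSV to csv1, then opening csv1 and re-saving it as csv2, I would like to do this in one step.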

Script 1

with open("inputcsv1.csv", "r") as input_file:
    header = input_file.readline()[:-1] #this is to remove trailing '\n'
    header += ",Table exists?"
    output_lines = [header]

    for line in input_file:
         output_lines.append(line[:-1])
         if 'table' in line.split(",")[3]:
             output_lines[-1]+=",table exists"
         else:
             output_lines[-1]+=",No table found"

with open("outputcsv1.csv", "w") as output_file:
    output_file.write("\n".join(output_lines))   

Script 2

with open("outputcsv1.csv", "r") as input_file:
    header = input_file.readline()[:-1] #this is to remove trailing '\n'
    header += ",Are you sure Table exists?"
    output_lines = [header]

    for line in input_file:
         output_lines.append(line[:-1])
         if 'table' in line.split(",")[3]:
             output_lines[-1]+=",table definitely exists"
         else:
             output_lines[-1]+=",No table was not found"

with open("outputcsv2.csv", "w") as output_file:
   output_file.write("\n".join(output_lines))   

The two scripts above are the ones I run against a very simple example CSV.

Example inputcsv1.csv

title1,title2,title3,Table or no table?,title4
data,text,data,the cat sits on the table,text,data
data,text,data,tables are made of wood,text,data
data,text,data,the cat sits on the television,text,data
data,text,data,the dog chewed the table leg,text,data
data,text,data,random string of words,text,data
data,text,data,table seats 25 people,text,data
data,text,data,I have no idea why I made this example about tables,text,data
data,text,data,,text,data

Desired output CSV:

title1,title2,title3,Table or no table?,title4,Table exists?,Are you sure Table exist
data,text,data,the cat sits on the table,text,data,table exists,table definitely exists
data,text,data,tables are made of wood,text,data,table exists,table definitely exists
data,text,data,the cat sits on the television,text,data,No table found,No table was not found
data,text,data,the dog chewed the table leg,text,data,table exists,table definitely exists
data,text,data,random string of words,text,data,No table found,No table was not found
data,text,data,table seats 25 people,text,data,table exists,table definitely exists
data,text,data,I have no idea why I made this example about tables,text,data,table exists,table definitely exists
data,text,data,,text,data,No table found,No table was not found

To merge the two scripts, I tried the following code:

with open("inputcsv1.csv", "r") as input_file:
    header = input_file.readline()[:-1] #this is to remove trailing '\n'
    header2 = input_file.readline()[:-2] #this is to remove trailing '\n'
    header += ",Table exists?"
    header2 += ",Are you sure table exists?"
    output_lines = [header]
    output_lines2 = [header2]

    for line in input_file:
        output_lines.append(line[:-1])
        if 'table' in line.split(",")[3]:
            output_lines[-1]+=",table exists"
        else:
            output_lines[-1]+=",No table found"

    for line in input_file:
        output_lines.append(line[:-2])
        if 'table' in line.split(",")[3]:
            output_lines2[-2]+=",table definitely exists"
        else:
            output_lines2[-2]+=",No table was not found"

with open("TestMurgedOutput.csv", "w") as output_file:
    output_file.write("\n".join(output_lines).join(output_lines2))

It does not raise an error, but it only writes the following to the new CSV.

data,text,data,the cat sits on the table,text,dat,Are you sure table exists?

I'm not sure why, although I have my doubts about my use of .join. Any constructive comments would be appreciated.

2 Answers

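I think this is close to what you're looking for. It is what I meant by putting the if statements from both scripts inside a single for loop. It could be optimized, but I tried to keep it simple so you can easily see what is being done.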

with open("inputcsv1.csv", "rt") as input_file:
    header = input_file.readline().rstrip("\n")  # remove trailing newline
    # add a title to the header for each of the two new columns
    header += ",Table exists?,Are you sure table exists?"
    output_lines = [header]

    for line in input_file:
        line = line.rstrip("\n")  # remove trailing newline, if any (the last line may lack one)
        cols = line.split(',')  # split line in columns based on delimiter
        # add first column
        if 'table' in cols[3]:
            line += ",table exists"
        else:
            line += ",No table found"
        # add second column
        if 'table' in cols[3]:
            line += ",table definitely exists"
        else:
            line += ",No table was not found"
        output_lines.append(line)

with open("TestMurgedOutput.csv", "wt") as output_file:
    output_file.write("\n".join(output_lines))

Contents of the TestMurgedOutput.csv file it creates:

title1,title2,title3,Table or no table?,title4,Table exists?,Are you sure table exists?
data,text,data,the cat sits on the table,text,data,table exists,table definitely exists
data,text,data,tables are made of wood,text,data,table exists,table definitely exists
data,text,data,the cat sits on the television,text,data,No table found,No table was not found
data,text,data,the dog chewed the table leg,text,data,table exists,table definitely exists
data,text,data,random string of words,text,data,No table found,No table was not found
data,text,data,table seats 25 people,text,data,table exists,table definitely exists
data,text,data,I have no idea why I made this example about tables,text,data,table exists,table definitely exists
data,text,data,,text,data,No table found,No table was not found
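
As a possible variant of the single-loop approach above, here is a minimal sketch that uses the standard csv module instead of splitting on commas by hand. The helper name add_table_columns is hypothetical, and the sample data is just a shortened copy of the example input held in memory. The sketch also assumes that a row with fewer than four fields should count as "No table found".

```python
import csv
import io


def add_table_columns(rows):
    """Yield each row with the two extra columns appended.

    `rows` is an iterable of lists with the header first
    (hypothetical helper, not part of the original scripts).
    """
    rows = iter(rows)
    header = next(rows, None)
    if header is None:  # empty input: nothing to write
        return
    yield header + ["Table exists?", "Are you sure table exists?"]
    for row in rows:
        # Assumption: a row too short to have a fourth field has no table.
        found = len(row) > 3 and "table" in row[3]
        yield row + [
            "table exists" if found else "No table found",
            "table definitely exists" if found else "No table was not found",
        ]


# In-memory demonstration with a shortened copy of the example input.
sample = io.StringIO(
    "title1,title2,title3,Table or no table?,title4\n"
    "data,text,data,the cat sits on the table,text,data\n"
    "data,text,data,random string of words,text,data\n"
)
result = io.StringIO()
writer = csv.writer(result, lineterminator="\n")
writer.writerows(add_table_columns(csv.reader(sample)))
print(result.getvalue())
```

With real files, the same generator can be fed from csv.reader(open("inputcsv1.csv", newline="")) and written with csv.writer on a file opened with newline="", which is how the csv module documentation recommends opening files. The benefit over splitting on "," is that quoted fields containing commas are parsed correctly.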

Answered 2013-10-29T23:55:11.293

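Your output_lines2 list contains only one element (because all the lines in the file are read in the first for loop), so join has no effect on it, and the write statement outputs just that single element of output_lines2. Try this: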

with open("test.csv", "r") as input_file:
    header = input_file.readline()[:-1] #this is to remove trailing '\n'
    header += ",Table exists?"
    header += ",Are you sure Table exists?"
    output_lines = [header]
    for line in input_file:
        output_lines.append(line[:-1])
        if 'table' in line.split(",")[3]:
            output_lines[-1]+=",table exists"
        else:
            output_lines[-1]+=",No table found"
        if 'table' in line.split(",")[3]:
            output_lines[-1]+=",table definitely exists"
        else:
            output_lines[-1]+=",No table was not found"
with open("output.csv", "w") as output_file:
    output_file.write("\n".join(output_lines))
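
To make the explanation above concrete, here is a small sketch of the two effects at work in the merged attempt. It uses an in-memory io.StringIO in place of the real file, and the variable contents are made up for illustration.

```python
import io

# A file object is a one-shot iterator: once the first loop has consumed it,
# a second loop over the same object sees no lines at all.
input_file = io.StringIO("header\nrow 1\nrow 2\n")  # stands in for an open file
first_pass = [line for line in input_file]
second_pass = [line for line in input_file]
print(len(first_pass), len(second_pass))  # 3 0

# str.join uses the string it is called on as the separator between the
# elements of its argument, so a one-element list comes back unchanged.
output_lines = ["a", "b", "c"]
output_lines2 = ["only element"]
merged = "\n".join(output_lines).join(output_lines2)
print(merged)  # only element
```

So in the merged script the second for loop never runs, and the chained .join call discards everything collected in output_lines because it is only ever used as a separator.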
Answered 2013-10-29T22:47:06.790