I'm writing a program that reads a CSV file containing data like this:

"10724_artifact11679.jpg","H. 3 1/4 in. (8.26 cm)","10.210.114","This artwork is currently on display in Gallery 171","11679"

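and writes it to an HTML table. I only want the rows whose field at index 3 says "This artwork is not on display". But I keep running into a problem with this data set.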

import csv

metlist4 = []

newList = csv.reader(open("v2img_10724_list.csv", 'r')) 
for row in newList:   
    metlist4.append(row)  

artifact_template = """<td>
    <div>
    <img src= "%(image)s" alt = "artifact" />
    <p>Dimensions: %(dimension)s </p>
    <p>Accession #: %(accession)s </p>
    <p>Display: %(display)s </p>
    <p>index2: %(index2)s </p>
    </div>
    </td>"""

html_list = []

count = 5794
for artifact in metlist4:
    if artifact[3] in ["This artwork is not on display"]:
        artifactinfo = {}
        artifactinfo["image"] = artifact[0]
        artifactinfo["dimension"] = artifact[1]
        artifactinfo["accession"] = artifact[2]
        artifactinfo["display"] = artifact[3]
        artifactinfo["index2"] = count
        count = count + 1
        html_list.append(artifact_template % artifactinfo)
    else:
        pass

f = open("v3display_test.txt", "w")
f.write("\n".join(html_list))
f.close()  

I get this error, but only when I run it over the whole of metlist4:

  File "/Users/Rose/Documents/workspace/METProjectFOREAL/src/no_display_Met4.py", line 34, in <module>
    if artifact[3] in ["This artwork is not on display"]:
IndexError: list index out of range

If I only run a slice, e.g. metlist4[0:500], the error doesn't occur. Any ideas or suggestions would be greatly appreciated! Thanks!

1 Answer

At least one row doesn't have a fourth element. Perhaps the line is empty.

Test the length, and print the offending row to check:

if len(artifact) < 4:
    print('short row', artifact)
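
To see where in the file the short rows are, the same length check can be combined with the reader's line_num attribute. The following is only a sketch: the helper name find_short_rows and the sample data are made up for illustration, and an in-memory io.StringIO stands in for the real CSV file.

```python
import csv
import io


def find_short_rows(lines, min_fields=4):
    """Return (line_number, row) pairs for rows with fewer than min_fields fields."""
    reader = csv.reader(lines)
    short = []
    for row in reader:
        if len(row) < min_fields:
            # line_num is the number of source lines read so far
            short.append((reader.line_num, row))
    return short


# Made-up sample: one complete row, one blank line, one truncated row.
sample = io.StringIO(
    '"a.jpg","H. 1 in.","1.1","This artwork is not on display","1"\n'
    '\n'
    '"b.jpg","H. 2 in."\n'
)
problems = find_short_rows(sample)
print(problems)
```

With the real file, passing the open file object instead of the StringIO should point at the lines past row 500 that trigger the IndexError.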

If it is an empty row, just skip it:

if not artifact: continue
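
If the file might also contain truncated rows rather than only blank ones, a length check covers both cases. A minimal sketch, assuming rows are plain lists as produced by csv.reader(); the function name rows_not_on_display and the sample rows are hypothetical.

```python
def rows_not_on_display(rows, display_index=3,
                        wanted="This artwork is not on display"):
    """Keep only rows whose display field equals `wanted`."""
    kept = []
    for row in rows:
        # Skips blank rows ([]) and rows too short to have a display field
        if len(row) <= display_index:
            continue
        if row[display_index] == wanted:
            kept.append(row)
    return kept


# Made-up rows: a match, a blank row, a truncated row, and a non-match.
rows = [
    ["a.jpg", "H. 1 in.", "1.1", "This artwork is not on display", "1"],
    [],
    ["b.jpg", "H. 2 in."],
    ["c.jpg", "H. 3 in.", "3.3",
     "This artwork is currently on display in Gallery 171", "3"],
]
kept = rows_not_on_display(rows)
print(kept)
```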

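You are using a lot of verbose and redundant code; there is no need to build a separate list when you can loop directly over the csv.reader() object, and no need for the empty else: pass block.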

Idiomatic Python code would be:

import csv

artifact_template = """<td>
    <div>
    <img src= "%(image)s" alt = "artifact" />
    <p>Dimensions: %(dimension)s </p>
    <p>Accession #: %(accession)s </p>
    <p>Display: %(display)s </p>
    <p>index2: %(index2)s </p>
    </div>
    </td>"""

html_list = []

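fields = 'image dimension accession display'.split()

# newline='' is how the csv module expects files to be opened in Python 3
with open("v2img_10724_list.csv", newline='') as inputfile:
    # Columns beyond the four named fields are collected under '_ignored'
    reader = csv.DictReader(inputfile, fieldnames=fields, restkey='_ignored')
    for count, artifact in enumerate(reader, 5794):
        # DictReader skips blank lines and fills missing fields with None
        if artifact['display'] == "This artwork is not on display":
            artifact["index2"] = count
            html_list.append(artifact_template % artifact)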

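This uses a csv.DictReader() to create a dictionary for each row, a with statement to ensure the file is closed when done, and enumerate() with a start value to keep track of count.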
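
One thing to be aware of: enumerate() numbers every row the reader yields, whereas the loop in the question only advanced count for rows that were kept. If index2 should stay consecutive across the kept rows, a manual counter is one option. This is a sketch under assumptions: the shortened TEMPLATE, the render_rows helper and the sample data are invented for illustration, and io.StringIO stands in for the real file.

```python
import csv
import io

# Shortened stand-in for the HTML template, for illustration only.
TEMPLATE = "<td>%(image)s | %(display)s | %(index2)s</td>"
FIELDS = ["image", "dimension", "accession", "display"]
WANTED = "This artwork is not on display"


def render_rows(lines, start=5794):
    """Render matching rows, numbering only the rows that are kept."""
    reader = csv.DictReader(lines, fieldnames=FIELDS, restkey="_ignored")
    html = []
    count = start
    for artifact in reader:
        # Blank lines are skipped by DictReader; short rows get None here
        if artifact["display"] != WANTED:
            continue
        artifact["index2"] = count
        html.append(TEMPLATE % artifact)
        count += 1  # advance only for rows that were written
    return html


sample = io.StringIO(
    '"a.jpg","H. 1 in.","1.1","This artwork is currently on display in Gallery 171","1"\n'
    '\n'
    '"b.jpg","H. 2 in."\n'
    '"c.jpg","H. 3 in.","3.3","This artwork is not on display","3"\n'
    '"d.jpg","H. 4 in.","4.4","This artwork is not on display","4"\n'
)
html = render_rows(sample)
print("\n".join(html))
```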

Answered 2013-06-06T21:56:05.050