
I'm using Python and its MySQLdb module to import some measurement data into a MySQL database. The amount of data we have is quite large (currently about 250 MB of CSV files, with plenty more to come).

Currently I use cursor.execute(...) to import some metadata. This isn't problematic as there are only a few entries for these.

The problem is that when I try to use cursor.executemany() to import larger quantities of the actual measurement data, MySQLdb raises a

TypeError: not all arguments converted during string formatting

My current code is

def __insert_values(self, values):
    cursor = self.connection.cursor()
    cursor.executemany("""
        insert into values (ensg, value, sampleid)
        values (%s, %s, %s)""", values)
    cursor.close()

where values is a list of tuples containing three strings each. Any ideas what could be wrong with this?

Edit:

The values are generated by

yield (prefix + row['id'], row['value'], sample_id)

and then read into a list one thousand at a time, where row comes from iterating over a csv.DictReader.
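For reference, the generate-then-batch step described above could look roughly like the following. This is only a sketch: the helper names `generate_values` and `batches`, the in-memory sample CSV, and the values "ENSG" and "sample1" are made up for illustration, and the real code may differ.

```python
import csv
import io
from itertools import islice

def generate_values(reader, prefix, sample_id):
    """Yield one (ensg, value, sampleid) tuple per CSV row."""
    for row in reader:
        yield (prefix + row['id'], row['value'], sample_id)

def batches(iterable, size=1000):
    """Collect items from an iterable into lists of at most `size` items."""
    iterator = iter(iterable)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch

# Hypothetical sample data standing in for one of the real CSV files.
sample_csv = io.StringIO("id,value\n1,0.5\n2,0.7\n3,0.9\n")
reader = csv.DictReader(sample_csv)

# A batch size of 2 keeps the example small; the question uses 1000.
chunks = list(batches(generate_values(reader, "ENSG", "sample1"), size=2))
```

Each element of `chunks` is a list of three-string tuples that could be handed to `cursor.executemany()` as is.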


2 Answers


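In hindsight, this was a really stupid but hard-to-spot mistake. VALUES is a keyword in SQL, so the table name values needs to be quoted (with backticks in MySQL).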

def __insert_values(self, values):
    cursor = self.connection.cursor()
    # Backticks quote the table name so it is not read as the VALUES keyword.
    cursor.executemany("""
        insert into `values` (ensg, value, sampleid)
        values (%s, %s, %s)""", values)
    cursor.close()
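
One plausible way an unquoted table name leads to this particular TypeError is sketched below. It assumes the driver locates the VALUES (...) clause by pattern matching and then applies Python's % formatting to that clause for every row. The regular expression here is a simplified stand-in written for this example, not MySQLdb's actual code, and the helper `find_clause` is hypothetical.

```python
import re

# Simplified stand-in for a pattern that locates the VALUES (...) clause.
# Assumption: this is NOT the regular expression MySQLdb really uses.
values_clause = re.compile(r"\svalues\s*(\([^)]*\))", re.IGNORECASE)

unquoted = "insert into values (ensg, value, sampleid)\n values (%s, %s, %s)"
quoted = "insert into `values` (ensg, value, sampleid)\n values (%s, %s, %s)"

def find_clause(query):
    """Return the first parenthesised group that follows the word 'values'."""
    match = values_clause.search(query)
    return match.group(1) if match else None

# With the unquoted table name, the column list is mistaken for the clause.
unquoted_clause = find_clause(unquoted)
# With backticks, the first hit is the real placeholder list.
quoted_clause = find_clause(quoted)

row = ("ENSG1", "0.5", "sample1")
try:
    # The column list has no %s placeholders, so three arguments are left over.
    unquoted_clause % row
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# The placeholder list accepts exactly three arguments.
formatted = quoted_clause % row
```

Under these assumptions the wrong clause contains no placeholders at all, which would explain why the error appears even though every tuple has exactly three elements.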
answered 2009-06-16T14:54:47.840

The message you get indicates that inside the executemany() method, one of the conversions failed. Check your values list for a tuple with more than 3 elements.

For a quick verification:

max(map(len, values))

If the result is higher than 3, use a filter to locate the bad tuples:

[t for t in values if len(t) != 3]

or, if you need the index:

[(i,t) for i,t in enumerate(values) if len(t) != 3]
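
As a small illustration of these checks, here they are run against made-up data in which the second tuple has one element too many. The sample values are hypothetical, and the last step only mimics the % formatting that a driver might apply to each row.

```python
# Hypothetical sample data: the tuple at index 1 has four elements.
values = [
    ("ENSG1", "0.5", "sample1"),
    ("ENSG2", "0.7", "sample1", "extra"),
    ("ENSG3", "0.9", "sample1"),
]

# Length of the longest tuple; anything above 3 signals a problem.
longest = max(map(len, values))

# Offending tuples together with their positions in the list.
bad = [(i, t) for i, t in enumerate(values) if len(t) != 3]

# Formatting three placeholders with four arguments fails in plain Python.
try:
    "(%s, %s, %s)" % values[1]
    error_message = None
except TypeError as exc:
    error_message = str(exc)
```

Here `longest` is 4, `bad` holds the single offending tuple with its index 1, and `error_message` is the same text as in the question.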
answered 2009-06-10T11:02:20.623