2

I have a Python script that uses executemany to bulk insert rows into a MySQL table. The data is retrieved from different APIs, so every now and then unexpected data causes a row to raise an exception.

If I understand correctly, when executemany is called with 1,000 rows and one of them is problematic, the entire batch is not inserted.

I want to find a way to submit 1,000 records and successfully load the ones that are not problematic. So for example - if one of a thousand is problematic it will not be loaded, but all other 999 will be loaded.

What's the best practice on that? I'm thinking of catching an exception and creating a fallback to re-submit all 1000 one by one - but it seems like there must be a better way to achieve the same outcome.

Advice?

4

2 Answers

4

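Starting your "executemany" query with "INSERT OR IGNORE" lets you do exactly this: only the rows that don't raise an error are added. (That spelling is SQLite's; the MySQL equivalent is "INSERT IGNORE".)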

The only downside is that you can no longer see which errors occurred. For example:

Original database:

('kaushik', 3)
('maria', 4)
('shreya', 38)

Query (in Python):

listofnames = [
('kaushik', 3),
('maria', 4),
('jane', 56)
]

c.executemany("INSERT OR IGNORE INTO bob (name, number) VALUES (?,?)", 
listofnames)

Final database:

('kaushik', 3)
('maria', 4)
('shreya', 38)
('jane', 56)
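
For anyone who wants to try this locally, here is a minimal runnable sketch of the same example. It uses Python's built-in sqlite3 module so it needs no server, and it assumes a UNIQUE constraint on the name column (the answer does not say which constraint rejects the duplicates). With a MySQL driver, the statement would start with INSERT IGNORE and the placeholders would be %s rather than ?.

```python
import sqlite3

# In-memory database; the UNIQUE constraint on name is an assumption
# made so that the duplicate rows have something to conflict with.
conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.execute("CREATE TABLE bob (name TEXT UNIQUE, number INTEGER)")

# Original database contents.
c.executemany(
    "INSERT INTO bob (name, number) VALUES (?, ?)",
    [("kaushik", 3), ("maria", 4), ("shreya", 38)],
)

listofnames = [
    ("kaushik", 3),  # duplicate, silently skipped
    ("maria", 4),    # duplicate, silently skipped
    ("jane", 56),    # new, inserted
]

# Rows that violate the constraint are ignored instead of raising.
c.executemany(
    "INSERT OR IGNORE INTO bob (name, number) VALUES (?, ?)",
    listofnames,
)
conn.commit()

rows = c.execute("SELECT name, number FROM bob ORDER BY rowid").fetchall()
print(rows)
```

Note that in MySQL, IGNORE turns the suppressed errors into warnings, and invalid values may be adjusted and inserted rather than skipped, so it is worth checking what ends up in the table.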
answered 2020-07-22T17:41:48.940
0

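When inserting, executemany batches all the data rows together and tries to insert them all with a single command. As far as I know, there is no way to handle the exception raised by one failed insert without failing the whole batch. If one row fails, the entire command fails.

Here is what it looks like (example taken from the MySQL documentation). If you tell it to do this: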

data = [
  ('Jane', date(2005, 2, 12)),
  ('Joe', date(2006, 5, 23)),
  ('John', date(2010, 10, 3)),
]
stmt = "INSERT INTO employees (first_name, hire_date) VALUES (%s, %s)"
cursor.executemany(stmt, data)

executemany will do this:

INSERT INTO employees (first_name, hire_date)
VALUES ('Jane', '2005-02-12'), ('Joe', '2006-05-23'), ('John', '2010-10-03')
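
The all-or-nothing behaviour of a single multi-row INSERT can be seen in a small sketch. It uses sqlite3 as a stand-in so it runs without a MySQL server, and the NOT NULL constraints and the bad third row are assumptions added to force a failure; the multi-row statement is written out by hand here rather than produced by executemany.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# NOT NULL constraints are an assumption, added so one row can fail.
conn.execute(
    "CREATE TABLE employees (first_name TEXT NOT NULL, hire_date TEXT NOT NULL)"
)

# One statement carrying three rows, like the batched INSERT above.
multi_row_stmt = (
    "INSERT INTO employees (first_name, hire_date) "
    "VALUES (?, ?), (?, ?), (?, ?)"
)
# The third row has a NULL first_name and violates the constraint.
params = ("Jane", "2005-02-12", "Joe", "2006-05-23", None, "2010-10-03")

error = None
try:
    conn.execute(multi_row_stmt, params)
except sqlite3.IntegrityError as exc:
    error = exc

# The two valid rows were part of the same statement, so they are gone too.
count = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]
print(error, count)
```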

If you think this will be a rare occurrence, then your idea of retrying each insert individually will work. Something like:

try:
    cursor.executemany(stmt, data)
except ___Error:  # fill in the blank
    for datum in data:
        try:
            cursor.execute(stmt, datum)
        except ___Error:
            # handle exception, eg print warning
            ...

If you think this will be a common problem, then it may be more efficient to give up on executemany and do the following:

for datum in data:
    try:
        cursor.execute(stmt, datum)
    except ___Error:
        # handle exception, eg print warning
        ...
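
As a rough end-to-end sketch of the fallback idea, the version below can be run as-is. It uses sqlite3 as a stand-in for the MySQL driver, so sqlite3.Error takes the place of the blank error class, and insert_with_fallback is a hypothetical helper name, not part of any library. The explicit rollback before retrying is there because, depending on the driver and transaction settings, part of a failed batch may already have been applied.

```python
import sqlite3


def insert_with_fallback(conn, stmt, rows):
    """Try one bulk insert; if it fails, retry row by row.

    Returns a list of (row, error message) pairs for rejected rows.
    Hypothetical helper: swap sqlite3.Error for your driver's error class.
    """
    cur = conn.cursor()
    try:
        cur.executemany(stmt, rows)
        conn.commit()
        return []
    except sqlite3.Error:
        # Undo anything the failed batch managed to insert before retrying.
        conn.rollback()

    failed = []
    for row in rows:
        try:
            cur.execute(stmt, row)
        except sqlite3.Error as exc:
            # Keep the bad row and the reason so it can be logged or fixed.
            failed.append((row, str(exc)))
    conn.commit()
    return failed


conn = sqlite3.connect(":memory:")
# NOT NULL constraints are an assumption, added so one row can fail.
conn.execute(
    "CREATE TABLE employees (first_name TEXT NOT NULL, hire_date TEXT NOT NULL)"
)
stmt = "INSERT INTO employees (first_name, hire_date) VALUES (?, ?)"
data = [
    ("Jane", "2005-02-12"),
    (None, "2006-05-23"),  # problematic row
    ("John", "2010-10-03"),
]

failed = insert_with_fallback(conn, stmt, data)
loaded = conn.execute(
    "SELECT first_name FROM employees ORDER BY rowid"
).fetchall()
print(loaded, failed)
```

The slow path only runs when the bulk insert fails, so clean batches still get the speed of executemany.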
answered 2019-10-14T21:45:55.563