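I have a project in which, every week, I import several large datasets that contain incorrect data, such as employee IDs that are duplicated when they should not be. To flag the duplicates, I tried the following code: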

ALTER TABLE AccountDuplicates
  ADD UNIQUE INDEX EmployeeID (EmployeeID);

INSERT INTO AccountDuplicates
  SELECT
    EmployeeID,
    FirstName,
    LastName
  FROM AccountsWork
  ON DUPLICATE KEY UPDATE
    EmployeeID = CONCAT(VALUES(EmployeeID), '*');

The INSERT statement gives me an error, and I can't see what I'm doing wrong:

[42000][1064] You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'FROM EAD_UserAccountsWork
           ON DUPLICATE KEY UPDATE EmployeeID = CONCAT(VALUES(E' at line 36

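In case it's relevant, I am running MySQL 5.7.12 on OS X 10.11.4 with the InnoDB engine and sql_mode = ''. My goal is to identify the duplicate IDs so that I can forward them to the appropriate DBA for correction.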

Update: I have set the database defaults as follows:

[client]
default-character-set = utf8mb4

[mysqld]
sql_mode=''
character-set-client-handshake = FALSE
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci

[mysql]
default-character-set = utf8mb4

2 Answers

Your syntax looks fine; perhaps there is a typo somewhere? This almost works:

-- drop table AccountsWork ;
-- drop table  AccountDuplicates;
CREATE TABLE AccountsWork (
    EmployeeID varchar(16),
    FirstName INT,
    LastName INT
);
CREATE TABLE AccountDuplicates (
    EmployeeID  varchar(16),
    FirstName INT,
    LastName INT
);
alter table AccountDuplicates add unique index(EmployeeID);
insert into AccountsWork values('a',2,3);
insert into AccountsWork values(1,2,3);
insert into AccountsWork values('b',2,3);
insert into AccountsWork values('c',2,3);
insert into AccountsWork values('c',2,3);
insert into AccountsWork values('c',2,3);
insert into AccountsWork values('c',2,3);
SELECT 
    *
FROM
    AccountsWork;
-- there are no syntax errors here (your original query):
INSERT INTO AccountDuplicates
  SELECT
    EmployeeID,
    FirstName,
    LastName
  FROM AccountsWork
  ON DUPLICATE KEY UPDATE
    EmployeeID =  CONCAT(VALUES(EmployeeID), '*');

SELECT 
    *
FROM
    AccountDuplicates;    

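A simple insert with a duplicate key works, but the "select from a table ... on duplicate key update" form does not.

I think you should take a look at this question: INSERT INTO ... SELECT FROM ... ON DUPLICATE KEY UPDATE

It looks like the MySQL parser goes crazy over this kind of query.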
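
For the goal stated in the question (finding the repeated IDs so they can be reported), a plain aggregate query might be enough and sidesteps the upsert entirely. This is only a minimal sketch, assuming the AccountsWork table and EmployeeID column from the question; the `occurrences` alias is made up for illustration:

```sql
-- List every EmployeeID that occurs more than once in the staging table,
-- together with the number of times it occurs.
SELECT
    EmployeeID,
    COUNT(*) AS occurrences
FROM AccountsWork
GROUP BY EmployeeID
HAVING COUNT(*) > 1;
```

The result set contains one row per duplicated ID, which can be exported as-is without modifying either table.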

answered 2016-04-24T19:35:52.593

I think it's because you haven't properly qualified the "source" EmployeeID:

INSERT INTO AccountDuplicates
  SELECT
    EmployeeID,
    FirstName,
    LastName
  FROM AccountsWork t
  ON DUPLICATE KEY UPDATE
    EmployeeID = CONCAT(t.EmployeeID, '*');

The EmployeeID on the left of the = belongs to AccountDuplicates; the one on the right belongs to AccountsWork.
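
A hedged alternative sketch: rewriting the key itself to something like 'c*' can, as far as I can tell, collide again when the same ID shows up more than twice, so it may be simpler to leave EmployeeID alone and count the collisions in a separate column. The DupCount column below is hypothetical (it is not in the original tables), and the sketch assumes the unique index on AccountDuplicates.EmployeeID from the question is already in place:

```sql
-- Hypothetical counter column; not part of the original schema.
ALTER TABLE AccountDuplicates
  ADD COLUMN DupCount INT NOT NULL DEFAULT 0;

-- The first occurrence of each EmployeeID is inserted with DupCount = 0;
-- every later occurrence hits the unique index and increments the counter
-- instead of inserting a new row.
INSERT INTO AccountDuplicates (EmployeeID, FirstName, LastName)
  SELECT
    EmployeeID,
    FirstName,
    LastName
  FROM AccountsWork
  ON DUPLICATE KEY UPDATE
    DupCount = DupCount + 1;

-- IDs that appeared more than once in the import.
SELECT EmployeeID, DupCount + 1 AS occurrences
FROM AccountDuplicates
WHERE DupCount > 0;
```

Because DupCount exists only in AccountDuplicates, the unqualified reference in the UPDATE clause is unambiguous, and the key column is never modified.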

answered 2016-04-24T19:28:50.367