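I'm trying to clean up a database that contains duplicate records. I need to move the references to one record and delete the other one.

I have two tables, Promoters and Venues, each of which references a table named cities. The problem is that there are cities with the same name but different ids, and these duplicates are linked to venues and promoters.

With this query I can group all promoters and venues under a single city record: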

SELECT c.id as id, c.name as name, GROUP_CONCAT( DISTINCT p.id ) as promoters_ids, GROUP_CONCAT( DISTINCT v.id ) as venues_ids
FROM cities as c
LEFT JOIN promoters as p ON p.city_id = c.id
LEFT JOIN venues as v ON v.city_id = c.id
WHERE c.name IN ( SELECT name from cities group by name having count(cities.name) > 1 )
GROUP BY c.name

Now I want to run an UPDATE query on promoters that sets city_id to the id returned by the query above.

Something like this:

    UPDATE promoters AS pr SET pr.city_id = (
        SELECT ID
        FROM (
            SELECT c.id as id, c.name as name, GROUP_CONCAT( DISTINCT p.id ) as promoters_ids
            FROM cities as c
            LEFT JOIN promoters as p ON p.city_id = c.id

            WHERE c.name IN ( SELECT name from cities group by name having count(cities.name) > 1 ) AND pr.id IN promoters_ids
            GROUP BY c.name
            ) AS T1 

    )

How can I do this?

Thanks

1 Answer

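If I understand correctly, you want to delete the duplicate cities (eventually), so you need to update every promoter that is linked to a city you are going to delete in the process.

I think it makes sense to use the lowest ID among the cities that share a name (it could just as well be the highest, but I want to at least specify which one is used rather than leave the choice undetermined).

So, to get the right ID for a promoter, I need to: select the lowest ID of all cities that have the same name as the city already linked to that promoter.

Fortunately, this requirement translates very well into a query: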

UPDATE promoters AS pr 
SET pr.city_id = (
  SELECT 
    -- Select the lowest ID ..
    Min(c.id)
  FROM
    -- .. of all cities ..
    cities c
    -- .. that have the same name ..
    INNER JOIN cities pc ON pc.name = c.name
  WHERE
    -- .. as the city already linked to the promoter being updated
    pc.id = pr.city_id
  GROUP BY
    c.name)

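The trick is to self-join cities on name, so you can easily get all cities that share a name. I think you were attempting the same thing with the IN clause, but that is a bit more complex than it needs to be.

I don't think you need group_concat at all, except to check that the inner query really returns the right cities, and even that is pointless because you are already grouping by name. Written like this, you can tell it can't go wrong: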
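To make the self-join concrete, here is a minimal sketch that runs the same UPDATE against made-up sample data. It uses Python's built-in sqlite3 module as a stand-in for MySQL, so the table alias on the UPDATE target is replaced by the table name; the sample rows and the `REPOINT` helper name are assumptions for illustration only. The same statement is applied to venues, since the question says both tables reference cities.

```python
import sqlite3

# In-memory SQLite database standing in for the MySQL schema (sample data is made up).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cities (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE promoters (id INTEGER PRIMARY KEY, city_id INTEGER);
    CREATE TABLE venues (id INTEGER PRIMARY KEY, city_id INTEGER);
    INSERT INTO cities VALUES (1, 'Paris'), (2, 'Berlin'), (3, 'Paris'), (4, 'Paris');
    INSERT INTO promoters VALUES (10, 1), (11, 3), (12, 2), (13, 4);
    INSERT INTO venues VALUES (20, 4), (21, 2);
""")

# Hypothetical helper: the answer's UPDATE with the table name filled in.
# For each row, pick the lowest id among cities sharing the linked city's name.
REPOINT = """
    UPDATE {table}
    SET city_id = (
        SELECT MIN(c.id)
        FROM cities c
        INNER JOIN cities pc ON pc.name = c.name
        WHERE pc.id = {table}.city_id
        GROUP BY c.name
    )
"""

# Table names are fixed constants here, never user input.
for table in ("promoters", "venues"):
    conn.execute(REPOINT.format(table=table))

promoter_cities = conn.execute(
    "SELECT id, city_id FROM promoters ORDER BY id").fetchall()
venue_cities = conn.execute(
    "SELECT id, city_id FROM venues ORDER BY id").fetchall()
print(promoter_cities)
print(venue_cities)
```

With this sample data, every row that pointed at one of the three 'Paris' records ends up pointing at id 1, while the 'Berlin' rows are unchanged.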

  SELECT 
    -- Select the lowest ID ..
    MIN(c.id) AS id,
    GROUP_CONCAT(c.name) AS names -- < already grouped by this, so why...
  FROM
    -- .. of all cities ..
    cities c
    -- .. that have the same name.
    INNER JOIN cities pc ON pc.name = c.name
  GROUP BY
    c.name
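Once promoters and venues have been repointed, the duplicate cities themselves still have to be deleted, which is the end goal stated in the question. The following is only a sketch of that last step, again using Python's sqlite3 as a stand-in for MySQL with made-up rows. The `keepers` and `keep_id` names are assumptions. The extra derived table is there because MySQL typically refuses a DELETE whose subquery reads directly from the table being deleted from.

```python
import sqlite3

# In-memory SQLite database with made-up duplicate cities.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cities (id INTEGER PRIMARY KEY, name TEXT);
    INSERT INTO cities VALUES (1, 'Paris'), (2, 'Berlin'), (3, 'Paris'), (4, 'Paris');
""")

# Keep the lowest id per name and delete every other row.
# The inner SELECT is wrapped in a derived table (keepers) so the statement
# does not read the target table directly in the NOT IN subquery.
conn.execute("""
    DELETE FROM cities
    WHERE id NOT IN (
        SELECT keep_id
        FROM (SELECT MIN(id) AS keep_id FROM cities GROUP BY name) AS keepers
    )
""")

remaining = conn.execute("SELECT id, name FROM cities ORDER BY id").fetchall()
print(remaining)
```

Running the delete only after both UPDATE statements have finished matters: any promoter or venue still pointing at a removed city would otherwise be left with a dangling city_id.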

I hope I understood the question correctly.

answered 2013-08-08T07:56:33.967