Hi, I have a table that looks like this:
-----------------------------------------------------------
| id | group_id | source_id | target_id | sortsequence |
-----------------------------------------------------------
| 2 | 1 | 2 | 4 | 1 |
-----------------------------------------------------------
| 4 | 1 | 20 | 2 | 1 |
-----------------------------------------------------------
| 5 | 1 | 2 | 14 | 1 |
-----------------------------------------------------------
| 7 | 1 | 2 | 7 | 3 |
-----------------------------------------------------------
| 20 | 2 | 20 | 4 | 3 |
-----------------------------------------------------------
| 21 | 2 | 20 | 4 | 1 |
-----------------------------------------------------------
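Scenario
There are two scenarios to handle.
- The sortsequence column value should be unique for a given source_id and group_id. For example, all records with group_id = 1 AND source_id = 2 should have a unique sortsequence. In the sample above, the records with id = 2 and 5, which both have group_id = 1 and source_id = 2, have the same sortsequence, which is 1. These are wrong records. I need to find them.
- If group_id and source_id are the same, the sortsequence column values should be continuous. There should be no gaps. For example, in the table above the records with id = 20 and 21 have the same group_id and source_id, and their sortsequence values are 3 and 1. Even though these values are unique, there is a gap in the sortsequence values. I need to find these records as well.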
My effort so far
I have written this query:
SELECT source_id,`group_id`,GROUP_CONCAT(id) AS children
FROM
table
GROUP BY source_id,
sortsequence,
`group_id`
HAVING COUNT(*) > 1
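This query only covers scenario 1. How can I handle scenario 2? Is there a way to do it in the same query, or do I have to write a separate one for the second scenario?
By the way, the query will be dealing with millions of records in the table, so performance must be very good.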
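To make both scenarios concrete, here is a rough sketch of the kind of single query I have in mind. It is not something I have verified on the real data. It groups only by group_id and source_id and compares three aggregates: the row count, the number of distinct sortsequence values, and the span between the smallest and largest value. The table name links and the index are assumptions made for this example. I am also assuming that "no gap" means every value between the minimum and maximum is present, not that the sequence has to start at 1.

```sql
-- Hypothetical table name `links`; the columns are the ones from the table above.
CREATE TABLE links (
    id           INT PRIMARY KEY,
    group_id     INT NOT NULL,
    source_id    INT NOT NULL,
    target_id    INT NOT NULL,
    sortsequence INT NOT NULL
);

-- Sample rows copied from the table above.
INSERT INTO links (id, group_id, source_id, target_id, sortsequence) VALUES
    (2,  1, 2,  4,  1),
    (4,  1, 20, 2,  1),
    (5,  1, 2,  14, 1),
    (7,  1, 2,  7,  3),
    (20, 2, 20, 4,  3),
    (21, 2, 20, 4,  1);

-- Assumed index (not part of my current schema): it matches the GROUP BY
-- columns and covers sortsequence, which may let the grouping read the
-- index instead of the full rows.
CREATE INDEX idx_links_group_source_sort
    ON links (group_id, source_id, sortsequence);

SELECT group_id,
       source_id,
       GROUP_CONCAT(id) AS children,
       -- Scenario 1: more rows than distinct values means a duplicate.
       COUNT(*) <> COUNT(DISTINCT sortsequence) AS has_duplicates,
       -- Scenario 2: a span wider than the distinct count means a gap.
       MAX(sortsequence) - MIN(sortsequence) + 1
           <> COUNT(DISTINCT sortsequence) AS has_gaps
FROM links
GROUP BY group_id, source_id
HAVING COUNT(*) <> COUNT(DISTINCT sortsequence)
    OR MAX(sortsequence) - MIN(sortsequence) + 1
           <> COUNT(DISTINCT sortsequence);
```

On the sample rows this should flag group_id = 1 / source_id = 2 (ids 2, 5 and 7, with both a duplicate and a gap) and group_id = 2 / source_id = 20 (ids 20 and 21, with a gap only). If the sequence must also start at 1, the gap condition would become MAX(sortsequence) <> COUNT(DISTINCT sortsequence). I do not know whether this approach is fast enough on millions of rows, which is part of what I am asking.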