6

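I have a database that stores information about matches of a game, which I pull from an external source. Due to a few issues, gaps occasionally appear in the database (anywhere from 1 missing ID to a few hundred). I want the program to pull the data for the missing games, but I need to get the list of missing IDs first.

This is the format of the table: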

id (pk-identity)  |  GameID (int)  |  etc.  |  etc.  

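I thought about writing a program that runs a loop and queries each GameID starting from 1, but it seems like there should be a more efficient way to get the missing numbers.

Is there a simple and efficient way, using SQL Server, to find all the missing numbers in a range?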


6 Answers

7

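The idea is to look at where the gaps start. Let me assume that you are using SQL Server 2012, and so have the lag() and lead() functions. The following gets the next id: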

select t.*, lead(id) over (order by id) as nextid
from t;

If there is a gap, then nextid <> id+1. You can now characterize the gaps using where:

select id+1 as FirstMissingId, nextid - 1 as LastMissingId
from (select t.*, lead(id) over (order by id) as nextid
      from t
     ) t
where nextid <> id+1;

EDIT:

Without lead(), I would do the same thing with a correlated subquery:

select id+1 as FirstMissingId, nextid - 1 as LastMissingId
from (select t.*,
             (select top 1 id
              from t t2
              where t2.id > t.id
              order by t2.id
             ) as nextid
      from t
     ) t
where nextid <> id+1;

Assuming id is the primary key on the table (or even if it just has an index), both methods should have reasonable performance.
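The queries above return one row per gap (first and last missing id). If a flat list of individual missing ids is needed, the ranges can be expanded. The following is a minimal sketch, not part of the original answer: the table variable @t and its sample values are hypothetical stand-ins for the real table, and a recursive CTE is just one of several ways to expand a range.

```sql
-- Hypothetical sample data standing in for the real table.
DECLARE @t TABLE (id int PRIMARY KEY);
INSERT INTO @t (id) VALUES (1), (2), (3), (8), (9), (10), (11), (15);

WITH gaps AS (
    -- Same gap detection as in the lead() query: one row per run of missing ids.
    SELECT id + 1 AS FirstMissingId, nextid - 1 AS LastMissingId
    FROM (SELECT id, LEAD(id) OVER (ORDER BY id) AS nextid
          FROM @t
         ) AS t
    WHERE nextid <> id + 1
),
expanded AS (
    -- Anchor member: the first missing id of each gap.
    SELECT FirstMissingId AS MissingId, LastMissingId
    FROM gaps
    UNION ALL
    -- Recursive member: step forward until the end of the gap is reached.
    SELECT MissingId + 1, LastMissingId
    FROM expanded
    WHERE MissingId < LastMissingId
)
SELECT MissingId
FROM expanded
ORDER BY MissingId
OPTION (MAXRECURSION 0);  -- lift the default limit of 100 recursion levels
```

With the sample values this returns 4, 5, 6, 7, 12, 13 and 14. The MAXRECURSION hint matters here because a gap of a few hundred ids would exceed the default limit of 100 recursion levels.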

answered 2013-09-09T13:22:42.560
1

Numbers table!

CREATE TABLE dbo.numbers (
   number int NOT NULL
)

ALTER TABLE dbo.numbers
ADD
   CONSTRAINT pk_numbers PRIMARY KEY CLUSTERED (number)
     WITH FILLFACTOR = 100
GO

INSERT INTO dbo.numbers (number)
SELECT (a.number * 256) + b.number As number
FROM     (
        SELECT number
        FROM   master..spt_values
        WHERE  type = 'P'
        AND    number <= 255
       ) As a
 CROSS
  JOIN (
        SELECT number
        FROM   master..spt_values
        WHERE  type = 'P'
        AND    number <= 255
       ) As b
GO

Then you can perform an OUTER JOIN or NOT EXISTS between the two tables and find the gaps...

SELECT *
FROM   dbo.numbers
WHERE  NOT EXISTS (
         SELECT *
         FROM   your_table
         WHERE  id = numbers.number
       )

-- OR

SELECT *
FROM   dbo.numbers
 LEFT
  JOIN your_table
    ON your_table.id = numbers.number
WHERE  your_table.id IS NULL
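One thing to watch with both queries: dbo.numbers holds every value from 0 to 65535, so every number above the highest existing id (and 0) is also reported as missing. A possible refinement, sketched here as an assumption about what is wanted rather than as part of the original answer, is to restrict the check to the range the table actually spans:

```sql
-- Assumes dbo.numbers as built above and your_table with an int id column.
SELECT n.number AS MissingId
FROM   dbo.numbers AS n
WHERE  n.number BETWEEN (SELECT MIN(id) FROM your_table)
                    AND (SELECT MAX(id) FROM your_table)
AND    NOT EXISTS (
         SELECT *
         FROM   your_table AS t
         WHERE  t.id = n.number
       )
ORDER BY n.number;
```

If ids are expected to start at 1 even when the first rows are missing, the lower bound can be replaced with the literal 1.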
answered 2013-09-09T13:22:47.440
1

I like the "gaps and islands" approach. It goes a little something like this:

WITH Islands AS (
    SELECT GameId, GameID - ROW_NUMBER() OVER (ORDER BY GameID) AS [IslandID]
    FROM dbo.yourTable
)
SELECT MIN(GameID), MAX(GameID)
FROM Islands
GROUP BY IslandID

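That query will give you the list of contiguous ranges. From there, you can self-join that result set (on successive IslandIDs) to get the gaps. There is a bit of work in getting the IslandIDs themselves to be contiguous, though. So, extending the above query: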

WITH 
cte1 AS (
    SELECT GameId, GameId - ROW_NUMBER() OVER (ORDER BY GameId) AS [rn]
    FROM dbo.yourTable
)
, cte2 AS (
    SELECT [rn], MIN(GameId) AS [Start], MAX(GameId) AS [End]
    FROM cte1
    GROUP BY [rn]
)
,Islands AS (
    SELECT ROW_NUMBER() OVER (ORDER BY [rn]) AS IslandId, [Start], [End]
  from cte2
)

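-- Inner join: the last island has no successor, so it produces no gap row
-- (a LEFT JOIN would return it with a NULL GapEnd).
SELECT a.[End] + 1 AS [GapStart], b.[Start] - 1 AS [GapEnd]
FROM Islands AS a
INNER JOIN Islands AS b
    ON a.IslandID + 1 = b.IslandID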
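To see why subtracting ROW_NUMBER() from GameId identifies the islands, it can help to look at the intermediate values. This is a small illustrative sketch; the table variable @games and its contents are hypothetical sample data standing in for dbo.yourTable.

```sql
-- Hypothetical sample data; @games stands in for dbo.yourTable.
DECLARE @games TABLE (GameId int PRIMARY KEY);
INSERT INTO @games (GameId) VALUES (1), (2), (3), (8), (9), (15);

SELECT GameId,
       ROW_NUMBER() OVER (ORDER BY GameId) AS RowNum,
       GameId - ROW_NUMBER() OVER (ORDER BY GameId) AS IslandKey
FROM @games
ORDER BY GameId;

-- GameId  RowNum  IslandKey
-- 1       1       0
-- 2       2       0
-- 3       3       0
-- 8       4       4
-- 9       5       4
-- 15      6       9
```

Within a run of consecutive ids, GameId and RowNum both increase by 1 per row, so their difference stays constant. Each gap makes GameId jump ahead of RowNum, so the difference increases and a new island begins.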
answered 2013-09-09T16:31:48.890
1
     SELECT * FROM #tab1
            id          col1
            ----------- --------------------
            1           a
            2           a
            3           a
            8           a
            9           a
            10          a
            11          a
            15          a
            16          a
            17          a
            18          a

 WITH cte (id,nextId) as
                -- ORDER BY makes TOP 1 return the smallest larger id, not an arbitrary one
                (SELECT t.id, (SELECT TOP 1 t1.id FROM #tab1 t1 WHERE t1.id > t.id ORDER BY t1.id) AS nextId  FROM #tab1 t)

 SELECT id AS 'GapStart', nextId AS 'GapEnd' FROM cte
                WHERE id + 1 <> nextId

    GapStart    GapEnd
    ----------- -----------
    3           8
    11          15
answered 2017-03-17T13:01:08.253
0

Try this (as written, this covers IDs up to 99999; if you need more, you can add more digits to the numbers CTE below):

;WITH Digits AS (
    select Digit 
    from ( values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) as t(Digit))
,Numbers AS (
    select u.Digit 
          + t.Digit*10 
          + h.Digit*100 
          + th.Digit*1000
          + tth.Digit*10000 
          --Add 100000, 1000000 multipliers if required here.
          as myId
    from Digits u
    cross join Digits t
    cross join Digits h
    cross join Digits th
    cross join Digits tth
    --Add the cross join for higher numbers 
    )
SELECT myId 
FROM Numbers
WHERE myId >= 1 --the CTE also generates 0; skip it because IDs start at 1
  AND myId NOT IN (SELECT GameId FROM YourTable)
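One caveat with NOT IN: if the GameId column can contain NULL, the NOT IN predicate never evaluates to true and the query returns no rows at all. Whether GameId is nullable is not stated in the question, so this is only a precaution. A minimal sketch of the difference, using hypothetical table variables @ids and @candidates:

```sql
-- Hypothetical data: existing ids (including a NULL) and candidate numbers.
DECLARE @ids TABLE (GameId int NULL);
INSERT INTO @ids (GameId) VALUES (1), (3), (NULL);

DECLARE @candidates TABLE (myId int PRIMARY KEY);
INSERT INTO @candidates (myId) VALUES (1), (2), (3);

-- NOT IN: returns no rows, because 2 <> NULL evaluates to UNKNOWN.
SELECT myId
FROM @candidates
WHERE myId NOT IN (SELECT GameId FROM @ids);

-- NOT EXISTS: returns 2, since NULLs simply never match.
SELECT c.myId
FROM @candidates AS c
WHERE NOT EXISTS (SELECT 1 FROM @ids AS i WHERE i.GameId = c.myId);
```

If the column is declared NOT NULL, both forms return the same rows.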
answered 2013-09-09T14:02:52.037
0

Problem: we need to find the gap ranges in the id field

SELECT * FROM #tab1

id          col1
----------- --------------------
1           a  
2           a  
3           a  
8           a    
9           a  
10          a  
11          a  
15          a  
16          a  
17          a  
18          a

Solution

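WITH cte (id,nextId) as
-- ORDER BY makes TOP 1 return the smallest larger id, not an arbitrary one
(SELECT t.id, (SELECT TOP 1 t1.id FROM #tab1 t1 WHERE t1.id > t.id ORDER BY t1.id) AS nextId  FROM #tab1 t)

SELECT id + 1 AS GapStart, nextId - 1 AS GapEnd FROM cte
WHERE id + 1 <> nextId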

Output

GapStart    GapEnd
----------- -----------
4           7
12          14
answered 2017-03-17T12:43:24.527