6

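I have a question about a query that returns the dates that do not exist in a database table.

I have the following dates in the database.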

2013-08-02
2013-08-02
2013-08-02
2013-08-03
2013-08-05
2013-08-08
2013-08-08
2013-08-09
2013-08-10
2013-08-13
2013-08-13
2013-08-13

I want the following expected result:

2013-08-01
2013-08-04
2013-08-06
2013-08-07
2013-08-11
2013-08-12

As you can see, the result contains the six dates in the range that do not exist in the database.

I tried the query below:

SELECT
    DISTINCT DATE(w1.start_date) + INTERVAL 1 DAY AS missing_date
FROM
    working w1
LEFT JOIN
    (SELECT DISTINCT start_date FROM working ) w2 ON DATE(w1.start_date) = DATE(w2.start_date) - INTERVAL 1 DAY
WHERE
    w1.start_date BETWEEN '2013-08-01' AND '2013-08-13'
AND
    w2.start_date IS NULL;

But the above returns the following result.

2013-08-04
2013-08-14
2013-08-11
2013-08-06

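As you can see, it returns four dates, of which 2013-08-14 is not wanted, and because of the left join it still misses three dates (2013-08-01, 2013-08-07 and 2013-08-12).

Please look at my query and let me know the best way to achieve this.

Thanks for your attention and time.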

6 Answers

17

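I guess you could always generate the sequence of dates and then use a NOT IN to eliminate the dates that actually exist. This maxes out at a 1024-day range, but it is easy to shrink or extend. The date column is called "mydate" and lives in the table "table1":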

SELECT * FROM (
  SELECT DATE_ADD('2013-08-01', INTERVAL t4+t16+t64+t256+t1024 DAY) day 
  FROM 
   (SELECT 0 t4    UNION ALL SELECT 1   UNION ALL SELECT 2   UNION ALL SELECT 3  ) t4,
   (SELECT 0 t16   UNION ALL SELECT 4   UNION ALL SELECT 8   UNION ALL SELECT 12 ) t16,   
   (SELECT 0 t64   UNION ALL SELECT 16  UNION ALL SELECT 32  UNION ALL SELECT 48 ) t64,      
   (SELECT 0 t256  UNION ALL SELECT 64  UNION ALL SELECT 128 UNION ALL SELECT 192) t256,     
   (SELECT 0 t1024 UNION ALL SELECT 256 UNION ALL SELECT 512 UNION ALL SELECT 768) t1024     
  ) b 
WHERE day NOT IN (SELECT mydate FROM Table1) AND day<'2013-08-13';
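
The 1024-day limit comes from cross-joining five four-row tables, so every offset from 0 to 1023 is produced exactly once (4^5 = 1024). As a rough illustration only, here is a Python sketch (not MySQL) that mirrors the same construction; the names `DIGIT_TABLES`, `day_offsets` and `missing_dates` are made up for this example, and the existing dates are the ones from the question.

```python
from datetime import date, timedelta
from itertools import product

# Each tuple mirrors one derived table in the SQL above (t4, t16, ..., t1024).
DIGIT_TABLES = [
    (0, 1, 2, 3),
    (0, 4, 8, 12),
    (0, 16, 32, 48),
    (0, 64, 128, 192),
    (0, 256, 512, 768),
]


def day_offsets():
    """Sum one value from each table, as the cross join does."""
    return sorted(sum(combo) for combo in product(*DIGIT_TABLES))


def missing_dates(start, end_exclusive, existing):
    """Dates in [start, end_exclusive) that are absent from `existing`."""
    candidates = (start + timedelta(days=n) for n in day_offsets())
    return [d for d in candidates if d < end_exclusive and d not in existing]


offsets = day_offsets()
# Distinct dates from the question's sample data.
existing = {date(2013, 8, d) for d in (2, 3, 5, 8, 9, 10, 13)}
gaps = missing_dates(date(2013, 8, 1), date(2013, 8, 13), existing)

print(len(offsets), offsets[0], offsets[-1])
print([d.isoformat() for d in gaps])
```

Adding a sixth four-row table (0, 1024, 2048, 3072) would extend the range to 4096 days in the same way.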

From the "I'd add an SQLfiddle if it weren't down" department.

Thanks for your help. This is the query I ended up with, and it works:

SELECT * FROM
(
    SELECT DATE_ADD('2013-08-01', INTERVAL t4+t16+t64+t256+t1024 DAY) missingDates 
        FROM 
    (SELECT 0 t4    UNION ALL SELECT 1   UNION ALL SELECT 2   UNION ALL SELECT 3  ) t4,
    (SELECT 0 t16   UNION ALL SELECT 4   UNION ALL SELECT 8   UNION ALL SELECT 12 ) t16,   
    (SELECT 0 t64   UNION ALL SELECT 16  UNION ALL SELECT 32  UNION ALL SELECT 48 ) t64,      
    (SELECT 0 t256  UNION ALL SELECT 64  UNION ALL SELECT 128 UNION ALL SELECT 192) t256,     
    (SELECT 0 t1024 UNION ALL SELECT 256 UNION ALL SELECT 512 UNION ALL SELECT 768) t1024     
) b 
WHERE
    missingDates NOT IN (SELECT DATE_FORMAT(start_date,'%Y-%m-%d')
            FROM
                working GROUP BY start_date)
    AND
    missingDates < '2013-08-13';
answered 2013-08-13T17:38:27.660
3

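My bet would be to create a dedicated Calendar table so that you can use it in a LEFT JOIN.

You could create the table on demand, but since it does not represent a large amount of data, the simplest and probably most efficient approach is to create it once and for all, as I do below with a stored procedure: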

--
-- Create a dedicated "Calendar" table
--
CREATE TABLE Calendar (day DATE PRIMARY KEY);

DELIMITER //
CREATE PROCEDURE init_calendar(IN pStart DATE, IN pEnd DATE)
BEGIN
    SET @theDate := pStart;
    REPEAT
        -- INSERT IGNORE lets init_calendar be called again
        -- to extend the calendar range without having to
        -- worry about dates that are already present
        INSERT IGNORE INTO Calendar VALUES (@theDate);
        SET @theDate := @theDate + INTERVAL 1 DAY;
    UNTIL @theDate > pEnd END REPEAT;
END; //
DELIMITER ;

CALL init_calendar('2010-01-01','2015-12-31');

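In this example, the calendar holds 2191 consecutive days, which at a rough estimate is less than 15KB. Storing every date of the 21st century would take less than 300KB...

Now, here is your actual data table as described in the question: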
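
The day count is easy to double-check. The Python sketch below assumes 3 bytes per value, which is the documented storage size of a MySQL DATE column, and deliberately ignores row and index overhead, so it only gives a lower bound on the real table size. The helper name `days_inclusive` is made up for this example.

```python
from datetime import date

# Assumption: 3 bytes per value, the storage size of a MySQL DATE column.
# Row and index overhead of the storage engine is deliberately ignored.
BYTES_PER_DATE = 3


def days_inclusive(first, last):
    """Number of calendar days from first to last, both included."""
    return (last - first).days + 1


example_days = days_inclusive(date(2010, 1, 1), date(2015, 12, 31))
century_days = days_inclusive(date(2001, 1, 1), date(2100, 12, 31))

print(example_days, example_days * BYTES_PER_DATE)
print(century_days, century_days * BYTES_PER_DATE)
```

The raw column data comes out well below both estimates, which leaves room for the overhead a real table adds.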

--
-- *Your* actual data table
--
CREATE TABLE tbl (theDate DATE);
INSERT INTO tbl VALUES 
    ('2013-08-02'),
    ('2013-08-02'),
    ('2013-08-02'),
    ('2013-08-03'),
    ('2013-08-05'),
    ('2013-08-08'),
    ('2013-08-08'),
    ('2013-08-09'),
    ('2013-08-10'),
    ('2013-08-13'),
    ('2013-08-13'),
    ('2013-08-13');

And finally, the query:

--
-- Now the query to find date not "in range"
--

SET @start = '2013-08-01';
SET @end = '2013-08-13';

SELECT Calendar.day FROM Calendar LEFT JOIN tbl
    ON Calendar.day = tbl.theDate
    WHERE Calendar.day BETWEEN @start AND @end
    AND tbl.theDate IS NULL;

which produces:

+------------+
| day        |
+------------+
| 2013-08-01 |
| 2013-08-04 |
| 2013-08-06 |
| 2013-08-07 |
| 2013-08-11 |
| 2013-08-12 |
+------------+
answered 2013-08-13T13:22:15.823
2

I would do it like this:

$db_dates = array (
'2013-08-02',
'2013-08-03',
'2013-08-05',
'2013-08-08',
'2013-08-09',
'2013-08-10',
'2013-08-13'
);
$missing = array();
$month = "08";
$year = "2013";
$day_start = 1;
$day_end = 14;
for ($i=$day_start; $i<$day_end; $i++) {
    $day = $i;
    if ($i<10) {
        $day = "0".$i;  
    }
    $check_date = $year."-".$month."-".$day;
    if (!in_array($check_date, $db_dates)) {
        array_push($missing, $check_date);  
    }
}
print_r($missing);

I only covered that interval, but you can define another interval or make it work for a whole year.

answered 2013-08-13T11:28:17.217
0

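In a data-warehouse type of situation, the way I solve this is to populate a "static" table with the dates for an appropriate period (there are sample scripts for this kind of thing, easily found with Google), and then left outer join or right outer join your table to it: the rows with no match are the missing dates.

answered 2013-08-13T11:29:53.290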
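
As a minimal sketch of that idea, the Python snippet below uses the standard-library sqlite3 module with an in-memory database. This is SQLite syntax, not MySQL, and the table names `calendar` and `events` are placeholders invented for the example; the sample rows are the dates from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE calendar (day TEXT PRIMARY KEY);  -- the "static" date table
    CREATE TABLE events (event_date TEXT);         -- stand-in for your table
""")

# Populate the static table once, here with a recursive CTE (SQLite syntax).
conn.execute("""
    WITH RECURSIVE seq(day) AS (
        SELECT '2013-08-01'
        UNION ALL
        SELECT date(day, '+1 day') FROM seq WHERE day < '2013-08-31'
    )
    INSERT INTO calendar (day) SELECT day FROM seq
""")

# Sample data from the question, duplicates included.
sample = (
    ["2013-08-02"] * 3
    + ["2013-08-03", "2013-08-05"]
    + ["2013-08-08"] * 2
    + ["2013-08-09", "2013-08-10"]
    + ["2013-08-13"] * 3
)
conn.executemany("INSERT INTO events VALUES (?)", [(d,) for d in sample])

# Anti-join: calendar rows with no matching event are the missing dates.
rows = conn.execute(
    """
    SELECT calendar.day
    FROM calendar
    LEFT OUTER JOIN events ON events.event_date = calendar.day
    WHERE calendar.day BETWEEN ? AND ?
      AND events.event_date IS NULL
    ORDER BY calendar.day
    """,
    ("2013-08-01", "2013-08-13"),
).fetchall()

missing = [row[0] for row in rows]
print(missing)
```

Because unmatched calendar rows carry NULL in the joined column, duplicates in the data table do not affect the result.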
0
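-- T-SQL (SQL Server): walk the range one day at a time
-- and print every date that has no row in date_table
DECLARE @date date;
DECLARE @dt_cnt int = 0;
SET @date = '2014-11-01';
WHILE @date < '2014-12-31'
BEGIN
    -- count the rows recorded for the current date
    SELECT @dt_cnt = COUNT(att_id) FROM date_table WHERE att_date = @date;

    IF (@dt_cnt = 0)
    BEGIN
        PRINT @date;
    END
    SET @date = DATEADD(day, 1, @date);
END
answered 2015-03-23T07:20:05.247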
0

In case anyone wants more than 1024 days (or hours), I'm adding this to Dipesh's excellent answer. Below I generate 279936 hours, from 2015 to 2046:

    SELECT 
DATE_ADD('2015-01-01', INTERVAL 
POWER(6,6)*t6 + POWER(6,5)*t5 + POWER(6,4)*t4 + POWER(6,3)*t3 + POWER(6,2)*t2 + 
POWER(6,1)*t1 + t0 
HOUR) AS period
FROM
 (SELECT 0 t0 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5) t0,
 (SELECT 0 t1 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5) t1,
 (SELECT 0 t2 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5) t2,
 (SELECT 0 t3 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5) t3,
 (SELECT 0 t4 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5) t4,
 (SELECT 0 t5 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5) t5,
 (SELECT 0 t6 UNION SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION SELECT 4 UNION SELECT 5) t6
 ORDER BY period

Just plug it into the query from that answer.

answered 2015-09-28T11:25:11.097
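
The figure 279936 is 6^7: seven cross-joined six-row tables act as the digits of a base-6 number. As a quick sanity check, here is a Python sketch (not MySQL) of the same construction; `BASE`, `DIGITS` and `hour_offsets` are names made up for this example.

```python
from datetime import datetime, timedelta
from itertools import product

BASE = 6    # rows per derived table (0..5)
DIGITS = 7  # number of cross-joined tables (t0..t6)


def hour_offsets():
    """Combine one digit per table into a base-6 number, as the SQL does."""
    return sorted(
        sum(digit * BASE ** power for power, digit in enumerate(combo))
        for combo in product(range(BASE), repeat=DIGITS)
    )


offsets = hour_offsets()
first_period = datetime(2015, 1, 1) + timedelta(hours=offsets[0])
last_period = datetime(2015, 1, 1) + timedelta(hours=offsets[-1])

print(len(offsets))
print(first_period.isoformat(), last_period.isoformat())
```

Every hour offset appears exactly once, and the last generated period falls in 2046, which matches the range quoted above.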