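I have a table with (among other things) dates in a field.
I need to get a list of all dates that are more recent than the oldest date, older than the most recent date, and completely missing from the table.
So, if the table contained: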
2012-01-02
2012-01-02
2012-01-03
2012-01-05
2012-01-05
2012-01-07
2012-01-08
I would want a query that returned:
2012-01-04
2012-01-06
Something like this (assuming your table is named your_table and the date column is named the_date):
with date_range as (
  select min(the_date) as oldest,
         max(the_date) as recent,
         max(the_date) - min(the_date) as total_days
  from your_table
),
all_dates as (
  select oldest + level - 1 as a_date
  from date_range
  connect by level <= (select total_days from date_range)
)
select ad.a_date
from all_dates ad
  left join your_table yt on ad.a_date = yt.the_date
where yt.the_date is null
order by ad.a_date;
Edit:
The WITH clause is called a "common table expression" (CTE) and is equivalent to a derived table ("inline view").
It is similar to:
select *
from (
.....
) all_dates
join your_table ...
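The second CTE simply creates a list of dates "on the fly" using an undocumented feature of Oracle's connect by implementation.
Re-using a select (as I did for calculating the first and last date) is a bit easier (and IMHO more readable) than using derived tables.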
Edit 2:
This can also be done with a recursive CTE:
with date_range as (
  select min(the_date) as oldest,
         max(the_date) as recent,
         max(the_date) - min(the_date) as total_days
  from your_table
),
all_dates (a_date, lvl) as (
  select oldest as a_date, 1 as lvl
  from date_range
  union all
  select (select oldest from date_range) + lvl, lvl + 1
  from all_dates
  where lvl < (select total_days from date_range)
)
select ad.a_date, lvl
from all_dates ad
  left join your_table yt on ad.a_date = yt.the_date
where yt.the_date is null
order by ad.a_date;
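This should work in all DBMS that support recursive CTEs (PostgreSQL and Firebird, being more standards compliant, do need the recursive keyword, though).
Note the hack in the recursive part: select (select oldest from date_range) + lvl, lvl + 1.
This should not be necessary, but Oracle still has some bugs with regard to DATEs in a recursive CTE. In PostgreSQL the following works without problems: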
....
all_dates (a_date, lvl) as (
  select oldest as a_date, 0 as lvl
  from date_range
  union all
  select a_date + 1, lvl + 1
  from all_dates
  where lvl < (select total_days from date_range)
)
....
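To try the recursive CTE without an Oracle or PostgreSQL instance, here is a minimal sketch using Python's built-in sqlite3 module. It is an illustration under assumptions, not part of the answer above: SQLite keeps the dates as ISO-8601 text, so date(a_date, '+1 day') stands in for a_date + 1, and the recursion stops by comparing against recent instead of counting total_days.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("create table your_table (the_date text)")
sample = [
    "2012-01-02", "2012-01-02", "2012-01-03", "2012-01-05",
    "2012-01-05", "2012-01-07", "2012-01-08",
]
conn.executemany("insert into your_table values (?)", [(d,) for d in sample])

# Assumption: dates are ISO-8601 strings, so text comparison orders them
# correctly and date(x, '+1 day') advances by one day.
sql = """
with recursive
date_range as (
  select min(the_date) as oldest,
         max(the_date) as recent
  from your_table
),
all_dates (a_date) as (
  select oldest from date_range
  union all
  select date(a_date, '+1 day')
  from all_dates
  where a_date < (select recent from date_range)
)
select ad.a_date
from all_dates ad
  left join your_table yt on ad.a_date = yt.the_date
where yt.the_date is null
order by ad.a_date
"""

missing = [row[0] for row in conn.execute(sql)]
print(missing)
```

The anti-join (left join plus "is null") is the same as in the Oracle version; only the date arithmetic and the stop condition were adapted to SQLite.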
I would pick this variant, because it is more efficient:
with all_dates_wo_boundary_values as
( select oldest + level the_date
    from ( select min(the_date) oldest
                , max(the_date) recent
             from your_table
         )
 connect by level <= recent - oldest - 1
)
select the_date
  from all_dates_wo_boundary_values
 minus
select the_date
  from your_table
Here is some evidence.
First the setup:
SQL> create table your_table (the_date)
2 as
3 select date '2012-01-02' from dual union all
4 select date '2012-01-02' from dual union all
5 select date '2012-01-03' from dual union all
6 select date '2012-01-05' from dual union all
7 select date '2012-01-05' from dual union all
8 select date '2012-01-07' from dual union all
9 select date '2012-01-08' from dual
10 /
Table created.
SQL> exec dbms_stats.gather_table_stats(user,'your_table')
PL/SQL procedure successfully completed.
SQL> alter session set statistics_level = all
2 /
Session altered.
Horse's query (the first answer above):
SQL> with date_range as
2 ( select min(the_date) as oldest
3 , max(the_date) as recent
4 , max(the_date) - min(the_date) as total_days
5 from your_table
6 )
7 , all_dates as
8 ( select ( select oldest from date_range) + level as a_date
9 from dual
10 connect by level <= (select total_days from date_range)
11 )
12 select ad.a_date
13 from all_dates ad
14 left join your_table yt on ad.a_date = yt.the_date
15 where yt.the_date is null
16 order by ad.a_date
17 /
A_DATE
-------------------
04-01-2012 00:00:00
06-01-2012 00:00:00
2 rows selected.
SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
2 /
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------
SQL_ID gaqx49vb9gz9k, child number 0
-------------------------------------
with date_range as ( select min(the_date) as oldest , max(the_date) as recent , max(the_date) - min(the_date) as total_d
ays from your_table )
, all_dates as ( select ( select oldest from date_range) + level as a_date from dual connect by level <= (select total_days from
date_range) ) select
ad.a_date from all_dates ad left join your_table yt on ad.a_date = yt.the_date where yt.the_date is null order by ad.a_date
Plan hash value: 1419150012
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes | OMem | 1Mem | Used-Mem |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| 1 | TEMP TABLE TRANSFORMATION | | 1 | | 2 |00:00:00.01 | 22 | 1 | 1 | | | |
| 2 | LOAD AS SELECT | | 1 | | 1 |00:00:00.01 | 7 | 0 | 1 | 262K| 262K| 262K (0)|
| 3 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 3 | 0 | 0 | | | |
| 4 | TABLE ACCESS FULL | YOUR_TABLE | 1 | 7 | 7 |00:00:00.01 | 3 | 0 | 0 | | | |
| 5 | SORT ORDER BY | | 1 | 1 | 2 |00:00:00.01 | 12 | 1 | 0 | 2048 | 2048 | 2048 (0)|
|* 6 | FILTER | | 1 | | 2 |00:00:00.01 | 12 | 1 | 0 | | | |
|* 7 | HASH JOIN OUTER | | 1 | 1 | 7 |00:00:00.01 | 12 | 1 | 0 | 1048K| 1048K| 707K (0)|
| 8 | VIEW | | 1 | 1 | 6 |00:00:00.01 | 9 | 1 | 0 | | | |
| 9 | CONNECT BY WITHOUT FILTERING| | 1 | | 6 |00:00:00.01 | 3 | 0 | 0 | | | |
| 10 | FAST DUAL | | 1 | 1 | 1 |00:00:00.01 | 0 | 0 | 0 | | | |
| 11 | VIEW | | 1 | 1 | 1 |00:00:00.01 | 3 | 0 | 0 | | | |
| 12 | TABLE ACCESS FULL | SYS_TEMP_0FD9D660C_81240964 | 1 | 1 | 1 |00:00:00.01 | 3 | 0 | 0 | | | |
| 13 | TABLE ACCESS FULL | YOUR_TABLE | 1 | 7 | 7 |00:00:00.01 | 3 | 0 | 0 | | | |
------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
6 - filter("YT"."THE_DATE" IS NULL)
7 - access("YT"."THE_DATE"=INTERNAL_FUNCTION("AD"."A_DATE"))
32 rows selected.
And my suggestion:
SQL> with all_dates_wo_boundary_values as
2 ( select oldest + level the_date
3 from ( select min(the_date) oldest
4 , max(the_date) recent
5 from your_table
6 )
7 connect by level <= recent - oldest - 1
8 )
9 select the_date
10 from all_dates_wo_boundary_values
11 minus
12 select the_date
13 from your_table
14 /
THE_DATE
-------------------
04-01-2012 00:00:00
06-01-2012 00:00:00
2 rows selected.
SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'))
2 /
PLAN_TABLE_OUTPUT
--------------------------------------------------------------------------------------------------------------------------------------
SQL_ID 7aavxmzkj7zq7, child number 0
-------------------------------------
with all_dates_wo_boundary_values as ( select oldest + level the_date from ( select min(the_date) oldest
, max(the_date) recent from your_table ) connect by level <= recent - oldest - 1 ) select
the_date from all_dates_wo_boundary_values minus select the_date from your_table
Plan hash value: 2293301832
-----------------------------------------------------------------------------------------------------------------------------------
| Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem |
-----------------------------------------------------------------------------------------------------------------------------------
| 1 | MINUS | | 1 | | 2 |00:00:00.01 | 6 | | | |
| 2 | SORT UNIQUE | | 1 | 1 | 5 |00:00:00.01 | 3 | 9216 | 9216 | 8192 (0)|
| 3 | VIEW | | 1 | 1 | 5 |00:00:00.01 | 3 | | | |
| 4 | CONNECT BY WITHOUT FILTERING| | 1 | | 5 |00:00:00.01 | 3 | | | |
| 5 | VIEW | | 1 | 1 | 1 |00:00:00.01 | 3 | | | |
| 6 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.01 | 3 | | | |
| 7 | TABLE ACCESS FULL | YOUR_TABLE | 1 | 7 | 7 |00:00:00.01 | 3 | | | |
| 8 | SORT UNIQUE | | 1 | 7 | 5 |00:00:00.01 | 3 | 9216 | 9216 | 8192 (0)|
| 9 | TABLE ACCESS FULL | YOUR_TABLE | 1 | 7 | 7 |00:00:00.01 | 3 | | | |
-----------------------------------------------------------------------------------------------------------------------------------
22 rows selected.
Regards,
Rob.
We can use a simple hierarchical query, as shown below:
WITH CTE AS (
  SELECT (SELECT MIN(COL1) FROM T) + LEVEL - 1 AS OUT
  FROM DUAL
  CONNECT BY (LEVEL - 1) <= (SELECT MAX(COL1) - MIN(COL1) FROM T)
)
SELECT OUT
FROM CTE
WHERE OUT NOT IN (SELECT COL1 FROM T);
You need a Calendar table (permanent or created on the fly). Then you can do a simple:
SELECT c.my_date
FROM calendar c
  JOIN ( SELECT MIN(date_column) AS min_date
              , MAX(date_column) AS max_date
         FROM tableX
       ) mm
    ON c.my_date BETWEEN mm.min_date AND mm.max_date
WHERE c.my_date NOT IN
      ( SELECT date_column
        FROM tableX
      )
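If no calendar table exists yet, one can be generated. The following is a minimal sketch using Python's built-in sqlite3 module; the calendar range (all of 2012), the ISO-8601 text dates and the sample rows are assumptions for illustration. It reuses the names calendar, my_date, tableX and date_column from the query above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Sample data from the question, stored as ISO-8601 text (assumption).
conn.execute("create table tableX (date_column text)")
sample = [
    "2012-01-02", "2012-01-02", "2012-01-03", "2012-01-05",
    "2012-01-05", "2012-01-07", "2012-01-08",
]
conn.executemany("insert into tableX values (?)", [(d,) for d in sample])

# Permanent calendar table, filled once with every day of 2012.
# The range is hypothetical; pick one that covers your data.
conn.execute("create table calendar (my_date text primary key)")
conn.execute("""
with recursive days (d) as (
  select '2012-01-01'
  union all
  select date(d, '+1 day') from days where d < '2012-12-31'
)
insert into calendar (my_date)
select d from days
""")

calendar_rows = conn.execute("select count(*) from calendar").fetchone()[0]

# Calendar days between the oldest and newest date that are absent from tableX.
sql = """
select c.my_date
from calendar c
  join ( select min(date_column) as min_date
              , max(date_column) as max_date
         from tableX
       ) mm
    on c.my_date between mm.min_date and mm.max_date
where c.my_date not in (select date_column from tableX)
order by c.my_date
"""

missing = [row[0] for row in conn.execute(sql)]
print(calendar_rows, missing)
```

One caveat with NOT IN: if date_column can contain NULL, the predicate never evaluates to true and the query returns no rows. Filtering out NULLs in the subquery, or using NOT EXISTS, avoids that.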