Consider the following Loans table:

BorrowerID       StartDate         DueDate
=============================================
1                2012-09-02        2012-10-01
2                2012-10-05        2012-10-21
3                2012-11-07        2012-11-09
4                2012-12-01        2013-01-01
4                2012-12-01        2013-01-14
1                2012-12-20        2013-01-06
3                2013-01-07        2013-01-22
3                2013-01-15        2013-01-18
1                2013-02-20        2013-02-24

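How would I select the BorrowerIDs of those who have only ever had one loan at a time? This includes borrowers who have only ever taken out one loan, as well as those who have taken out more than one, provided that none of their loans would overlap if you drew them on a timeline. For example, in the table above, it should find borrowers 1 and 2 only.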

I have tried joining the table to itself, but did not really get anywhere. Any pointers are much appreciated!

4 Answers

Solution for dbo.Loans with a PRIMARY KEY

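To solve this you need a two-step approach, as detailed in the following SQL Fiddle. I added a LoanID column to your example data, and the query requires that such a unique ID exists. If you do not have one, you need to adjust the join clause to make sure that a loan is not matched with itself.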
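As an aside, one possible way to make that adjustment is to derive a temporary row number and use it in place of LoanID. The following is only a sketch under stated assumptions: the names NumberedLoans and the derived LoanID are illustrative, and a few sample rows are inlined so that the statement runs on its own. Ordering by all three columns means ties can only occur between identical rows, so it does not matter which of them receives which number.

```sql
-- Sketch only: NumberedLoans and the derived LoanID are illustrative names.
-- Sample rows are inlined so the statement is self-contained (SQL Server 2008+).
WITH Loans (BorrowerID, StartDate, DueDate) AS (
  SELECT BorrowerID, CAST(StartDate AS datetime), CAST(DueDate AS datetime)
    FROM (VALUES
      (1, '20120902', '20121001'),
      (4, '20121201', '20130101'),
      (4, '20121201', '20130114'),
      (5, '20130220', '20130224'),
      (5, '20130220', '20130224')   -- exact duplicate of the previous row
    ) AS v (BorrowerID, StartDate, DueDate)
),
NumberedLoans AS (
  -- ROW_NUMBER gives every row a distinct number, even exact duplicates.
  SELECT ROW_NUMBER() OVER (ORDER BY BorrowerID, StartDate, DueDate) AS LoanID,
         BorrowerID, StartDate, DueDate
    FROM Loans
)
-- List every loan that overlaps another loan of the same borrower.
SELECT L1.LoanID, L1.BorrowerID, L1.StartDate, L1.DueDate
  FROM NumberedLoans L1
 WHERE EXISTS (SELECT 1
                 FROM NumberedLoans L2
                WHERE L2.BorrowerID = L1.BorrowerID
                  AND L2.LoanID <> L1.LoanID      -- never match a loan to itself
                  AND L1.StartDate <= L2.DueDate
                  AND L2.StartDate <= L1.DueDate);
```

With these sample rows the query should list the two loans of borrower 4 and the two identical loans of borrower 5, but not the single loan of borrower 1.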

MS SQL Server 2008 Schema Setup:

CREATE TABLE dbo.Loans
    (LoanID INT, [BorrowerID] int, [StartDate] datetime, [DueDate] datetime)
GO

INSERT INTO dbo.Loans
    (LoanID, [BorrowerID], [StartDate], [DueDate])
VALUES
    (1, 1, '2012-09-02 00:00:00', '2012-10-01 00:00:00'),
    (2, 2, '2012-10-05 00:00:00', '2012-10-21 00:00:00'),
    (3, 3, '2012-11-07 00:00:00', '2012-11-09 00:00:00'),
    (4, 4, '2012-12-01 00:00:00', '2013-01-01 00:00:00'),
    (5, 4, '2012-12-01 00:00:00', '2013-01-14 00:00:00'),
    (6, 1, '2012-12-20 00:00:00', '2013-01-06 00:00:00'),
    (7, 3, '2013-01-07 00:00:00', '2013-01-22 00:00:00'),
    (8, 3, '2013-01-15 00:00:00', '2013-01-18 00:00:00'),
    (9, 1, '2013-02-20 00:00:00', '2013-02-24 00:00:00')
GO

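First you need to find out which loans overlap with another loan. The query uses <= to compare the start date and the due date. This counts a loan that starts on the same day another loan is due as overlapping. If you need those not to count as overlapping, use < instead in both places.

Query 1: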

SELECT 
   *,
   CASE WHEN EXISTS(SELECT 1 FROM dbo.Loans L2 
                     WHERE L2.BorrowerID = L1.BorrowerID
                       AND L2.LoanID <> L1.LoanID
                       AND L1.StartDate <= L2.DueDate
                       AND L2.StartDate <= L1.DueDate) 
        THEN 1
        ELSE 0
   END AS HasOverlappingLoan
  FROM dbo.Loans L1;

Results:

| LOANID | BORROWERID |                        STARTDATE |                         DUEDATE | HASOVERLAPPINGLOAN |
|--------|------------|----------------------------------|---------------------------------|--------------------|
|      1 |          1 | September, 02 2012 00:00:00+0000 |  October, 01 2012 00:00:00+0000 |                  0 |
|      2 |          2 |   October, 05 2012 00:00:00+0000 |  October, 21 2012 00:00:00+0000 |                  0 |
|      3 |          3 |  November, 07 2012 00:00:00+0000 | November, 09 2012 00:00:00+0000 |                  0 |
|      4 |          4 |  December, 01 2012 00:00:00+0000 |  January, 01 2013 00:00:00+0000 |                  1 |
|      5 |          4 |  December, 01 2012 00:00:00+0000 |  January, 14 2013 00:00:00+0000 |                  1 |
|      6 |          1 |  December, 20 2012 00:00:00+0000 |  January, 06 2013 00:00:00+0000 |                  0 |
|      7 |          3 |   January, 07 2013 00:00:00+0000 |  January, 22 2013 00:00:00+0000 |                  1 |
|      8 |          3 |   January, 15 2013 00:00:00+0000 |  January, 18 2013 00:00:00+0000 |                  1 |
|      9 |          1 |  February, 20 2013 00:00:00+0000 | February, 24 2013 00:00:00+0000 |                  0 |
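To illustrate the difference between <= and <, here is a small sketch of the strict variant. The four sample loans are made up for this example and are not part of the question's data; they are inlined so that the statement runs on its own.

```sql
-- Sketch: the same overlap test with strict comparisons, so a loan that
-- starts on the day another one is due is not treated as overlapping.
-- The sample rows below are hypothetical.
WITH Loans (LoanID, BorrowerID, StartDate, DueDate) AS (
  SELECT LoanID, BorrowerID, CAST(StartDate AS datetime), CAST(DueDate AS datetime)
    FROM (VALUES
      (1, 1, '20130101', '20130110'),
      (2, 1, '20130110', '20130120'),   -- starts the day loan 1 is due
      (3, 2, '20130101', '20130110'),
      (4, 2, '20130105', '20130120')    -- starts while loan 3 is still open
    ) AS v (LoanID, BorrowerID, StartDate, DueDate)
)
SELECT L1.*,
       CASE WHEN EXISTS (SELECT 1
                           FROM Loans L2
                          WHERE L2.BorrowerID = L1.BorrowerID
                            AND L2.LoanID <> L1.LoanID
                            AND L1.StartDate < L2.DueDate     -- strict
                            AND L2.StartDate < L1.DueDate)    -- strict
            THEN 1
            ELSE 0
       END AS HasOverlappingLoan
  FROM Loans L1;
```

Here loans 1 and 2 only touch on 2013-01-10, so they should come back with 0, while loans 3 and 4 should come back with 1. With <= in both places, all four loans would be flagged.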

Now, using that information, you can determine the borrowers that have no overlapping loans with this query:

Query 2:

WITH OverlappingLoans AS (
  SELECT 
   *,
   CASE WHEN EXISTS(SELECT 1 FROM dbo.Loans L2 
                     WHERE L2.BorrowerID = L1.BorrowerID
                       AND L2.LoanID <> L1.LoanID
                       AND L1.StartDate <= L2.DueDate
                       AND L2.StartDate <= L1.DueDate) 
        THEN 1
        ELSE 0
   END AS HasOverlappingLoan
  FROM dbo.Loans L1
),
OverlappingBorrower AS (
  SELECT BorrowerID, MAX(HasOverlappingLoan) HasOverlappingLoan
    FROM OverlappingLoans
   GROUP BY BorrowerID
)
SELECT * 
  FROM OverlappingBorrower
 WHERE HasOverlappingLoan = 0;

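Results:

| BORROWERID | HASOVERLAPPINGLOAN |
|------------|--------------------|
|          1 |                  0 |
|          2 |                  0 |

Or you could get even more information by counting, for each borrower in the database, the total number of loans and the number of loans that overlap with another loan. (Note: if loan A and loan B overlap, this query counts both of them as overlapping loans.)

Query 3: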

WITH OverlappingLoans AS (
  SELECT 
   *,
   CASE WHEN EXISTS(SELECT 1 FROM dbo.Loans L2 
                     WHERE L2.BorrowerID = L1.BorrowerID
                       AND L2.LoanID <> L1.LoanID
                       AND L1.StartDate <= L2.DueDate
                       AND L2.StartDate <= L1.DueDate) 
        THEN 1
        ELSE 0
   END AS HasOverlappingLoan
  FROM dbo.Loans L1
)
SELECT BorrowerID, COUNT(1) LoanCount, SUM(HasOverlappingLoan) OverlappingCount
  FROM OverlappingLoans
 GROUP BY BorrowerID;

Results:

| BORROWERID | LOANCOUNT | OVERLAPPINGCOUNT |
|------------|-----------|------------------|
|          1 |         3 |                0 |
|          2 |         1 |                0 |
|          3 |         3 |                2 |
|          4 |         2 |                2 |


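Solution for dbo.Loans without a PRIMARY KEY

UPDATE: Because the requirement actually calls for a solution that does not rely on a unique identifier for each loan, I made the following changes:

1) I added a borrower who has two loans with the same start and due dates.

SQL Fiddle

MS SQL Server 2008 Schema Setup: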

CREATE TABLE dbo.Loans
    ([BorrowerID] int, [StartDate] datetime, [DueDate] datetime)
GO

INSERT INTO dbo.Loans
    ([BorrowerID], [StartDate], [DueDate])
VALUES
    ( 1, '2012-09-02 00:00:00', '2012-10-01 00:00:00'),
    ( 2, '2012-10-05 00:00:00', '2012-10-21 00:00:00'),
    ( 3, '2012-11-07 00:00:00', '2012-11-09 00:00:00'),
    ( 4, '2012-12-01 00:00:00', '2013-01-01 00:00:00'),
    ( 4, '2012-12-01 00:00:00', '2013-01-14 00:00:00'),
    ( 1, '2012-12-20 00:00:00', '2013-01-06 00:00:00'),
    ( 3, '2013-01-07 00:00:00', '2013-01-22 00:00:00'),
    ( 3, '2013-01-15 00:00:00', '2013-01-18 00:00:00'),
    ( 1, '2013-02-20 00:00:00', '2013-02-24 00:00:00'),
    ( 5, '2013-02-20 00:00:00', '2013-02-24 00:00:00'),
    ( 5, '2013-02-20 00:00:00', '2013-02-24 00:00:00')
GO

2) Those "equal date" loans require an extra step:

Query 1:

SELECT BorrowerID, StartDate, DueDate, COUNT(1) LoanCount
  FROM dbo.Loans
 GROUP BY BorrowerID, StartDate, DueDate;

Results:

| BORROWERID |                        STARTDATE |                         DUEDATE | LOANCOUNT |
|------------|----------------------------------|---------------------------------|-----------|
|          1 | September, 02 2012 00:00:00+0000 |  October, 01 2012 00:00:00+0000 |         1 |
|          1 |  December, 20 2012 00:00:00+0000 |  January, 06 2013 00:00:00+0000 |         1 |
|          1 |  February, 20 2013 00:00:00+0000 | February, 24 2013 00:00:00+0000 |         1 |
|          2 |   October, 05 2012 00:00:00+0000 |  October, 21 2012 00:00:00+0000 |         1 |
|          3 |  November, 07 2012 00:00:00+0000 | November, 09 2012 00:00:00+0000 |         1 |
|          3 |   January, 07 2013 00:00:00+0000 |  January, 22 2013 00:00:00+0000 |         1 |
|          3 |   January, 15 2013 00:00:00+0000 |  January, 18 2013 00:00:00+0000 |         1 |
|          4 |  December, 01 2012 00:00:00+0000 |  January, 01 2013 00:00:00+0000 |         1 |
|          4 |  December, 01 2012 00:00:00+0000 |  January, 14 2013 00:00:00+0000 |         1 |
|          5 |  February, 20 2013 00:00:00+0000 | February, 24 2013 00:00:00+0000 |         2 |

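3) Now that each loan range is unique, we can use the old technique again. However, we also need to account for those "equal date" loans. The condition (L1.StartDate <> L2.StartDate OR L1.DueDate <> L2.DueDate) prevents a loan from being matched with itself, and OR LoanCount > 1 accounts for the "equal date" loans.

Query 2: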

WITH NormalizedLoans AS (
  SELECT BorrowerID, StartDate, DueDate, COUNT(1) LoanCount
    FROM dbo.Loans
   GROUP BY BorrowerID, StartDate, DueDate  
)
SELECT 
   *,
   CASE WHEN EXISTS(SELECT 1 FROM dbo.Loans L2 
                     WHERE L2.BorrowerID = L1.BorrowerID
                       AND L1.StartDate <= L2.DueDate
                       AND L2.StartDate <= L1.DueDate
                       AND (L1.StartDate <> L2.StartDate
                            OR L1.DueDate <> L2.DueDate)
                   ) 
             OR LoanCount > 1
        THEN 1
        ELSE 0
   END AS HasOverlappingLoan
  FROM NormalizedLoans L1;

Results:

| BORROWERID |                        STARTDATE |                         DUEDATE | LOANCOUNT | HASOVERLAPPINGLOAN |
|------------|----------------------------------|---------------------------------|-----------|--------------------|
|          1 | September, 02 2012 00:00:00+0000 |  October, 01 2012 00:00:00+0000 |         1 |                  0 |
|          1 |  December, 20 2012 00:00:00+0000 |  January, 06 2013 00:00:00+0000 |         1 |                  0 |
|          1 |  February, 20 2013 00:00:00+0000 | February, 24 2013 00:00:00+0000 |         1 |                  0 |
|          2 |   October, 05 2012 00:00:00+0000 |  October, 21 2012 00:00:00+0000 |         1 |                  0 |
|          3 |  November, 07 2012 00:00:00+0000 | November, 09 2012 00:00:00+0000 |         1 |                  0 |
|          3 |   January, 07 2013 00:00:00+0000 |  January, 22 2013 00:00:00+0000 |         1 |                  1 |
|          3 |   January, 15 2013 00:00:00+0000 |  January, 18 2013 00:00:00+0000 |         1 |                  1 |
|          4 |  December, 01 2012 00:00:00+0000 |  January, 01 2013 00:00:00+0000 |         1 |                  1 |
|          4 |  December, 01 2012 00:00:00+0000 |  January, 14 2013 00:00:00+0000 |         1 |                  1 |
|          5 |  February, 20 2013 00:00:00+0000 | February, 24 2013 00:00:00+0000 |         2 |                  1 |

The logic of this query has not changed, apart from the different beginning (the NormalizedLoans CTE).

Query 3:

WITH NormalizedLoans AS (
  SELECT BorrowerID, StartDate, DueDate, COUNT(1) LoanCount
    FROM dbo.Loans
   GROUP BY BorrowerID, StartDate, DueDate  
),
OverlappingLoans AS (
SELECT 
   *,
   CASE WHEN EXISTS(SELECT 1 FROM dbo.Loans L2 
                     WHERE L2.BorrowerID = L1.BorrowerID
                       AND L1.StartDate <= L2.DueDate
                       AND L2.StartDate <= L1.DueDate
                       AND (L1.StartDate <> L2.StartDate
                            OR L1.DueDate <> L2.DueDate)
                   ) 
             OR LoanCount > 1
        THEN 1
        ELSE 0
   END AS HasOverlappingLoan
  FROM NormalizedLoans L1
),
OverlappingBorrower AS (
  SELECT BorrowerID, MAX(HasOverlappingLoan) HasOverlappingLoan
    FROM OverlappingLoans
   GROUP BY BorrowerID
)
SELECT * 
  FROM OverlappingBorrower
 WHERE HasOverlappingLoan = 0;

Results:

| BORROWERID | HASOVERLAPPINGLOAN |
|------------|--------------------|
|          1 |                  0 |
|          2 |                  0 |

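4) In this counting query we need to incorporate the "equal date" loan counts again. For that we use SUM(LoanCount) instead of a plain COUNT. We also have to multiply HasOverlappingLoan by LoanCount to get the correct overlapping count again.

Query 4: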

WITH NormalizedLoans AS (
  SELECT BorrowerID, StartDate, DueDate, COUNT(1) LoanCount
    FROM dbo.Loans
   GROUP BY BorrowerID, StartDate, DueDate  
),
OverlappingLoans AS (
SELECT 
   *,
   CASE WHEN EXISTS(SELECT 1 FROM dbo.Loans L2 
                     WHERE L2.BorrowerID = L1.BorrowerID
                       AND L1.StartDate <= L2.DueDate
                       AND L2.StartDate <= L1.DueDate
                       AND (L1.StartDate <> L2.StartDate
                            OR L1.DueDate <> L2.DueDate)
                   ) 
             OR LoanCount > 1
        THEN 1
        ELSE 0
   END AS HasOverlappingLoan
  FROM NormalizedLoans L1
)
SELECT BorrowerID, SUM(LoanCount) LoanCount, SUM(HasOverlappingLoan * LoanCount) OverlappingCount
  FROM OverlappingLoans
 GROUP BY BorrowerID;

Results:

| BORROWERID | LOANCOUNT | OVERLAPPINGCOUNT |
|------------|-----------|------------------|
|          1 |         3 |                0 |
|          2 |         1 |                0 |
|          3 |         3 |                2 |
|          4 |         2 |                2 |
|          5 |         2 |                2 |

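I would strongly suggest finding a way to use my first solution, as a loan table without a primary key is a, let's say, "odd" design. However, if you really cannot get there, use the second solution.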
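If the table can be changed, one way to get there might be to add an auto-numbered key column. This is only a sketch: it assumes the dbo.Loans table from the second schema setup above (the one without LoanID), and the constraint name PK_Loans is an illustrative choice.

```sql
-- Sketch: add an auto-numbered key to the existing key-less table.
-- Assumes dbo.Loans (BorrowerID, StartDate, DueDate) already exists;
-- the constraint name PK_Loans is hypothetical.
ALTER TABLE dbo.Loans
  ADD LoanID INT IDENTITY(1,1) NOT NULL
      CONSTRAINT PK_Loans PRIMARY KEY;
```

Existing rows receive a number when the column is added, but the order in which they are numbered should not be relied upon. Once the column exists, the queries from the first solution can be used as they are.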

answered 2013-10-13T23:22:34.220

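I got it working, but in a slightly convoluted way. The inner query first collects the borrowers who do not meet the criteria, and the outer query then returns everyone else. The inner query has two parts:

1) Get all overlapping borrowings that do not start on the same day.

2) Get all borrowings by the same borrower that start on the same date.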

select distinct BorrowerID from borrowings
where BorrowerID NOT IN

(
    select b1.BorrowerID from borrowings b1
    inner join borrowings b2
        on b1.BorrowerID = b2.BorrowerID
        and b1.StartDate < b2.StartDate
        and b1.DueDate > b2.StartDate

    union 

    select BorrowerID from borrowings
    group by BorrowerID, StartDate
    having count(*) > 1
)

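I had to use two separate inner queries because your table has no unique identifier for each record, and using b1.StartDate <= b2.StartDate, as I should have, would join each record to itself. It would be better to have a separate identifier for each record.

answered 2013-10-13T22:57:13.543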

If you are using SQL Server 2012, you can do it like this:

with cte as (
select 
    BorrowerID, 
    StartDate, 
    DueDate,
    lag(DueDate) over (partition by borrowerid order by StartDate, DueDate) as PrevDueDate
from test
)

select 
    distinct BorrowerID 
from cte
where BorrowerID not in
    (select BorrowerID 
    from cte 
    where StartDate <= PrevDueDate)
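
Comparing each loan only with the previous one is enough to find the affected borrowers: once a borrower's loans are sorted by start date, any two overlapping loans imply that some pair of neighbouring loans overlaps as well. On versions before SQL Server 2012, where LAG is not available, the same neighbour comparison could be sketched with ROW_NUMBER and a self-join. The names Ordered, rn, cur and prev are illustrative, and the question's rows are inlined so that the statement runs on its own.

```sql
-- Sketch for SQL Server 2005/2008: emulate LAG with ROW_NUMBER and a self-join.
-- Ordered, rn, cur and prev are illustrative names.
WITH Loans (BorrowerID, StartDate, DueDate) AS (
  SELECT BorrowerID, CAST(StartDate AS datetime), CAST(DueDate AS datetime)
    FROM (VALUES
      (1, '20120902', '20121001'),
      (2, '20121005', '20121021'),
      (3, '20121107', '20121109'),
      (4, '20121201', '20130101'),
      (4, '20121201', '20130114'),
      (1, '20121220', '20130106'),
      (3, '20130107', '20130122'),
      (3, '20130115', '20130118'),
      (1, '20130220', '20130224')
    ) AS v (BorrowerID, StartDate, DueDate)
),
Ordered AS (
  -- Number each borrower's loans in start-date order.
  SELECT BorrowerID, StartDate, DueDate,
         ROW_NUMBER() OVER (PARTITION BY BorrowerID
                            ORDER BY StartDate, DueDate) AS rn
    FROM Loans
)
SELECT DISTINCT BorrowerID
  FROM Ordered
 WHERE BorrowerID NOT IN (
         -- Borrowers with a loan that starts before the previous one is due.
         SELECT cur.BorrowerID
           FROM Ordered cur
           JOIN Ordered prev
             ON prev.BorrowerID = cur.BorrowerID
            AND prev.rn = cur.rn - 1
          WHERE cur.StartDate <= prev.DueDate);
```

With the question's data this should return borrowers 1 and 2.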

answered 2013-10-14T00:56:33.670

Try:

with cte as 
(
    select *, 
      row_number() over (partition by b order by s) r
      from loans
 )

select l1.b
from loans l1
except
select c1.b
from cte c1
where exists (
    select 1
    from cte c2 
    where c2.b = c1.b
    and c2.r <> c1.r
    and (c2.s between c1.s and c1.e
             or c1.s between c2.s and c2.e)
 )

answered 2013-10-14T00:05:12.677