python - Date intersection and space availability

Question

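I'm currently trying to check the availability of a "space" within a date range, where the date range can be indefinitely long. The tables are as follows: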

Space:

id  available_spaces  name
1   20                Space 1
2   40                Space 2
3   10                Space 3

Booking (end_date can be null, which means an open-ended booking):

id  space_id  start_date  end_date    spaces
1   1         13/12-2017  null        9
1   1         13/12-2017  13/12-2018  10

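I'd then like to be able to run a search, for example:

From: 11/12-2016
To: null (again meaning endless)
Spaces: 2

This query should return the spaces Space 2 and Space 3, since both have enough availability during that interval.

Changing the number of spaces needed in the search from 2 to 1 should change the result. Search:

From: 11/12-2016
To: null (again meaning endless)
Spaces: 1

Result: Space 1, Space 2, Space 3.

What I find hard to solve is the variable number of spaces available from month to month, and the possibility of open-ended bookings.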

score 4 · Accepted Answer

Revisit

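As usual, SQL offers more than one way to solve a given task. The originally proposed solution (further below) uses a self-join, but another approach is to use window functions. The idea is to increment the used spaces each time a booking starts and to decrement them when it ends: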

with bs as (
    select space_id as _sid
         , unnest(array[start_date,
                        coalesce(end_date, date 'infinity')]) as _d
         , unnest(array[spaces, -spaces]) as _sp
    from booking
    where end_date is null or end_date >= '2016-12-11'),
cs as (
    select _sid
        -- The inner sum collapses starting and ending bookings on the same
        -- date to a single spaces value, the outer is the running sum. This
        -- avoids the problem where the order of bookings starting or ending
        -- on the same date is unspecified and would produce possibly falsely
        -- high values for spaces, if all starting bookings just so happen to
        -- come first.
         , sum(sum(_sp)) over (partition by _sid
                               order by _d) as _usp
    from bs
    group by _sid, _d)
select *
from space
where not exists (
    select from cs
    where cs._sid = space.id
      and space.available_spaces - cs._usp < 2)

The same in Python/SQLAlchemy:

from sqlalchemy import func, or_
from sqlalchemy.dialects.postgresql import array

bs = session.query(
        Booking.space_id,
        func.unnest(array([
            Booking.start_date,
            func.coalesce(Booking.end_date, func.date('infinity'))
        ])).label('date'),
        func.unnest(array([Booking.spaces, -Booking.spaces])).label('spaces')).\
    filter(or_(Booking.end_date == None,
               Booking.end_date >= '2016-12-11')).\
    cte()

cs = session.query(bs.c.space_id,
                   func.sum(func.sum(bs.c.spaces)).over(
                       partition_by=bs.c.space_id,
                       order_by=bs.c.date).label('spaces')).\
    group_by(bs.c.space_id, bs.c.date).\
    cte()

query = session.query(Space).\
    filter(~session.query(cs).
           filter(cs.c.space_id == Space.id,
                  Space.available_spaces - cs.c.spaces < 2).
           exists())
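To see the running-sum idea outside the database, here is a minimal plain-Python sketch of the same sweep. The function names `peak_usage` and `available_spaces` and the tuple layouts are assumptions made for illustration, and `date.max` stands in for PostgreSQL's `date 'infinity'`.

```python
from collections import defaultdict
from datetime import date


def peak_usage(bookings, search_start):
    """Return {space_id: peak spaces in use} over the relevant bookings.

    bookings: iterable of (space_id, start_date, end_date_or_None, spaces).
    """
    # (space_id, day) -> net change in used spaces on that day. Summing
    # per day first mirrors the inner sum() of the SQL version.
    deltas = defaultdict(int)
    for space_id, start, end, spaces in bookings:
        if end is not None and end < search_start:
            continue  # ended before the search window begins
        deltas[(space_id, start)] += spaces
        # date.max is an assumed stand-in for date 'infinity'
        deltas[(space_id, end if end is not None else date.max)] -= spaces

    running = defaultdict(int)
    peaks = defaultdict(int)
    for (space_id, _day), change in sorted(deltas.items()):
        running[space_id] += change  # the outer, running sum
        peaks[space_id] = max(peaks[space_id], running[space_id])
    return dict(peaks)


def available_spaces(spaces, bookings, search_start, needed):
    """spaces: iterable of (id, available_spaces, name)."""
    peaks = peak_usage(bookings, search_start)
    # Unlike the SQL, a space without bookings is also checked
    # against its own capacity here.
    return [name for space_id, capacity, name in spaces
            if capacity - peaks.get(space_id, 0) >= needed]


# The example data from the question.
spaces = [(1, 20, "Space 1"), (2, 40, "Space 2"), (3, 10, "Space 3")]
bookings = [
    (1, date(2017, 12, 13), None, 9),
    (1, date(2017, 12, 13), date(2018, 12, 13), 10),
]
print(available_spaces(spaces, bookings, date(2016, 12, 11), needed=2))
print(available_spaces(spaces, bookings, date(2016, 12, 11), needed=1))
```

With the question's data the peak for space 1 is 9 + 10 = 19, so it drops out of a search for 2 spaces but not out of a search for 1.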

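The originally proposed solution follows. It is easier to first explain how the query works using SQL and then build the SQLAlchemy version. I will assume that bookings and searches always have a start, or in other words that they can be unbounded only at the end. Using range types and operators, you should start by finding the bookings that overlap your search.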

select *
from booking
where daterange(start_date, end_date, '[)')
   && daterange('2016-12-11', null, '[)');
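The `&&` operator tests whether two ranges share any point. For half-open `[start, end)` ranges with an optional upper bound, that test could be sketched in Python as follows. `overlaps` is a hypothetical helper, and it assumes `start < end` for every bounded range, since it does not model PostgreSQL's empty ranges.

```python
from datetime import date


def overlaps(start_a, end_a, start_b, end_b):
    """Overlap test for half-open ranges [start, end); end=None is unbounded.

    Assumes start < end whenever end is given (no empty ranges).
    """
    a_starts_before_b_ends = end_b is None or start_a < end_b
    b_starts_before_a_ends = end_a is None or start_b < end_a
    return a_starts_before_b_ends and b_starts_before_a_ends


search = (date(2016, 12, 11), None)

# Both example bookings start after the search start and therefore overlap it.
print(overlaps(date(2017, 12, 13), None, *search))
print(overlaps(date(2017, 12, 13), date(2018, 12, 13), *search))
# A booking ending exactly on the search start does not: the end is exclusive.
print(overlaps(date(2015, 1, 1), date(2016, 12, 11), *search))
```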

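From the bookings found, you need to find the intersections and sum up the used spaces. To find the intersections, take the start of a booking and look up the bookings that contain it. Repeat for all the bookings at hand. For example: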

|-------| 5
.  .  .
.  |-------------| 2
.  .  .
.  .  |-------------------- 3
.  .  .              .
.  .  .              |---| 1
.  .  .              .
5  7  10             4
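The figure can be reproduced with a small Python sketch. The dates are made up to match the layout of the diagram, and `contains` and `usage_at_starts` are hypothetical helpers that mirror the `<@` containment test on half-open ranges.

```python
from datetime import date


def contains(start, end, point):
    """True if point lies in the half-open range [start, end); end=None is unbounded."""
    return start <= point and (end is None or point < end)


def usage_at_starts(bookings):
    """For each booking, sum the spaces of all bookings containing its start.

    bookings: list of (start, end_or_None, spaces) for a single space.
    """
    return [
        sum(spaces for start, end, spaces in bookings
            if contains(start, end, b_start))
        for b_start, _end, _spaces in bookings
    ]


# Assumed dates, chosen only to match the shape of the diagram.
diagram = [
    (date(2017, 1, 1), date(2017, 1, 9), 5),
    (date(2017, 1, 4), date(2017, 1, 18), 2),
    (date(2017, 1, 7), None, 3),
    (date(2017, 1, 22), date(2017, 1, 26), 1),
]
print(usage_at_starts(diagram))  # [5, 7, 10, 4]
```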

And in query form:

with bs as (
    select *
    from booking
    where daterange(start_date, end_date, '[)')
       && daterange('2016-12-11', null, '[)')
)
select distinct
       b1.space_id,
       sum(b2.spaces) as sum
from bs b1
join bs b2
  on b1.start_date <@ daterange(b2.start_date, b2.end_date, '[)')
 and b1.space_id = b2.space_id
group by b1.id, b1.space_id;

Given your example data, this results in

 space_id | sum 
----------+-----
        1 |  19
(1 row)

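There is a single row because there are only 2 bookings and they have the same start date. The query is far from optimal: for each range it has to scan through all the ranges, so it is at least O(n^2). In a procedural setting you would use an interval tree or the like for the lookups, and the SQL could probably also be improved with suitable indexes and changes.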
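As a rough illustration of the procedural route, the sketch below avoids the quadratic scan. It is not an interval tree but a simpler sort-and-heap sweep, which is enough here because only start points are looked up; sorting dominates, so it runs in O(n log n). The name `usage_at_starts_sorted` and the tuple layout are assumptions, and every bounded booking is assumed to have `start < end`.

```python
import heapq
from datetime import date


def usage_at_starts_sorted(bookings):
    """Return {start_date: spaces in use on that date} for a single space.

    bookings: list of (start, end_or_None, spaces).
    """
    active = []  # min-heap of (end, spaces) for bookings currently counted
    in_use = 0
    result = {}
    for start, end, spaces in sorted(bookings, key=lambda b: b[0]):
        # Drop bookings that ended on or before this start (end is exclusive).
        while active and active[0][0] <= start:
            in_use -= heapq.heappop(active)[1]
        # date.max is an assumed stand-in for an unbounded end.
        heapq.heappush(active, (end if end is not None else date.max, spaces))
        in_use += spaces
        # Bookings sharing a start date are adjacent after sorting, so the
        # last of them overwrites the entry with the full total.
        result[start] = in_use
    return result


example = [
    (date(2017, 12, 13), None, 9),
    (date(2017, 12, 13), date(2018, 12, 13), 10),
]
print(usage_at_starts_sorted(example))
```

For the two example bookings this yields a single entry with 19 spaces in use, matching the query result above.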

With the sums of the intersecting bookings you can then check that no sum exists that leaves fewer spaces than the search requires:

with bs as (
        select *
        from booking
        where daterange(start_date, end_date, '[)')
           && daterange('2016-12-11', null, '[)')
), cs as (
        select distinct
               b1.space_id,
               sum(b2.spaces) as sum
        from bs b1
        join bs b2
          on b1.start_date <@ daterange(b2.start_date, b2.end_date, '[)')
         and b1.space_id = b2.space_id
        -- Could also use distinct & sum() over (partition by b1.id) instead
        group by b1.id, b1.space_id
)
select *
from space
where not exists(
        select 1
        from cs
        where cs.space_id = space.id
              -- Check if there is not enough space
          and space.available_spaces - cs.sum < 2
);

From this it is straightforward to form the SQLAlchemy version:

from functools import partial
from sqlalchemy import func
from sqlalchemy.dialects.postgresql import DATERANGE

# Hack. Proper type for passing daterange values is
# psycopg2.extras.DateRange, but that does not have
# the comparator methods.
daterange = partial(func.daterange, type_=DATERANGE)

bs = session.query(Booking).\
    filter(daterange(Booking.start_date, Booking.end_date, '[)').
           overlaps(daterange('2016-12-11', None, '[)'))).\
    cte()

bs1 = bs.alias()
bs2 = bs.alias()

cs = session.query(bs1.c.space_id,
                   func.sum(bs2.c.spaces).label('sum')).\
    distinct().\
    join(bs2, (bs2.c.space_id == bs1.c.space_id) &
              daterange(bs2.c.start_date,
                        bs2.c.end_date).contains(bs1.c.start_date)).\
    group_by(bs1.c.id, bs1.c.space_id).\
    cte()

query = session.query(Space).\
    filter(~session.query(cs).
           filter(cs.c.space_id == Space.id,
                  Space.available_spaces - cs.c.sum < 2).
           exists())
