sql - Calculating different tariff-periods for a call in SQL Server

Question

For a call-rating system, I'm trying to split a telephone call duration into sub-durations for different tariff-periods. The calls are stored in a SQL Server database and have a starttime and total duration. Rates are different for night (0000 - 0800), peak (0800 - 1900) and offpeak (1900-235959) periods.

For example: A call starts at 18:50:00 and has a duration of 1000 seconds. This would make the call end at 19:06:40, making it 10 minutes / 600 seconds in the peak-tariff and 400 seconds in the off-peak tariff.

Obviously, a call can wrap over an unlimited number of periods (we do not enforce a maximum call duration). A call lasting > 24 h can wrap all 3 periods, starting in peak, going through off-peak, night and back into peak tariff.

Currently, we are calculating the different tariff-periods using recursion in VB. We calculate how much of the call falls in the tariff-period the call starts in, adjust the starttime and duration of the call accordingly, and repeat this process until the full duration of the call has been reached (peakDuration + offpeakDuration + nightDuration == callDuration).

Regarding this issue, I have 2 questions:

Is it possible to do this effectively in a SQL Server statement? (I can think of subqueries or lots of coding in stored procedures, but that would not generate any performance improvement)
Will SQL Server be able to do such calculations in a way more resource-effective than the current VB scripts are doing it?

score 2 · Accepted Answer

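This appears to me to be an operation with two phases.

Determine which parts of the phone call use which rates at which time.
Sum the times in each of the rates.

Phase 1 is trickier than Phase 2. I've worked the example in IBM Informix Dynamic Server (IDS) because I don't have MS SQL Server. The ideas should translate easily enough. The INTO TEMP clause creates a temporary table with an appropriate schema; the table is private to the session and vanishes when the session ends (or when you explicitly drop it). In IDS, you can also use an explicit CREATE TEMP TABLE statement and then INSERT INTO temp-table SELECT ... as a more verbose way of doing the same job as INTO TEMP.

As so often in SQL questions on SO, you've not provided us with a schema, so everyone has to invent a schema that might, or might not, match what you describe.

Let's assume your data is in two tables. The first table has the call log records, the basic information about the calls made, such as the phone making the call, the number called, the time when the call started and the duration of the call: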

CREATE TABLE clr  -- call log record
(
    phone_id      VARCHAR(24) NOT NULL,   -- billing plan
    called_number VARCHAR(24) NOT NULL,   -- needed to validate call
    start_time    TIMESTAMP   NOT NULL,   -- date and time when call started
    duration      INTEGER     NOT NULL    -- duration of call in seconds
                  CHECK(duration > 0),
    PRIMARY KEY(phone_id, start_time)
    -- other complicated range-based constraints omitted!
    -- foreign keys omitted
    -- there would probably be an auto-generated number here too.
);
INSERT INTO clr(phone_id, called_number, start_time, duration)
    VALUES('650-656-3180', '650-794-3714', '2009-02-26 15:17:19', 186234);

For convenience (mainly to save writing the addition multiple times), I want a copy of the clr table with the actual end times:

SELECT  phone_id, called_number, start_time AS call_start, duration,
        start_time + duration UNITS SECOND AS call_end
    FROM clr
    INTO TEMP clr_end;

The tariff data is stored in a simple table:

CREATE TABLE tariff
(
    tariff_code   CHAR(1)      NOT NULL   -- code for the tariff
                  CHECK(tariff_code IN ('P','N','O'))
                  PRIMARY KEY,
    rate_start    TIME         NOT NULL,  -- time when rate starts
    rate_end      TIME         NOT NULL,  -- time when rate ends
    rate_charged  DECIMAL(7,4) NOT NULL   -- rate charged (cents per second)
);
INSERT INTO tariff(tariff_code, rate_start, rate_end, rate_charged)
    VALUES('N', '00:00:00', '08:00:00', 0.9876);
INSERT INTO tariff(tariff_code, rate_start, rate_end, rate_charged)
    VALUES('P', '08:00:00', '19:00:00', 2.3456);
INSERT INTO tariff(tariff_code, rate_start, rate_end, rate_charged)
    VALUES('O', '19:00:00', '23:59:59', 1.2345);

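I debated whether the tariff table should use TIME or INTERVAL values; in this context, the times are very similar to intervals relative to midnight, but intervals can be added to timestamps where times cannot. I stuck with TIME, but it made things messy.

The tricky part of this query is generating the relevant date and time ranges for each tariff without loops. In fact, I ended up using a loop embedded in a stored procedure to generate a list of integers. (I also used a technique that is specific to IBM Informix Dynamic Server, IDS, which uses the table ID numbers from the system catalog as a source of contiguous integers in the range 1..N; it works for numbers from 1 to 60 in version 11.50.)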

CREATE PROCEDURE integers(lo INTEGER DEFAULT 0, hi INTEGER DEFAULT 0)
    RETURNING INT AS number;
    DEFINE i INTEGER;
    FOR i = lo TO hi STEP 1
        RETURN i WITH RESUME;
    END FOR;
END PROCEDURE;

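In the simple case (which will also be the most common case), the call falls in a single tariff period; the multi-period calls add the excitement.

Let's assume that we can create a table expression that matches this schema and covers all the timestamp values we might need: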

CREATE TEMP TABLE tariff_date_time
(
     tariff_code   CHAR(1)      NOT NULL,
     rate_start    TIMESTAMP    NOT NULL,
     rate_end      TIMESTAMP    NOT NULL,
     rate_charged  DECIMAL(7,4) NOT NULL
);

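Fortunately, you haven't mentioned weekend rates, so you charge customers the same rates at the weekend as during the week. However, the answer should adapt to such situations if at all possible. If you were to get as complex as giving weekend rates on public holidays, except that at Christmas or New Year you charge the peak rate instead of the weekend rate because of the high demand, then you would be best off storing the rates in a permanent tariff_date_time table.

The first step in populating tariff_date_time is to generate a list of dates which are relevant to the calls: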

SELECT DISTINCT EXTEND(DATE(call_start) + number, YEAR TO SECOND) AS call_date
    FROM clr_end,
         TABLE(integers(0, (SELECT DATE(call_end) - DATE(call_start) FROM clr_end)))
         AS date_list(number)
    INTO TEMP call_dates;

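The difference between two date values is an integer number of days (in IDS). The procedure integers generates values from 0 to the number of days covered by the call, and the result is stored in a temporary table. For the more general case of multiple records, it might be better to calculate the minimum and maximum dates and generate the dates in between, rather than generate dates multiple times and then eliminate the duplicates with the DISTINCT clause.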
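As a rough illustration of the "minimum and maximum date" idea, here is a minimal Python sketch (Python rather than Informix SQL, purely so it can be run on its own). The function name covered_dates and the (start, duration) tuple shape are assumptions made for this sketch; they are not part of the clr schema.

```python
from datetime import date, datetime, timedelta

def covered_dates(calls):
    """Return every calendar date from the earliest call start to the latest call end.

    `calls` is an iterable of (start, duration_seconds) pairs; that shape is
    an assumption for this sketch, not the clr table itself.
    """
    spans = [(start, start + timedelta(seconds=duration)) for start, duration in calls]
    first = min(start for start, _ in spans).date()
    last = max(end for _, end in spans).date()
    # One date per day, generated once, so no DISTINCT step is needed.
    return [first + timedelta(days=n) for n in range((last - first).days + 1)]

# The sample call from the clr table above.
dates = covered_dates([(datetime(2009, 2, 26, 15, 17, 19), 186234)])
print(dates)
```

For the sample call this yields the three dates 2009-02-26 through 2009-02-28.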

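Now use a cartesian product of the tariff table and the call_dates table to generate the rate information for each day. This is where the tariff times would be neater as intervals.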

SELECT  r.tariff_code,
        d.call_date + (r.rate_start - TIME '00:00:00') AS rate_start,
        d.call_date + (r.rate_end   - TIME '00:00:00') AS rate_end,
        r.rate_charged
    FROM call_dates AS d, tariff AS r
    INTO TEMP tariff_date_time;
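The same cartesian product can be sketched outside the database. This is only an illustration in Python, assuming the three tariff rows shown earlier; the names TARIFFS and tariff_date_time are chosen here to mirror the tables and are otherwise hypothetical.

```python
from datetime import date, datetime, time
from itertools import product

# Tariff rows copied from the tariff table above; the off-peak window
# ends at 23:59:59 there, and that value is kept unchanged here.
TARIFFS = [
    ("N", time(0, 0, 0), time(8, 0, 0), 0.9876),
    ("P", time(8, 0, 0), time(19, 0, 0), 2.3456),
    ("O", time(19, 0, 0), time(23, 59, 59), 1.2345),
]

def tariff_date_time(call_dates):
    """Pair every date with every tariff and turn the times into timestamps."""
    return [
        (code, datetime.combine(day, start), datetime.combine(day, end), rate)
        for day, (code, start, end, rate) in product(call_dates, TARIFFS)
    ]

rows = tariff_date_time([date(2009, 2, 26), date(2009, 2, 27)])
for row in rows:
    print(row)
```

Two dates and three tariffs give six rows, one per tariff window per day.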

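Now we need to match the call log record with the tariffs that apply. The condition is a standard way of dealing with overlaps - two time periods overlap if the end of the first is later than the start of the second and the start of the first is before the end of the second: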

SELECT tdt.*, clr_end.*
FROM tariff_date_time tdt, clr_end
WHERE tdt.rate_end > clr_end.call_start
  AND tdt.rate_start < clr_end.call_end
INTO TEMP call_time_tariff;

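Then we need to establish the start and end times for the rate. The start time for the rate is the later of the start time of the tariff and the start time of the call. The end time for the rate is the earlier of the end time of the tariff and the end time of the call: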

SELECT  phone_id, called_number, tariff_code, rate_charged,
        call_start, duration,
        CASE WHEN rate_start < call_start THEN call_start
        ELSE rate_start END AS rate_start,
        CASE WHEN rate_end >= call_end THEN call_end
        ELSE rate_end END AS rate_end
    FROM call_time_tariff
    INTO TEMP call_time_tariff_times;
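The overlap test and the clamping of start and end times can be combined into one small function. The following Python sketch is an illustration only; overlap_seconds is a hypothetical helper, and the example uses the 18:50:00 call of 1000 seconds from the question.

```python
from datetime import datetime

def overlap_seconds(call_start, call_end, rate_start, rate_end):
    """Seconds of the call that fall inside one tariff window (0 if none)."""
    # Same test as the WHERE clause: the periods overlap only if each one
    # starts before the other one ends.
    if not (rate_end > call_start and rate_start < call_end):
        return 0
    begin = max(rate_start, call_start)   # later of the two start times
    finish = min(rate_end, call_end)      # earlier of the two end times
    return int((finish - begin).total_seconds())

call_start = datetime(2009, 2, 26, 18, 50, 0)
call_end = datetime(2009, 2, 26, 19, 6, 40)   # 1000 seconds later

night = overlap_seconds(call_start, call_end,
                        datetime(2009, 2, 26, 0, 0, 0), datetime(2009, 2, 26, 8, 0, 0))
peak = overlap_seconds(call_start, call_end,
                       datetime(2009, 2, 26, 8, 0, 0), datetime(2009, 2, 26, 19, 0, 0))
offpeak = overlap_seconds(call_start, call_end,
                          datetime(2009, 2, 26, 19, 0, 0), datetime(2009, 2, 26, 23, 59, 59))
print(night, peak, offpeak)  # 0 600 400
```

This reproduces the split given in the question: 600 seconds at the peak rate and 400 seconds off-peak.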

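Finally, we need to sum the time spent at each tariff rate and multiply that time (in seconds) by the rate charged. Since the result of SUM(rate_end - rate_start) is an INTERVAL, not a number, I had to invoke a conversion function to convert the INTERVAL into a DECIMAL number of seconds, and that (non-standard) function is iv_seconds: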

SELECT phone_id, called_number, tariff_code, rate_charged,
       call_start, duration,
       SUM(rate_end - rate_start) AS tariff_time,
       rate_charged * iv_seconds(SUM(rate_end - rate_start)) AS tariff_cost
   FROM call_time_tariff_times
   GROUP BY phone_id, called_number, tariff_code, rate_charged,
            call_start, duration;

For the sample data, this yielded the following (I'm not printing the phone number and called number, for conciseness):

N   0.9876   2009-02-26 15:17:19   186234   0 16:00:00   56885.760000000
O   1.2345   2009-02-26 15:17:19   186234   0 10:01:11   44529.649500000
P   2.3456   2009-02-26 15:17:19   186234   1 01:42:41  217111.081600000

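That's a very expensive call, but the telco will be happy with that. You can poke at any of the intermediate results to see how the answer is derived. You can use fewer temporary tables at the cost of some clarity.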
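As a sanity check on the three result rows, the arithmetic can be redone by hand. The Python sketch below simply recomputes seconds times rate from the printed tariff_time values; hms_to_seconds is a hypothetical helper. It also suggests that the billed seconds come to 186232 rather than the 186234 in the call record, which looks like a side effect of the off-peak window ending at 23:59:59: the last second before each of the two midnights the call crosses falls in no tariff.

```python
from decimal import Decimal

def hms_to_seconds(days, hours, minutes, seconds):
    """Convert a 'D HH:MM:SS' interval, as printed above, to whole seconds."""
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds

# tariff_time taken from the result rows above, rate in cents per second
rows = {
    "N": (hms_to_seconds(0, 16, 0, 0), Decimal("0.9876")),
    "O": (hms_to_seconds(0, 10, 1, 11), Decimal("1.2345")),
    "P": (hms_to_seconds(1, 1, 42, 41), Decimal("2.3456")),
}

costs = {code: secs * rate for code, (secs, rate) in rows.items()}
billed_seconds = sum(secs for secs, _ in rows.values())

print(costs)
print(billed_seconds, 186234 - billed_seconds)  # billed seconds, seconds not billed
```

If every second must be billed, one option would be to treat the tariff windows as half-open ranges with the off-peak window running up to midnight of the next day.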

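For a single call, this will not be much different from running the code in VB on the client. For a lot of calls, this has the potential to be more efficient. I'm not convinced that recursion is necessary in VB - straight iteration should be sufficient.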
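To illustrate what straight iteration could look like, here is a minimal sketch in Python rather than VB. It assumes the tariff boundaries from the question, treated as half-open windows (night 00:00 to 08:00, peak 08:00 to 19:00, off-peak 19:00 to midnight); the names BOUNDARIES and split_call are invented for this example.

```python
from datetime import datetime, time, timedelta

# Tariff boundaries from the question, in ascending order. Each tariff
# applies from its boundary up to (but not including) the next one.
BOUNDARIES = [(time(0), "night"), (time(8), "peak"), (time(19), "offpeak")]

def split_call(start, duration_seconds):
    """Walk through the call one tariff window at a time, without recursion."""
    totals = {"night": 0, "peak": 0, "offpeak": 0}
    current = start
    end = start + timedelta(seconds=duration_seconds)
    while current < end:
        # The tariff in force is the last boundary at or before the current time.
        index = max(i for i, (t, _) in enumerate(BOUNDARIES) if t <= current.time())
        name = BOUNDARIES[index][1]
        if index + 1 < len(BOUNDARIES):
            next_change = datetime.combine(current.date(), BOUNDARIES[index + 1][0])
        else:
            # After the last boundary of the day, the next change is midnight.
            next_change = datetime.combine(current.date() + timedelta(days=1), time(0))
        stop = min(next_change, end)
        totals[name] += int((stop - current).total_seconds())
        current = stop
    return totals

print(split_call(datetime(2009, 2, 26, 18, 50, 0), 1000))
# {'night': 0, 'peak': 600, 'offpeak': 400}
```

Each pass of the loop consumes the call up to the next tariff change, so a call of any length finishes in a number of steps proportional to the number of windows it crosses, and the three totals always add up to the call duration.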

score 1 · Accepted Answer

-- Table: kar_vasile(id, vid, datein, timein, timeout, bikari, tozihat)
-- The bikari field is unemployment (idle) time, subtracted from the minutes worked;
-- you can delete it wherever it appears.
SELECT
    id, vid, datein, timein, timeout, bikari,
    hourwork =
        CASE WHEN timein <= timeout
             -- whole hours
             THEN SUM(ABS(DATEDIFF(mi, timein, timeout)) - bikari) / 60
             -- whole hours when the start time is later than the end time (crosses midnight)
             ELSE SUM(ABS(DATEDIFF(mi, timein, '23:59:00:00') + DATEDIFF(mi, '00:00:00', timeout) + 1) - bikari) / 60
        END,
    minwork =
        CASE WHEN timein <= timeout
             -- remaining minutes
             THEN SUM(ABS(DATEDIFF(mi, timein, timeout)) - bikari) % 60
             -- remaining minutes when the start time is later than the end time
             ELSE SUM(ABS(DATEDIFF(mi, timein, '23:59:00:00') + DATEDIFF(mi, '00:00:00', timeout) + 1) - bikari) % 60
        END,
    tozihat
FROM kar_vasile
GROUP BY id, vid, datein, timein, timeout, tozihat, bikari

score 0 · Accepted Answer

Provided that your calls last less than 100 days:

WITH generate_range(item) AS
(
    SELECT  0
    UNION ALL
    SELECT  item + 1
    FROM    generate_range
    WHERE   item < 100
)
SELECT tday, id, span
FROM   (
       SELECT   tday, id,
                DATEDIFF(minute,
                    CASE WHEN tbegin < clbegin THEN clbegin ELSE tbegin END,
                    CASE WHEN tend < clend THEN tend ELSE clend END
                ) AS span
        FROM    (
                SELECT  DATEADD(day, item, DATEDIFF(day, 0, clbegin)) AS tday,
                        ti.id,
                        DATEADD(minute, rangestart, DATEADD(day, item, DATEDIFF(day, 0, clbegin))) AS tbegin,
                        DATEADD(minute, rangeend, DATEADD(day, item, DATEDIFF(day, 0, clbegin))) AS tend
                FROM    calls, generate_range, tariff ti
                WHERE   DATEADD(day, 1, DATEDIFF(day, 0, clend)) > DATEADD(day, item, DATEDIFF(day, 0, clbegin))
                ) t1
        ) t2
WHERE   span > 0

I'm assuming you keep your tariff ranges in minutes from midnight and count lengths in minutes as well.

score 0 · Accepted Answer

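A big problem with performing this kind of calculation at the database level is that it takes resources away from your database while it runs, both in terms of CPU and in the availability of rows and tables because of locking. If you were calculating 1,000,000 tariffs as part of a batch operation, that might run on the database for a long time, during which you wouldn't be able to use the database for anything else.

If you have the resources, retrieve all the data you need in one transaction and do all the logic calculations outside the database, in a language of your choice. Then insert all the results. Databases are for storing and retrieving data, and any business logic they perform should always be kept to an absolute minimum. Whilst brilliant at some things, SQL isn't the best language for date or string manipulation work.

I suspect you're already on the right track with your VBA work, and without knowing more, it certainly feels like a recursive, or at least an iterative, problem to me. Done correctly, recursion can be a powerful and elegant solution to a problem. Tying up the resources of your database very rarely is.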

score 0 · Accepted Answer

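Effectively in T-SQL? I suspect not, with the schema as described at present.

It might be possible, however, if your rate table stores the three tariffs for each date. There is at least one reason why you might do this apart from the problem at hand: at some point the rates for one period or another are likely to change, and you may need to have the historic rates available.

So say we have these tables: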

CREATE TABLE rates (
    from_date_time DATETIME
,   to_date_time DATETIME
,   rate MONEY
)

CREATE TABLE calls (
    id INT
,   started DATETIME
,   ended DATETIME
)

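I reckon there are three cases to consider (maybe more, I'm making this up as I go):

the call occurs entirely within one rate period
the call starts in one rate period (a) and ends in the next (b)
the call spans at least one complete rate period

Assuming that the rate is per second, I think you might produce something like the following (completely untested) query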

-- Rate periods are treated as half-open ranges [from_date_time, to_date_time).
-- UNION ALL keeps every row, including rows that happen to share the same id and cost.
SELECT id, DATEDIFF(ss, started, ended) * rate AS cost /* case 1 */
FROM rates JOIN calls ON started >= from_date_time AND ended <= to_date_time
UNION ALL
SELECT id, DATEDIFF(ss, started, to_date_time) * rate /* case 2a and the start of case 3 */
FROM rates JOIN calls ON started >= from_date_time AND started < to_date_time AND ended > to_date_time
UNION ALL
SELECT id, DATEDIFF(ss, from_date_time, ended) * rate /* case 2b and the last part of case 3 */
FROM rates JOIN calls ON started < from_date_time AND ended > from_date_time AND ended <= to_date_time
UNION ALL
SELECT id, DATEDIFF(ss, from_date_time, to_date_time) * rate /* case 3 for entire rate periods, should pick up all complete periods */
FROM rates JOIN calls ON started < from_date_time AND ended > to_date_time

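You could apply a SUM..GROUP BY over that in SQL or handle it in your code. Alternatively, with carefully constructed logic, you could probably merge the UNIONed parts into a single WHERE clause with lots of ANDs and ORs. I thought the UNION showed the intent rather more clearly.

HTH & HIW (Hope It Works...)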

score 0 · Accepted Answer

Here's a thread about your problem that we had over at sqlteam.com. Have a look, because it includes some pretty slick solutions.

score 0 · Accepted Answer

Following on from Mike Woodhouse's answer, this may work for you:

SELECT id,
       SUM(DATEDIFF(ss,
               -- clamp the call to the rate period before measuring it
               CASE WHEN started < from_date_time
                    THEN from_date_time
                    ELSE started END,
               CASE WHEN ended > to_date_time
                    THEN to_date_time
                    ELSE ended END
           ) * rate) AS cost
FROM rates
JOIN calls ON
     started < to_date_time      -- the call begins before the period ends
 AND ended > from_date_time      -- and ends after the period begins
GROUP BY id

score 0 · Accepted Answer

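The actual schema of the relevant tables in your database would have been very helpful. I'll take my best guess. I've assumed that the Rates table has start_time and end_time as the number of minutes past midnight.

Using a calendar table (a very useful table to have in most databases):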

SELECT
     C.id,
     R.rate,
     SUM(DATEDIFF(ss,
          CASE
               WHEN C.start_time < R.rate_start_time THEN R.rate_start_time
               ELSE C.start_time
          END,
          CASE
               WHEN C.end_time > R.rate_end_time THEN R.rate_end_time
               ELSE C.end_time
          END))   -- seconds of the call charged at this rate
FROM
     Calls C
INNER JOIN
     (
     -- one row per calendar date and rate period
     SELECT
          DATEADD(mi, Rates.start_time, CAL.calendar_date) AS rate_start_time,
          DATEADD(mi, Rates.end_time, CAL.calendar_date) AS rate_end_time,
          Rates.rate
     FROM
          Calendar CAL
     CROSS JOIN Rates
     ) AS R ON
          -- a derived table cannot reference C, so the overlap test here does the date filtering
          R.rate_start_time < C.end_time AND
          R.rate_end_time > C.start_time
GROUP BY
     C.id,
     R.rate

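I just came up with this as I was typing, so it's untested and you will very likely need to tweak it, but hopefully you can see the general idea.

I also just realized that you use a start_time and a duration for your calls. Assuming that the duration is in seconds, you can replace C.end_time wherever you see it with DATEADD(ss, C.duration, C.start_time).

This should perform pretty quickly in any decent RDBMS, assuming proper indexes, etc.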
