sql-server - Reset Row Number on value change, but with repeat values in partition

Question

I'm having trouble with something very similar to this question T-sql Reset Row number on Field Change

The solution to this question is perfect, works fine. Except when I try with multiple other 'custno', it breaks down.

What I mean by that:

custno      moddate                     who
--------------------------------------------------
581827      2012-11-08 08:38:00.000     EMSZC14
581827      2012-11-08 08:41:10.000     EMSZC14
581827      2012-11-08 08:53:46.000     EMSZC14
581827      2012-11-08 08:57:04.000     EMSZC14
581827      2012-11-08 08:58:35.000     EMSZC14
581827      2012-11-08 08:59:13.000     EMSZC14
581827      2012-11-08 09:00:06.000     EMSZC14
581827      2012-11-08 09:04:39.000     EMSZC49 Reset row number to 1
581827      2012-11-08 09:05:04.000     EMSZC49
581827      2012-11-08 09:06:32.000     EMSZC49
581827      2012-11-08 09:12:03.000     EMSZC49
581827      2012-11-08 09:12:38.000     EMSZC49
581827      2012-11-08 09:14:18.000     EMSZC49
581827      2012-11-08 09:17:35.000     EMSZC14 Reset row number to 1
-- my new rows for example of problem
581829      2012-11-08 09:12:03.000     EMSZC14 1
581829      2012-11-08 09:12:38.000     EMSZC49 1
581829      2012-11-08 09:14:18.000     EMSZC49
581829      2012-11-08 09:17:35.000     EMSZC14 Reset row number to 1

The introduction of a new custno breaks this solution, which works perfectly for the one custno.

with C1 as
(
    select 
        custno, moddate, who,
        lag(who) over(order by moddate) as lag_who
    from 
        chr
),
C2 as
(
    select 
        custno, moddate, who,
        sum(case when who = lag_who then 0 else 1 end) 
            over(order by moddate rows unbounded preceding) as change 
    from 
        C1
)
select 
    row_number() over(partition by change order by moddate) as RowID,
    custno, moddate, who
from 
    C2

I'm sure it's only a little tweak to handle multiple custno's, but this is already way beyond my capabilities and I managed to make it work for my data but that was purely by replacing column and table names. Unfortunately don't have a detailed enough understanding to resolve the issue I have.

My data looks like

custno   start_date    value

effectively exactly the same. I want a Row/rank of 1 for every time the 'value' or 'who' changes, regardless if that value/who has been seen before. This is all relative to a custno. And I do see instances where a value/who can return back to the same value as well. Again, solution above handled that 'repetition' just fine... but for the one custno

I'm thinking I just need to somehow add some sort of grouping by custno in somewhere? Just not sure where or how

Thanks!

score 4 · Accepted Answer

这是一个间隙和孤岛问题，我们可以在这里使用行数差异法：

WITH cte AS (
    SELECT *,
        ROW_NUMBER() OVER (PARTITION BY custno ORDER BY moddate) rn1,
        ROW_NUMBER() OVER (PARTITION BY custno, who ORDER BY moddate) rn2
    FROM chr
)

SELECT custno, moddate, who,
    ROW_NUMBER() OVER (PARTITION BY custno, rn1 - rn2 ORDER BY moddate) rn
FROM cte
ORDER BY
    custno,
    moddate;

演示

对于此处使用的行号方法差异的解释rn1，根据您上面显示的数据，这只是每个客户从 1 开始按时间排序的序列。该rn2序列被另外划分为who. 在这种情况下，对于每个客户，两者之间的差异将始终具有相同的价值。正是有了这个区别，我们然后在整个表上取一个行号来生成你真正想要看到的序列。rn1rn2

score 0 · Accepted Answer

最后一个 ROW_NUMBER 子句应该是：

ROW_NUMBER() OVER (PARTITION BY custno, who, rn1 - rn2 ORDER BY custno, moddate) rn

尝试将第二条和第四条记录更改为 EMSZC49，您就会明白我的意思。只要前n 个记录与下n 个记录匹配，您就会遇到相同的问题。

sql-server - Reset Row Number on value change, but with repeat values in partition

2 回答 2

演示

Related

Reference