I'm trying to partition a table based on changes in some of its data (I'm not sure how best to explain it).

An example explains it best.

I have a table like this (simplified for the example):

| ID | TEST_DATA |                 LOGDATETIME |
------------------------------------------------
|  1 |         a | June, 19 2013 00:13:23+0000 |
|  2 |         a | June, 19 2013 00:13:24+0000 |
|  3 |         a | June, 19 2013 00:13:25+0000 |
|  4 |         b | June, 19 2013 00:13:26+0000 |
|  5 |         b | June, 19 2013 00:13:27+0000 |
|  6 |         b | June, 19 2013 00:13:28+0000 |
|  7 |         a | June, 19 2013 00:13:29+0000 |
|  8 |         a | June, 19 2013 00:13:30+0000 |
|  9 |         a | June, 19 2013 00:13:31+0000 |

I want to partition (group) by TEST_DATA, like this:

| ID | TEST_DATA |                 LOGDATETIME | grouping |
-----------------------------------------------------------
|  1 |         a | June, 19 2013 00:13:23+0000 |        1 |
|  2 |         a | June, 19 2013 00:13:24+0000 |        1 |
|  3 |         a | June, 19 2013 00:13:25+0000 |        1 |
|  4 |         b | June, 19 2013 00:13:26+0000 |        2 |
|  5 |         b | June, 19 2013 00:13:27+0000 |        2 |
|  6 |         b | June, 19 2013 00:13:28+0000 |        2 |
|  7 |         a | June, 19 2013 00:13:29+0000 |        3 |
|  8 |         a | June, 19 2013 00:13:30+0000 |        3 |
|  9 |         a | June, 19 2013 00:13:31+0000 |        3 |

I want to preserve the LOGDATETIME order, but start a new group every time TEST_DATA changes.

SQLFiddle: http://sqlfiddle.com/#!12/d9c17/1

1 Answer

A bit of a trick:

SELECT id, test_data, logdatetime
      ,SUM(CASE WHEN test_data = prev_data THEN 0 ELSE 1 END)
          OVER (ORDER BY id) + 1 AS grouping

  FROM ( SELECT id, test_data, logdatetime
               ,COALESCE( LAG(test_data) OVER(ORDER BY id)
                         ,test_data
                        ) AS prev_data
           FROM test t
       ) x

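The analytic function LAG adds a pseudo-column to each row holding the test_data value of the previous row; for the first row, which has no previous row, COALESCE falls back to the row's own value. The analytic function SUM then increments an accumulator each time test_data differs from the previous row's value.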
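
To make the accumulator idea concrete, here is a rough Python sketch that mimics the same three steps (previous value, change flag, running total) outside the database. It is only an illustration: the `rows` list is a hand-copied stand-in for the question's table, and the variable names are made up for this sketch, not part of the query.

```python
from itertools import accumulate

# Hypothetical sample rows (id, test_data) mirroring the question's table,
# assumed to be already sorted by id.
rows = [(1, "a"), (2, "a"), (3, "a"),
        (4, "b"), (5, "b"), (6, "b"),
        (7, "a"), (8, "a"), (9, "a")]

values = [test_data for _, test_data in rows]

# LAG(test_data) wrapped in COALESCE: the first row has no predecessor,
# so it falls back to its own value.
prev_values = [values[0]] + values[:-1]

# CASE WHEN test_data = prev_data THEN 0 ELSE 1 END
change_flags = [0 if cur == prev else 1
                for cur, prev in zip(values, prev_values)]

# SUM(...) OVER (ORDER BY id) + 1: running total of the flags, shifted to start at 1.
groupings = [total + 1 for total in accumulate(change_flags)]

for (row_id, test_data), grouping in zip(rows, groupings):
    print(row_id, test_data, grouping)
```

The flag is 1 only on the rows where the value switches, so the running total stays flat inside a run and steps up by one at each boundary.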

Step by step:

postgres=# SELECT id, test_data, logdatetime
postgres-#       ,COALESCE( LAG(test_data) OVER(PARTITION BY 'x' ORDER BY id)
postgres(#       ,test_data) AS prev_data
postgres-#   FROM test t;
 id | test_data |       logdatetime       | prev_data
----+-----------+-------------------------+-----------
  1 | a         | 2013-06-19 00:13:23.184 | a
  2 | a         | 2013-06-19 00:13:24.312 | a
  3 | a         | 2013-06-19 00:13:25.184 | a
  4 | b         | 2013-06-19 00:13:26.184 | a
  5 | b         | 2013-06-19 00:13:27.184 | b
  6 | b         | 2013-06-19 00:13:28.184 | b
  7 | a         | 2013-06-19 00:13:29.184 | b
  8 | a         | 2013-06-19 00:13:30.184 | a
  9 | a         | 2013-06-19 00:13:31.184 | a
(9 rows)

postgres=# SELECT id, test_data, logdatetime
postgres-#       ,CASE WHEN test_data = prev_data THEN 0 ELSE 1 END AS counter
postgres-#   FROM  ( SELECT id, test_data, logdatetime
postgres(#                 ,COALESCE( LAG(test_data) OVER(PARTITION BY 'x' ORDER BY id)
postgres(#                 ,test_data) AS prev_data
postgres(#             FROM test t
postgres(#         ) x;
 id | test_data |       logdatetime       | counter
----+-----------+-------------------------+---------
  1 | a         | 2013-06-19 00:13:23.184 |       0
  2 | a         | 2013-06-19 00:13:24.312 |       0
  3 | a         | 2013-06-19 00:13:25.184 |       0
  4 | b         | 2013-06-19 00:13:26.184 |       1
  5 | b         | 2013-06-19 00:13:27.184 |       0
  6 | b         | 2013-06-19 00:13:28.184 |       0
  7 | a         | 2013-06-19 00:13:29.184 |       1
  8 | a         | 2013-06-19 00:13:30.184 |       0
  9 | a         | 2013-06-19 00:13:31.184 |       0
(9 rows)

postgres=# SELECT id, test_data, logdatetime
postgres-#       ,SUM( CASE WHEN test_data = prev_data THEN 0 ELSE 1 END )
postgres-#            OVER (PARTITION BY 'x' ORDER BY id) + 1 AS grouping
postgres-#   FROM ( SELECT id, test_data, logdatetime
postgres(#                ,COALESCE( LAG(test_data) OVER(PARTITION BY 'x' ORDER BY id)
postgres(#                ,test_data) AS prev_data
postgres(#            FROM test t
postgres(#        ) x;
 id | test_data |       logdatetime       | grouping
----+-----------+-------------------------+----------
  1 | a         | 2013-06-19 00:13:23.184 |        1
  2 | a         | 2013-06-19 00:13:24.312 |        1
  3 | a         | 2013-06-19 00:13:25.184 |        1
  4 | b         | 2013-06-19 00:13:26.184 |        2
  5 | b         | 2013-06-19 00:13:27.184 |        2
  6 | b         | 2013-06-19 00:13:28.184 |        2
  7 | a         | 2013-06-19 00:13:29.184 |        3
  8 | a         | 2013-06-19 00:13:30.184 |        3
  9 | a         | 2013-06-19 00:13:31.184 |        3
(9 rows)      
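
The queries above order by id, while the question asks to keep the log-time order. When id order and log-time order can differ, the same pattern should work with ORDER BY logdatetime in both window clauses. The sketch below tries that variant through Python's sqlite3 module rather than PostgreSQL, purely so it is self-contained; it assumes the bundled SQLite is 3.25 or newer (window function support), and the table contents are simplified stand-ins for the question's data.

```python
import sqlite3

# In-memory stand-in for the question's table; the timestamps are simplified
# (no fractional seconds) and stored as text, which sorts chronologically here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test (id INTEGER PRIMARY KEY, test_data TEXT, logdatetime TEXT)"
)
conn.executemany(
    "INSERT INTO test VALUES (?, ?, ?)",
    [(i + 1, value, f"2013-06-19 00:13:{23 + i:02d}")
     for i, value in enumerate("aaabbbaaa")],
)

# Same LAG + running SUM pattern as the answer, ordered by logdatetime instead of id.
query = """
SELECT id, test_data, logdatetime,
       SUM(CASE WHEN test_data = prev_data THEN 0 ELSE 1 END)
           OVER (ORDER BY logdatetime) + 1 AS grouping
  FROM (SELECT id, test_data, logdatetime,
               COALESCE(LAG(test_data) OVER (ORDER BY logdatetime),
                        test_data) AS prev_data
          FROM test) x
 ORDER BY logdatetime
"""
result = conn.execute(query).fetchall()

for row in result:
    print(row)
```

If two rows can share the same logdatetime, adding id as a tie-breaker (ORDER BY logdatetime, id) keeps the window ordering deterministic.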
answered 2013-06-19T02:45:31.373