sql - PostgreSQL crosstab function

Question

I have some problems with the crosstab function.

My table "t" is

date;name;hour;cause;c_p
"2013-06-12";167;14;0;2
"2013-06-12";167;16;0;3
"2013-06-12";167;16;0;4
"2013-06-12";167;19;1;1
"2013-06-12";167;19;0;4

I would like to get this "pivot table", t_pivot

day;name;hour;cause_0;cause_1
"2013-06-12";167;14;2;0 -----sum(c_p)
"2013-06-12";167;16;7;0
"2013-06-12";167;19;4;1

The SQL code is

SELECT  * from crosstab (
    'SELECT  day,name,hour,cause, SUM(c_p) AS c_p
        FROM t
        GROUP BY 1,2,3,4
        ORDER BY 3 ',

     'SELECT DISTINCT cause 
         FROM i
         ORDER BY 1')

AS t_pivot (day date, name integer,hour integer, cause_0 integer,cause_1 integer)

The query returns a single row, and its contents depend on the ORDER BY:

ORDER BY 3
"2013-06-12";167;14;4;1

ORDER BY 1, ORDER BY 2
"2013-06-12";167;14;7;1

Where is the error? Thanks, f.

score 1 · Accepted Answer

I haven't used the crosstab function and can't test it now (there's no tablefunc extension on sqlfiddle), but in general I prefer plain SQL when I need a pivot like this:

select
    date,
    hour,
    sum(case when cause = 0 then c_p else 0 end) cause_0,
    sum(case when cause = 1 then c_p else 0 end) cause_1
from t
group by date, hour
order by hour

sql fiddle demo

I think it's easier to read and maintain in the future (but this is a subjective opinion).
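
For comparison, here is a minimal sketch of the same conditional aggregation written with the aggregate FILTER clause, which needs PostgreSQL 9.4 or later. The table definition is an assumption reconstructed from the sample rows in the question, and name is added to the grouping so that the output has the columns the question asks for.

```sql
-- Assumed sample table, mirroring the rows shown in the question.
CREATE TEMP TABLE t (date date, name integer, hour integer, cause integer, c_p integer);
INSERT INTO t VALUES
    ('2013-06-12', 167, 14, 0, 2),
    ('2013-06-12', 167, 16, 0, 3),
    ('2013-06-12', 167, 16, 0, 4),
    ('2013-06-12', 167, 19, 1, 1),
    ('2013-06-12', 167, 19, 0, 4);

-- One filtered sum per cause; coalesce turns the NULL of an empty group into 0.
SELECT
    date,
    name,
    hour,
    coalesce(sum(c_p) FILTER (WHERE cause = 0), 0) AS cause_0,
    coalesce(sum(c_p) FILTER (WHERE cause = 1), 0) AS cause_1
FROM t
GROUP BY date, name, hour
ORDER BY date, name, hour;
```

On the sample rows this should return the three rows of the requested t_pivot, with 0 rather than NULL where a cause has no data.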

Update: this one works (hour is used as the row_name; date and name are extra columns):

SELECT  * from crosstab (
    'select hour, date, name, cause, sum(c_p) as c_p
     from t
     group by 1, 2, 3, 4
     order by 1',
    'select distinct cause from t order by 1')
AS t_pivot (hour integer, date timestamp, name integer, cause_0 integer,cause_1 integer)

From the documentation:

source_sql is a SQL statement that produces the source set of data. This statement must return one row_name column, one category column, and one value column. It may also have one or more "extra" columns. The row_name column must be first. The category and value columns must be the last two columns, in that order. Any columns between row_name and category are treated as "extra". The "extra" columns are expected to be the same for all rows with the same row_name value.
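
Applied to the source query from the question, the rule quoted above reads as follows. This is only an annotated sketch: the table definition is assumed from the sample data, with day as the column name because that is what the question's query uses.

```sql
-- Assumed sample table (column named day, as in the question's query).
CREATE TEMP TABLE t (day date, name integer, hour integer, cause integer, c_p integer);
INSERT INTO t VALUES
    ('2013-06-12', 167, 14, 0, 2),
    ('2013-06-12', 167, 16, 0, 3),
    ('2013-06-12', 167, 16, 0, 4),
    ('2013-06-12', 167, 19, 1, 1),
    ('2013-06-12', 167, 19, 0, 4);

-- The question's source query, annotated with the role each column plays.
SELECT
    day,               -- first column: row_name, one output row per distinct value
    name,              -- extra column
    hour,              -- extra column
    cause,             -- second-to-last column: category
    sum(c_p) AS c_p    -- last column: value
FROM t
GROUP BY 1, 2, 3, 4
ORDER BY 1;
```

Every sample row has the same day, so all of them share a single row_name. That would explain why the question's crosstab call returns only one row, and why hour, being an extra column that differs within the group, shows a value that depends on the row order.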

Also:

In practice the source_sql query should always specify ORDER BY 1 to ensure that values with the same row_name are brought together. However, ordering of the categories within a group is not important. Also, it is essential to be sure that the order of the category_sql query's output matches the specified output column order.
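
The last sentence of that quote can be illustrated with a small sketch. Here the category query is deliberately sorted in descending order, so the column definition list has to name cause_1 before cause_0. The table definition is an assumption based on the question's sample data, and the value columns are declared as bigint because sum over an integer column returns bigint.

```sql
-- Needs the tablefunc extension (creating it may require extra privileges).
CREATE EXTENSION IF NOT EXISTS tablefunc;

-- Assumed sample table, mirroring the rows shown in the question.
CREATE TEMP TABLE t (date date, name integer, hour integer, cause integer, c_p integer);
INSERT INTO t VALUES
    ('2013-06-12', 167, 14, 0, 2),
    ('2013-06-12', 167, 16, 0, 3),
    ('2013-06-12', 167, 16, 0, 4),
    ('2013-06-12', 167, 19, 1, 1),
    ('2013-06-12', 167, 19, 0, 4);

SELECT * FROM crosstab(
    -- source_sql: ORDER BY 1 keeps rows with the same row_name (hour) together
    'select hour, date, name, cause, sum(c_p)
     from t
     group by 1, 2, 3, 4
     order by 1',
    -- category_sql: returns 1 first, then 0
    'select distinct cause from t order by 1 desc'
) AS t_pivot (hour integer, date date, name integer,
              cause_1 bigint, cause_0 bigint);  -- same order as category_sql
```

Listing the columns as cause_0, cause_1 with this category query would not raise an error; it would silently put each sum under the wrong label.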

score 1 · Accepted Answer

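It seems that the crosstab function expects only one column to hold the row identifier, and if I guessed right, you have three: date, name, and hour. But crosstab also allows extra columns, which it does not treat specially; it just adds them to the result.

So, in your case, you have to take those three columns and represent them as a single one. I am not sure this is the best way, but I used the row constructor to do it (so we don't need to care about data types):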

SELECT  * from crosstab (                                           
    'select row(day,hour,name),day, hour, name, cause, sum(c_p) as c_p
     from t
     group by 2, 3, 4, 5
     order by 1',
    'VALUES(0),(1)')
AS t_pivot (row_name text, day date, hour int, name integer, cause_0 integer,cause_1 integer);

This gives a result like the following:

      row_name       |    day     | hour | name | cause_0 | cause_1 
---------------------+------------+------+------+---------+---------
 (2013-06-12,14,167) | 2013-06-12 |   14 |  167 |       2 |      --
 (2013-06-12,16,167) | 2013-06-12 |   16 |  167 |       7 |      --
 (2013-06-12,19,167) | 2013-06-12 |   19 |  167 |       4 |       1
 (2013-06-13,14,167) | 2013-06-13 |   14 |  167 |      10 |      --
(4 rows)

You don't have to worry about the first column; it is of no real use to you, so we can simply drop it:

SELECT day,hour,name,cause_0,cause_1 FROM (SELECT  * from crosstab (
    'select row(day,hour,name),day, hour, name, cause, sum(c_p) as c_p
     from t
     group by 2, 3, 4, 5
     order by 1',
    'VALUES(0),(1)')
AS t_pivot (row_name text, day date, hour int, name integer, cause_0 integer,cause_1 integer)) AS t;
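
The pivot requested in the question shows 0 where a cause has no rows, while crosstab leaves such cells NULL. A possible adjustment, sketched here under an assumed table definition taken from the question's sample data, is to wrap the value columns in coalesce in the outer query. The cast of the row constructor to text and the bigint value columns are additions made for this sketch so that the declared types match what the source query returns.

```sql
-- Needs the tablefunc extension (creating it may require extra privileges).
CREATE EXTENSION IF NOT EXISTS tablefunc;

-- Assumed sample table (column named day, as in the query above).
CREATE TEMP TABLE t (day date, name integer, hour integer, cause integer, c_p integer);
INSERT INTO t VALUES
    ('2013-06-12', 167, 14, 0, 2),
    ('2013-06-12', 167, 16, 0, 3),
    ('2013-06-12', 167, 16, 0, 4),
    ('2013-06-12', 167, 19, 1, 1),
    ('2013-06-12', 167, 19, 0, 4);

-- Drop the synthetic row_name and replace missing categories with 0.
SELECT day, hour, name,
       coalesce(cause_0, 0) AS cause_0,
       coalesce(cause_1, 0) AS cause_1
FROM crosstab(
    'select row(day, hour, name)::text, day, hour, name, cause, sum(c_p)
     from t
     group by 2, 3, 4, 5
     order by 1',
    'VALUES (0), (1)'
) AS t_pivot (row_name text, day date, hour integer, name integer,
              cause_0 bigint, cause_1 bigint);
```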

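One more thing. Notice that I used VALUES for the second argument instead of SELECT DISTINCT. If you are sure those will be the only values available, that is the better approach; if the set is not static, then AS t_pivot ... would have to be dynamic as well.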
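
When the categories are not fixed, the column definition list has to be built before the crosstab statement is sent. One possible building block, sketched here under the assumption that the table looks like the question's sample, is to let PostgreSQL generate the text of that list. The client, or a PL/pgSQL function using EXECUTE, would then splice it into the final statement.

```sql
-- Assumed sample table, mirroring the rows shown in the question.
CREATE TEMP TABLE t (day date, name integer, hour integer, cause integer, c_p integer);
INSERT INTO t VALUES
    ('2013-06-12', 167, 14, 0, 2),
    ('2013-06-12', 167, 16, 0, 3),
    ('2013-06-12', 167, 16, 0, 4),
    ('2013-06-12', 167, 19, 1, 1),
    ('2013-06-12', 167, 19, 0, 4);

-- Build one "cause_N bigint" entry per distinct cause, in category order.
SELECT string_agg(format('cause_%s bigint', cause), ', ' ORDER BY cause) AS column_list
FROM (SELECT DISTINCT cause FROM t) AS categories;
```

On the sample data this yields the text cause_0 bigint, cause_1 bigint, which matches the order a category query sorted by cause would produce.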
