"Standard" SQL
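Similar to what I posted on the previous question, a recursive CTE is elegant and probably the fastest approach in standard SQL, especially with many rows per user.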
WITH RECURSIVE t AS (
   -- number each user's rows, newest (highest id) first
   SELECT row_number() OVER (PARTITION BY usr ORDER BY id DESC) AS rn
        , usr, cola, colb, colc
   FROM   tbl
   )
, x AS (
   -- start from the newest row per user
   SELECT rn, usr, cola, colb, colc
   FROM   t
   WHERE  rn = 1
   UNION ALL
   -- step to the next older row, filling only columns that are still NULL
   SELECT t.rn, t.usr
        , COALESCE(x.cola, t.cola)
        , COALESCE(x.colb, t.colb)
        , COALESCE(x.colc, t.colc)
   FROM   x
   JOIN   t USING (usr)
   WHERE  t.rn = x.rn + 1
   AND    (x.cola IS NULL OR x.colb IS NULL OR x.colc IS NULL)  -- stop once all are filled
   )
SELECT DISTINCT ON (usr)
       usr, cola, colb, colc
FROM   x
ORDER  BY usr, rn DESC;  -- keep the last (most complete) row per user
-> sqlfiddle for PostgreSQL, as requested.
The only non-standard element is DISTINCT ON, which is a PostgreSQL extension of the standard's DISTINCT. For standard SQL, replace the final SELECT with:
SELECT usr
     , max(cola) AS cola
     , max(colb) AS colb
     , max(colc) AS colc
FROM   x
GROUP  BY usr
ORDER  BY usr;
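The fiddle's table definition is not reproduced here, so the following is a minimal setup for trying both variants of the query. The column types and the sample rows are assumptions made up for illustration; only the names tbl, id, usr, cola, colb and colc come from the queries themselves.

```sql
-- Hypothetical test table: types and data are assumptions for illustration only
CREATE TABLE tbl (
   id   int PRIMARY KEY
 , usr  int NOT NULL
 , cola text
 , colb text
 , colc text
);

INSERT INTO tbl (id, usr, cola, colb, colc) VALUES
   (1, 1, 'a1', 'b1', 'c1')
 , (2, 1, NULL, 'b2', NULL)
 , (3, 1, 'a3', NULL, NULL)
 , (4, 2, 'x4', NULL, NULL)
 , (5, 2, NULL, 'y5', NULL);

-- Walking each user's rows from the highest id down, both variants should return:
--  usr | cola | colb | colc
--    1 | a3   | b2   | c1
--    2 | x4   | y5   | NULL
```

The max() variant gives the same result as DISTINCT ON because, within x, each column of a given user holds either NULL or the single value that was found first, and max() ignores NULL. usr 2 never has a value for colc, so that column stays NULL.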
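The request for "standard SQL" is of limited use. The standard exists only on paper. No RDBMS implements 100% of standard SQL, and doing so would be pointless anyway, since the standard contains nonsensical parts here and there. Arguably, PostgreSQL's implementation is the closest to the standard.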
PL/pgSQL function
This solution is PostgreSQL-specific, but it should perform very well.
It builds on the same table as demonstrated in the fiddle above.
CREATE OR REPLACE FUNCTION f_last_nonull_per_user()
  RETURNS SETOF tbl AS
$func$
DECLARE
   _row tbl;  -- table name can be used as row type
   _new tbl;
BEGIN
   FOR _new IN
      SELECT * FROM tbl ORDER BY usr, id DESC
   LOOP
      IF _new.usr = _row.usr THEN
         _row.id := _new.id;  -- copy only id
         IF _row.cola IS NULL AND _new.cola IS NOT NULL THEN
            _row.cola := _new.cola; END IF;  -- only if no value found yet
         IF _row.colb IS NULL AND _new.colb IS NOT NULL THEN
            _row.colb := _new.colb; END IF;
         IF _row.colc IS NULL AND _new.colc IS NOT NULL THEN
            _row.colc := _new.colc; END IF;
      ELSE
         IF _new.usr <> _row.usr THEN  -- doesn't fire on first row
            RETURN NEXT _row;
         END IF;
         _row := _new;  -- remember row for next iteration
      END IF;
   END LOOP;
   RETURN NEXT _row;  -- return row for last usr
END
$func$ LANGUAGE plpgsql;
Call:
SELECT * FROM f_last_nonull_per_user();
Returns the whole row, including the smallest id that we needed to fill all columns.
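If the id is not wanted in the result, the function can be queried like any table and the columns listed explicitly. A small sketch, using only the names already defined above:

```sql
-- Leave out id by naming only the wanted columns
SELECT usr, cola, colb, colc
FROM   f_last_nonull_per_user()
ORDER  BY usr;
```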