I have prepared a simple SQL Fiddle to demonstrate my problem.

In PostgreSQL 10.3 I store user information, two-player games and moves in the following 3 tables:

CREATE TABLE players (
    uid SERIAL PRIMARY KEY,
    name text NOT NULL
);

CREATE TABLE games (
    gid SERIAL PRIMARY KEY,
    player1 integer NOT NULL REFERENCES players ON DELETE CASCADE,
    player2 integer NOT NULL REFERENCES players ON DELETE CASCADE
);

CREATE TABLE moves (
    mid BIGSERIAL PRIMARY KEY,
    uid integer NOT NULL REFERENCES players ON DELETE CASCADE,
    gid integer NOT NULL REFERENCES games ON DELETE CASCADE,
    played timestamptz NOT NULL
);

Suppose 2 players, Alice and Bob, have played 3 games against each other:

INSERT INTO players (name) VALUES ('Alice'), ('Bob');
INSERT INTO games (player1, player2) VALUES (1, 2);
INSERT INTO games (player1, player2) VALUES (1, 2);
INSERT INTO games (player1, player2) VALUES (1, 2);

Let's assume the first game was played quickly, with a move every minute.

But then they chilled :-) and played 2 slow games, with a move every 10 minutes:

INSERT INTO moves (uid, gid, played) VALUES
(1, 1, now() + interval '1 min'),
(2, 1, now() + interval '2 min'),
(1, 1, now() + interval '3 min'),
(2, 1, now() + interval '4 min'),
(1, 1, now() + interval '5 min'),
(2, 1, now() + interval '6 min'),

(1, 2, now() + interval '10 min'),
(2, 2, now() + interval '20 min'),
(1, 2, now() + interval '30 min'),
(2, 2, now() + interval '40 min'),
(1, 2, now() + interval '50 min'),
(2, 2, now() + interval '60 min'),

(1, 3, now() + interval '110 min'),
(2, 3, now() + interval '120 min'),
(1, 3, now() + interval '130 min'),
(2, 3, now() + interval '140 min'),
(1, 3, now() + interval '150 min'),
(2, 3, now() + interval '160 min');

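On a web page with game statistics, I would like to display the average time between moves for each player.

So I suppose I have to use PostgreSQL's LAG window function.

Since several games can be played at the same time, I am trying to PARTITION BY gid (i.e. by the "game id").

Unfortunately, my SQL query fails with the syntax error "window function calls cannot be nested":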

SELECT AVG(played - LAG(played) OVER (PARTITION BY gid order by played))
OVER (PARTITION BY gid order by played)
FROM moves
-- trying to calculate average thinking time for player Alice
WHERE uid = 1;

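UPDATE:

Since the number of games in my database is large and grows every day, I have tried (here is the new SQL Fiddle) adding a condition to the inner SELECT query: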

SELECT AVG(played - prev_played)
FROM (SELECT m.*,
      LAG(m.played) OVER (PARTITION BY m.gid ORDER BY played) AS prev_played
      FROM moves m
      JOIN games g ON (m.uid in (g.player1, g.player2))
      WHERE m.played > now() - interval '1 month'
     ) m
WHERE uid = 1;

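But for some reason this changes the returned value drastically, to 1 min 45 s.

I wonder why the inner SELECT query suddenly returns many more rows. Is some condition missing in my JOIN?

UPDATE 2:

Oh, OK, I see why the average decreases: the JOIN produces multiple rows with the same timestamp (i.e. played - prev_played = 0). But how do I fix the JOIN?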
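
One way to confirm the duplication is to count how many joined rows each move produces. The query below is only a diagnostic sketch against the sample data above; the column alias copies is made up for illustration:

```sql
-- Diagnostic sketch: how many rows does each move turn into
-- when the JOIN only checks the player ids?
SELECT m.mid, count(*) AS copies   -- "copies" is an illustrative alias
FROM moves m
JOIN games g ON (m.uid IN (g.player1, g.player2))
GROUP BY m.mid
ORDER BY m.mid;
```

With the three sample games, all between players 1 and 2, every move matches every game, so each mid should come back with copies = 3. The extra copies share a timestamp, which is where the zero-length differences come from.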

UPDATE 3:

Never mind, I was missing the m.gid = g.gid AND condition in my SQL JOIN. Now it works:

SELECT AVG(played - prev_played)
FROM (SELECT m.*,
      LAG(m.played) OVER (PARTITION BY m.gid ORDER BY played) AS prev_played
      FROM moves m
      JOIN games g ON (m.gid = g.gid AND m.uid in (g.player1, g.player2))
      WHERE m.played > now() - interval '1 month'
     ) m
WHERE uid = 1;

2 Answers

You need a subquery to nest window functions. I think this does what you want:

select avg(played - prev_played)
from (select m.*,
             lag(m.played) over (partition by gid order by played) as prev_played
      from moves m
     ) m
where uid = 1;

Note: the where needs to go in the outer query, so that it does not affect the lag().
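
To illustrate why the placement matters, here is a comparison sketch (not a recommended query) with the filter moved into the subquery. In that form lag() only sees Alice's own rows, so it measures the gap between her consecutive moves instead of the time since her opponent's move:

```sql
-- Comparison only: filtering BEFORE lag() changes what "previous move" means.
SELECT avg(played - prev_played)
FROM (SELECT m.*,
             lag(m.played) OVER (PARTITION BY gid ORDER BY played) AS prev_played
      FROM moves m
      WHERE uid = 1   -- applied before the window function runs
     ) m;
```

On the sample data from the question this variant should return 14 minutes, whereas the query with the outer where returns 7 minutes.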

answered 2018-04-23T14:55:29.347

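Probably @gordon's answer is good enough. But it is not the result you asked for in your comment. It only works because the data has the same number of rows for each game, so the average of the per-game averages equals the overall average. If you want the average over games, you need an additional level of aggregation.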

WITH cte AS (
    SELECT gid, AVG(played - prev_played) AS play_avg
    FROM (SELECT m.*,
                 LAG(m.played) OVER (PARTITION BY gid ORDER BY played) AS prev_played
          FROM moves m
         ) m
    WHERE uid = 1
    GROUP BY gid
)
SELECT AVG(play_avg)
FROM cte;
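
Since the question asks for the figure for each player, the same two-level idea can be sketched for all players at once by grouping on uid instead of filtering on it. The CTE names diffs and per_game and the aliases think_time, game_avg and avg_of_game_averages are invented for this sketch:

```sql
-- Sketch: average of per-game averages, one row per player.
WITH diffs AS (
    -- time since the previous move in the same game, whoever made it
    SELECT uid, gid,
           played - lag(played) OVER (PARTITION BY gid ORDER BY played) AS think_time
    FROM moves
),
per_game AS (
    -- avg() skips the NULL produced for the first move of each game
    SELECT uid, gid, avg(think_time) AS game_avg
    FROM diffs
    GROUP BY uid, gid
)
SELECT uid, avg(game_avg) AS avg_of_game_averages
FROM per_game
GROUP BY uid
ORDER BY uid;
```

The window function still runs over every move in a game before any per-player grouping happens, so lag() keeps its meaning.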
answered 2018-04-23T15:14:29.920