
This is an odd question, and I don't know whether it is even possible.

Suppose I have the following table:

person | product  | trans  | purchase_date
-------+----------+--------+---------------
jim    | square   | aaaa   | 2013-03-04 00:01:00
sarah  | circle   | aaab   | 2013-03-04 00:02:00
john   | square   | aac1   | 2013-03-04 00:03:00
john   | circle   | aac2   | 2013-03-04 00:03:10
jim    | triangle | aad1   | 2013-03-04 00:04:00
jim    | square   | abcd   | 2013-03-04 00:05:00
sarah  | square   | efgh   | 2013-03-04 00:07:00
jim    | circle   | ijkl   | 2013-03-04 00:22:00
sarah  | circle   | mnop   | 2013-03-04 00:24:00
sarah  | square   | qrst   | 2013-03-04 00:26:00
sarah  | circle   | uvwx   | 2013-03-04 00:44:00
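
To reproduce the table above, a setup along these lines should work. This is only a sketch: the table name `tbl` (borrowed from the first answer below) and the column types are assumptions, since the actual schema is not shown. Single-row inserts are used because multi-row VALUES lists are not available in 8.1.

```sql
-- Assumed schema: table name and column types are guesses, not the real DDL.
CREATE TABLE tbl (
   person        text
 , product       text
 , trans         text
 , purchase_date timestamp
);

INSERT INTO tbl VALUES ('jim',   'square',   'aaaa', '2013-03-04 00:01:00');
INSERT INTO tbl VALUES ('sarah', 'circle',   'aaab', '2013-03-04 00:02:00');
INSERT INTO tbl VALUES ('john',  'square',   'aac1', '2013-03-04 00:03:00');
INSERT INTO tbl VALUES ('john',  'circle',   'aac2', '2013-03-04 00:03:10');
INSERT INTO tbl VALUES ('jim',   'triangle', 'aad1', '2013-03-04 00:04:00');
INSERT INTO tbl VALUES ('jim',   'square',   'abcd', '2013-03-04 00:05:00');
INSERT INTO tbl VALUES ('sarah', 'square',   'efgh', '2013-03-04 00:07:00');
INSERT INTO tbl VALUES ('jim',   'circle',   'ijkl', '2013-03-04 00:22:00');
INSERT INTO tbl VALUES ('sarah', 'circle',   'mnop', '2013-03-04 00:24:00');
INSERT INTO tbl VALUES ('sarah', 'square',   'qrst', '2013-03-04 00:26:00');
INSERT INTO tbl VALUES ('sarah', 'circle',   'uvwx', '2013-03-04 00:44:00');
```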

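I need to know when the gap between any person's purchase of a square and a circle (or a circle and a square) exceeds 10 minutes. Ideally, I would also like to know the difference itself, but that is not required.

So this is what I need: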

person | product  | trans  | purchase_date
-------+----------+--------+---------------
jim    | square   | abcd   | 2013-03-04 00:05:00
jim    | circle   | ijkl   | 2013-03-04 00:22:00
sarah  | square   | efgh   | 2013-03-04 00:07:00
sarah  | circle   | mnop   | 2013-03-04 00:24:00
sarah  | square   | qrst   | 2013-03-04 00:26:00
sarah  | circle   | uvwx   | 2013-03-04 00:44:00

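This will run daily, so I will add a "where" clause to make sure the query doesn't get out of hand. Also, I know that multiple transactions can show up (for example, 20 minutes before a circle is bought, then 20 minutes until a square, then 20 minutes until another circle, which means there are 2 instances where the time difference exceeds 10 minutes).

Any suggestions? I am on postgres 8.1.23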

3 Answers

1

Modern day solution

With modern day Postgres (8.4 or later) you can use the window function row_number() to get a continuous numbering per group. Then you can left join to the previous and next row and see if either of them matches the criteria. Voilà.

WITH x AS (
   SELECT *
         ,row_number() OVER (PARTITION BY person ORDER BY purchase_date) AS rn
   FROM   tbl
   WHERE  product IN ('circle', 'square')
   )
SELECT x.person, x.product, x.trans, x.purchase_date
FROM   x
LEFT   JOIN x y ON y.person = x.person AND y.rn = x.rn + 1
LEFT   JOIN x z ON z.person = x.person AND z.rn = x.rn - 1
WHERE (y.product <> x.product
       AND y.purchase_date > x.purchase_date + interval '10 min')
   OR (z.product <> x.product
       AND z.purchase_date < x.purchase_date - interval '10 min')
ORDER  BY x.person, x.purchase_date;

SQLfiddle.
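
Since the question also asks for the difference itself, here is a variant sketch that uses the window functions lag() and lead() (also available from 8.4) instead of the self-joins. It is an alternative to the query above, not a tested replacement; the column aliases (prev_product, next_product, gap_prev, gap_next) are made up for illustration.

```sql
SELECT person, product, trans, purchase_date, gap_prev, gap_next
FROM  (
   SELECT *
        , lag(product)  OVER w AS prev_product   -- product of the previous purchase
        , lead(product) OVER w AS next_product   -- product of the next purchase
        , purchase_date - lag(purchase_date) OVER w  AS gap_prev  -- interval since previous row
        , lead(purchase_date) OVER w - purchase_date AS gap_next  -- interval until next row
   FROM   tbl
   WHERE  product IN ('circle', 'square')
   WINDOW w AS (PARTITION BY person ORDER BY purchase_date)
   ) sub
WHERE (prev_product <> product AND gap_prev > interval '10 min')
   OR (next_product <> product AND gap_next > interval '10 min')
ORDER  BY person, purchase_date;
```

The gap columns are shown for every returned row, including a side where the neighbour has the same product or is closer than 10 minutes, so read them together with the product of the adjacent row.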

Solution for Postgres 8.1

I can't test this on Postgres 8.1, as no surviving instance is available. It is tested and works on v8.4 and should work for you, too: temporary sequences, temporary tables and CREATE TABLE AS were already available.
Temporary sequences and tables are only visible to your own session, so you get continuous numbers even with concurrent queries.

CREATE TEMP SEQUENCE s;

CREATE TEMP TABLE x AS
SELECT *, nextval('s') AS rn  -- get row-numbers from sequence
FROM  (
   SELECT *
   FROM   tbl
   WHERE  product IN ('circle', 'square')
   ORDER  BY person, purchase_date  -- need to order in a subquery first!
   ) a;

Then the same SELECT as above should work:

SELECT x.person, x.product, x.trans, x.purchase_date
FROM   x
LEFT   JOIN x y ON y.person = x.person AND y.rn = x.rn + 1
LEFT   JOIN x z ON z.person = x.person AND z.rn = x.rn - 1
WHERE (y.product <> x.product
       AND y.purchase_date > x.purchase_date + interval '10 min')
   OR (z.product <> x.product
       AND z.purchase_date < x.purchase_date - interval '10 min')
ORDER  BY x.person, x.purchase_date;
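
If the difference itself is wanted, the same join can return it by subtracting the neighbouring timestamps. The following is an untested sketch along those lines: the aliases gap_from_prev and gap_to_next are illustrative, and the DROP statements at the end are an assumed cleanup step for a daily run within one session.

```sql
SELECT x.person, x.product, x.trans, x.purchase_date
     , x.purchase_date - z.purchase_date AS gap_from_prev  -- interval since previous row (NULL if none)
     , y.purchase_date - x.purchase_date AS gap_to_next    -- interval until next row (NULL if none)
FROM   x
LEFT   JOIN x y ON y.person = x.person AND y.rn = x.rn + 1
LEFT   JOIN x z ON z.person = x.person AND z.rn = x.rn - 1
WHERE (y.product <> x.product
       AND y.purchase_date > x.purchase_date + interval '10 min')
   OR (z.product <> x.product
       AND z.purchase_date < x.purchase_date - interval '10 min')
ORDER  BY x.person, x.purchase_date;

-- Clean up so the next run in the same session can recreate both objects.
DROP TABLE x;
DROP SEQUENCE s;
```
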
answered 2013-03-05T04:56:20.850
0

You could try joining the table to itself with an "ON" clause like this:

SELECT a.Person,
       -- minutes between the two purchases, converted to hours
       CAST((DATEDIFF(mi, a.purchaseDate, b.purchaseDate)/60.0) AS Decimal) AS TimeDiff,
       a.Product, b.Product
FROM <TABLE> a
JOIN <TABLE> b
ON a.Person = b.Person AND b.purchaseDate > a.purchaseDate
WHERE
(a.Product = 'circle' AND b.Product = 'square')
OR
(a.Product = 'square' AND b.Product = 'circle')

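By joining the table to itself, you get rows that pair up two purchases made by the same person. Restricting the join to "b.purchaseDate > a.purchaseDate" prevents a row from matching itself. After that, you simply check that the two purchases are for different products.

The time difference is the last tricky part. What I included above is based on an answer I found here. It looks like it should work, and there are a few variations you can use if this output doesn't work for you.

You will need to add a clause to the WHERE statement that uses the same DATEDIFF function to test for a gap of more than 10 minutes, but that shouldn't pose much of a challenge.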
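
PostgreSQL has no built-in DATEDIFF, so on Postgres the same idea can be written by subtracting the timestamps and comparing the result to an interval. A rough sketch, assuming the table is named tbl and uses the column names from the question (the aliases first_product, second_product and time_diff are illustrative):

```sql
SELECT a.person
     , a.product AS first_product
     , b.product AS second_product
     , b.purchase_date - a.purchase_date AS time_diff   -- interval between the two purchases
FROM   tbl a
JOIN   tbl b ON a.person = b.person
            AND b.purchase_date > a.purchase_date       -- pair each purchase with later ones only
WHERE ((a.product = 'circle' AND b.product = 'square')
    OR (a.product = 'square' AND b.product = 'circle'))
AND    b.purchase_date - a.purchase_date > interval '10 minutes';
```

Like the query above, this pairs every earlier purchase with every later one of the other product, not just adjacent purchases.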

Please note that this won't return exactly what you have in your question: it will include a row for Jim's first transaction as well as one for Jim's 2nd square purchase. Both will match to the same circle, and you will get both times (ijkl-abcd AND ijkl-aaaa). Thanks to xQbert's comment for pointing this out.

answered 2013-03-04T18:46:11.207
0

--Assumes

  1. You want to know differences in minutes for purchases on the same day. If dates don't matter, eliminate the WHERE clause.
  2. You only want to consider a circle or square purchased after the purchase_date, not one purchased before it.

SELECT A.person, A.product, A.Trans, A.Purchase_date, B.Purchase_date,
-- total minutes between the two purchases
EXTRACT(EPOCH FROM (B.purchase_date - A.purchase_date)) / 60 AS minuteDifference
FROM yourTable A
LEFT JOIN yourTable B 
  on A.person = B.Person 
    and ((A.product = 'square' and b.product = 'circle') 
      OR (A.Product = 'circle' and b.product = 'square'))
    and A.purchase_date <= B.Purchase_date
WHERE (A.purchase_Date::date = B.purchase_date::date OR B.purchase_date is null)

NULL values in B.purchase_date tell you when there is no circle/square or square/circle combo.

answered 2013-03-04T19:02:00.087