I need to implement a time-travel view for prices in MySQL. The base price table is this:

CREATE TABLE product_price (
  product_id INT(11) NOT NULL,
  date_valid TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
  price DECIMAL(15,4) NOT NULL,
  PRIMARY KEY(product_id,date_valid)
);

The idea is that, as time passes, I select the currently valid price from the prices I have entered in advance. I need to create a view that returns, for each product_id, the latest price that is already valid. After a while I found a SELECT that does what I need:

SELECT * FROM  (
  SELECT product_id,price FROM product_price
    WHERE date_valid <= CURRENT_TIMESTAMP
    ORDER BY date_valid DESC
) xx GROUP BY product_id;

In order to create the needed view I understood I cannot use the subselect and need to create one or more intermediate views. Like this:

CREATE VIEW v_product_price_time AS
  SELECT product_id,price FROM product_price
    WHERE date_valid <= CURRENT_TIMESTAMP
    ORDER BY date_valid DESC
;


CREATE VIEW v_product_price AS
  SELECT * FROM v_product_price_time GROUP BY product_id;

What I then get is not the same as the result of the original query I wrote. For example, I populate the table with just two rows:

INSERT INTO product_price (product_id,date_valid,price ) VALUES ( 1,'2013-01-01',41.40 );
INSERT INTO product_price (product_id,date_valid,price ) VALUES ( 1,'2013-01-03',42.0 );

The raw query returns the right data, (1,42.0), but querying the view doesn't. I always get (1,41.40).

Surely I am missing something, as I don't know MySQL very well. I have already done similar things with another open-source RDBMS, but now I need to cope with MySQL v5.5 and have no way to change it. The documentation and a few searches in the developer forums didn't lead me to a solution. Any idea on how to solve this? TIA.

2 Answers

Use this query, whether or not it is inside a view:

SELECT p1.* FROM (
  SELECT * FROM product_price
  WHERE date_valid <= CURRENT_TIMESTAMP
) p1
LEFT JOIN (
  SELECT product_id, date_valid FROM product_price
  WHERE date_valid <= CURRENT_TIMESTAMP
) p2
ON p1.product_id = p2.product_id AND p1.date_valid < p2.date_valid
WHERE p2.date_valid IS NULL

This query creates 2 derived tables, which is not very efficient, and it is also a bit hard to read. You can try creating another view for that:

CREATE VIEW product_price_past_dates AS (
  SELECT * FROM product_price
  WHERE date_valid <= CURRENT_TIMESTAMP
);

Then rewrite the original query as:

SELECT p1.* FROM product_price_past_dates p1
LEFT JOIN product_price_past_dates p2
ON p1.product_id = p2.product_id AND p1.date_valid < p2.date_valid
WHERE p2.date_valid IS NULL

Then you can create a view on top of the query that uses the previous view:

CREATE VIEW v_product_price_time AS (
  SELECT p1.* FROM product_price_past_dates p1
  LEFT JOIN product_price_past_dates p2
  ON p1.product_id = p2.product_id AND p1.date_valid < p2.date_valid
  WHERE p2.date_valid IS NULL
);

And end up with the simplest possible query:

SELECT * FROM v_product_price_time;
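
As a quick check, here is a minimal sketch that runs the two sample rows from the question through the view. It assumes the product_price table and both views above already exist, and that both sample dates are in the past when you run it:

```sql
-- Sample rows from the question.
INSERT INTO product_price (product_id, date_valid, price) VALUES (1, '2013-01-01', 41.40);
INSERT INTO product_price (product_id, date_valid, price) VALUES (1, '2013-01-03', 42.0);

-- The 2013-01-01 row finds a later row for product 1 in the self-join,
-- so its p2.date_valid is not NULL and it is filtered out.
-- The 2013-01-03 row has no later match (p2.date_valid IS NULL) and is kept.
SELECT product_id, price FROM v_product_price_time;
```

With only these two rows in the table, this should return the single row for price 42.0000, which is what the question expects.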

Fiddle here.

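Why GROUP BY does not work: the error basically lies in an improper use of the GROUP BY clause. A rule of thumb (although not 100% accurate) is to always select the same fields that appear in the GROUP BY. Otherwise, MySQL is free to pick any value for the fields that are in the SELECT but not in the GROUP BY.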

For more information, you should check the MySQL documentation. I think it is pretty clear.
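
If you want MySQL to refuse this kind of query instead of silently picking a value, the ONLY_FULL_GROUP_BY SQL mode can help. The following is a minimal sketch, assuming the product_price table from the question exists. Note that setting sql_mode this way replaces any other modes for the current session:

```sql
-- Enable strict GROUP BY checking for this session only.
SET SESSION sql_mode = 'ONLY_FULL_GROUP_BY';

-- Rejected with an error: price is neither aggregated nor listed in GROUP BY,
-- so there is no single well-defined price per product_id.
SELECT product_id, price FROM product_price GROUP BY product_id;

-- Accepted: every selected column is either grouped or aggregated.
SELECT product_id, MAX(date_valid) AS most_recent_date
FROM product_price
GROUP BY product_id;
```

This does not fix the query by itself, but it turns an unpredictable result into a visible error.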

A very detailed explanation:

SELECT * FROM  (
  SELECT product_id,price FROM product_price
    WHERE date_valid <= CURRENT_TIMESTAMP
    ORDER BY date_valid DESC
) xx GROUP BY product_id;

0 syntactical errors
1 semantical error
1 warning
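  1. Semantic error: selecting product_id and price while grouping only by product_id returns an unpredictable price for each product_id. You don't want unpredictable values in your result set; otherwise, you would not be selecting them. So you are putting 100% trust in a value you cannot predict. That is definitely an error.
  2. Warning: you are ordering a result set and then throwing that order away by wrapping it in a GROUP BY. It makes no sense to order something and then generate a different order for it. It only hurts performance.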

The previous query should be fixed using one of the 2 basic greatest-n-per-group solutions. I provided the shortest one, the left join one.

CREATE VIEW v_product_price_time AS
  SELECT product_id,price FROM product_price
    WHERE date_valid <= CURRENT_TIMESTAMP
    ORDER BY date_valid DESC
;

0 syntactical errors
0 semantical errors
0 warnings

A perfectly valid query. No comments at all.

CREATE VIEW v_product_price AS
  SELECT * FROM v_product_price_time GROUP BY product_id;

0 syntactical errors
1 semantical error
0 warnings
  1. Semantic error: again, selecting product_id and price while grouping only by product_id. Same as above, this leads to unpredictable results.

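So you are basically comparing 2 unpredictable results and expecting them to be the same. The funny thing is that comparing 2 unpredictable results is more likely to expose the error than relying on just 1 unpredictable result. So consider yourself lucky to have increased your chances of finding this bug in your code.

The person who asked the question "How to update the value of one field to the most common value of another field?" is in for an interesting surprise when he finds that his query does not work as he expected. Besides, 3 of the 4 answers there do not group correctly and return unpredictable results. Not to mention Bohemian's comment, in which he states that his code "will always work". So congratulations, Enzo, you just divided by 0 :)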

Hope this helps.

answered 2013-09-05T05:58:01.070

This is how I would do it:

SELECT *
FROM product_price pp
INNER JOIN (
  SELECT product_id,MAX(date_valid) most_recent_date
  FROM product_price pp
  WHERE date_valid <= CURRENT_TIMESTAMP
  GROUP BY product_id
) aux ON aux.product_id = pp.product_id
AND aux.most_recent_date = pp.date_valid

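The subquery selects, for each product_id, the most recent valid date along with the product_id. This gives you one row per product_id, whose date is valid, that is, equal to or earlier than the current timestamp. Now that you have product_id and date_valid (your PK), you can safely join these rows with the product_price table to get the rest of the data. This "trick" does not always work as a way of picking one row per group in a subquery, but since your requirement is to get the most recent date for each product, you can take advantage of the MAX function.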

If needed, you can put the subquery in a view:

CREATE VIEW most_recent_product AS (
  SELECT product_id,MAX(date_valid) most_recent_date
  FROM product_price pp
  WHERE date_valid <= CURRENT_TIMESTAMP
  GROUP BY product_id
);

Then:

SELECT *
FROM product_price pp
INNER JOIN most_recent_product aux ON aux.product_id = pp.product_id
AND aux.most_recent_date = pp.date_valid
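
Since the question asks for a view, the join itself can also be wrapped in one. This is only a sketch: the view name v_product_price_current is hypothetical, and it assumes the most_recent_product view above has been created. The columns are listed explicitly because SELECT * over this join would produce two product_id columns, and a view cannot have duplicate column names:

```sql
-- Hypothetical view name; builds on the most_recent_product view defined above.
CREATE VIEW v_product_price_current AS
  SELECT pp.product_id, pp.price
  FROM product_price pp
  INNER JOIN most_recent_product aux
    ON aux.product_id = pp.product_id
   AND aux.most_recent_date = pp.date_valid;

-- One row per product_id, carrying the latest price that is already valid.
SELECT * FROM v_product_price_current;
```

Neither view uses a subquery in its FROM clause, which is the restriction the question ran into on MySQL 5.5.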

Here is the fiddle.

answered 2013-09-07T16:42:05.820