
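I have a retail database, and I want to pull my transaction sale amounts (SALE_AMT) along with product-level detail.

Unfortunately, I've realized that my SALE_AMT column holds the total sale amount for the whole transaction, not an itemized amount per product. As a result, SALE_AMT is repeated once for every row that shares a transaction ID, that is, once per product purchased in that transaction.

I think the best way to fix this and get an accurate transaction SALE_AMT is to take an average: SALE_AMT divided by the number of times that particular transaction ID occurs. However, I'm having trouble figuring out how to do that. My SQL is below: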

WITH PRODUCT_CTE AS (
  ...
)

SELECT
A.TRANS_UNIQUE_KEY, //Unique Transaction Identifier//
...
A.SALE_AMT, // This is the field for SALE_AMT showing the entire transaction SALE AMT and duplicated for each product
...
D.PRDCT_DESC,
...
FROM "tablename" A

LEFT OUTER JOIN "tablename" G
ON A.DT_SKEY = G.DT_SKEY

LEFT OUTER JOIN "tablename" H
ON A.TM_SKEY = H.TM_SKEY

LEFT OUTER JOIN "tablename" C
ON A.STORE_KEY = C.STORE_KEY

INNER JOIN "tablename" E
ON A.TRANS_TYP_KEY = E.TRANS_TYP_KEY

LEFT OUTER JOIN "tablename" F
ON A.CUST_KEY = F.CUST_KEY

LEFT OUTER JOIN PRODUCT_CTE D
ON A.TRANS_UNIQUE_KEY = D.TRANS_UNIQUE_KEY

WHERE YEAR(G.FISCAL_DT)>= YEAR(CURRENT_DATE())-1
ORDER BY G.FISCAL_DT DESC

[Image: example of the transaction sale amount duplicated for each product]


1 Answer


So you can get an estimated/average/fake per-item price by taking the average:

WITH data_table AS (
    SELECT * FROM VALUES
        (1,'2020-01-27',517.66,'Rocking Chair'),
        (1,'2020-01-27',517.66,'Plush Animal'),
        (1,'2020-01-27',517.66,'Rug'),
        (1,'2020-01-27',517.66,'Couch'),
        (1,'2020-01-27',517.66,'Bar Stool'),
        (2,'2020-01-28',59.09,'Painting'),
        (2,'2020-01-28',59.09,'Rug')
        v(trans_unique_key, fiscal_dt, sale_amt, prdct_desc)
)
SELECT a.*
    ,sum(fake_item_cost_2dp)over(partition by trans_unique_key) AS sum_of_parts_not_eqaul_the_whole
FROM (
    SELECT *
        ,SALE_AMT/count(*)over(partition by trans_unique_key) as FAKE_ITEM_COST
        ,round(FAKE_ITEM_COST,2) AS fake_item_cost_2dp
    FROM data_table
) AS a
ORDER BY 2,1;

This gives:

TRANS_UNIQUE_KEY  FISCAL_DT   SALE_AMT  PRDCT_DESC     FAKE_ITEM_COST  FAKE_ITEM_COST_2DP  SUM_OF_PARTS_NOT_EQAUL_THE_WHOLE
1                 2020-01-27  517.66    Rocking Chair  103.53200000    103.53              517.65
1                 2020-01-27  517.66    Plush Animal   103.53200000    103.53              517.65
1                 2020-01-27  517.66    Rug            103.53200000    103.53              517.65
1                 2020-01-27  517.66    Couch          103.53200000    103.53              517.65
1                 2020-01-27  517.66    Bar Stool      103.53200000    103.53              517.65
2                 2020-01-28  59.09     Painting       29.54500000     29.55               59.10
2                 2020-01-28  59.09     Rug            29.54500000     29.55               59.10

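Rounding and then summing, or averaging and then summing, is generally considered bad practice, and floating-point numbers can make it worse. In the output above the rounded parts no longer add up to the whole: 517.65 instead of 517.66, and 59.10 instead of 59.09.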
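If the rounded per-item values do have to add back up to the transaction total, one common workaround is to push the rounding residual onto a single row per transaction. The following is only a sketch under assumptions: the names shares, share_2dp, rn and adjusted_share are made up for illustration, and picking the first product alphabetically to absorb the residual is an arbitrary choice.

```sql
WITH data_table AS (
    SELECT * FROM VALUES
        (1,'2020-01-27',517.66,'Rocking Chair'),
        (1,'2020-01-27',517.66,'Plush Animal'),
        (1,'2020-01-27',517.66,'Rug'),
        (1,'2020-01-27',517.66,'Couch'),
        (1,'2020-01-27',517.66,'Bar Stool'),
        (2,'2020-01-28',59.09,'Painting'),
        (2,'2020-01-28',59.09,'Rug')
        v(trans_unique_key, fiscal_dt, sale_amt, prdct_desc)
), shares AS (
    SELECT *
        -- equal split of the transaction total, rounded to 2 decimal places
        ,ROUND(sale_amt / COUNT(*) OVER (PARTITION BY trans_unique_key), 2) AS share_2dp
        -- rn = 1 marks the row that will absorb the rounding residual (assumed rule)
        ,ROW_NUMBER() OVER (PARTITION BY trans_unique_key ORDER BY prdct_desc) AS rn
    FROM data_table
)
SELECT trans_unique_key, fiscal_dt, sale_amt, prdct_desc
    -- residual = transaction total minus the sum of the rounded shares
    ,share_2dp + IFF(rn = 1,
                     sale_amt - SUM(share_2dp) OVER (PARTITION BY trans_unique_key),
                     0) AS adjusted_share
FROM shares
ORDER BY 2, 1, rn;
```

By construction, adjusted_share summed per transaction equals sale_amt, at the cost of one row per transaction carrying a slightly different fake price.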

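But one of my main points is that a fake per-item price doesn't mean much unless you are going to aggregate back up to the transaction level, at which point something like ANY_VALUE makes more sense: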

WITH data_table AS (
    SELECT * FROM VALUES
        (1,'2020-01-27',517.66,'Rocking Chair'),
        (1,'2020-01-27',517.66,'Plush Animal'),
        (1,'2020-01-27',517.66,'Rug'),
        (1,'2020-01-27',517.66,'Couch'),
        (1,'2020-01-27',517.66,'Bar Stool'),
        (2,'2020-01-28',59.09,'Painting'),
        (2,'2020-01-28',59.09,'Rug')
        v(trans_unique_key, fiscal_dt, sale_amt, prdct_desc)
)
SELECT trans_unique_key
    ,fiscal_dt
    ,ANY_VALUE(sale_amt) AS sale_amt
    ,count(*) AS total_items_count
    ,count(distinct prdct_desc) AS distinct_items_count
FROM data_table
GROUP BY 1,2
ORDER BY 2,1;

Which gives:

TRANS_UNIQUE_KEY  FISCAL_DT   SALE_AMT  TOTAL_ITEMS_COUNT  DISTINCT_ITEMS_COUNT
1                 2020-01-27  517.66    5                  5
2                 2020-01-28  59.09     2                  2
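If the product-level rows need to stay in the result, another option is to keep SALE_AMT on only one row per transaction so that a later SUM over the rows is not inflated. This is a rough sketch, not necessarily what the original query needs: the column name sale_amt_once is invented here, and ordering by prdct_desc to pick the carrying row is an assumption.

```sql
WITH data_table AS (
    SELECT * FROM VALUES
        (1,'2020-01-27',517.66,'Rocking Chair'),
        (1,'2020-01-27',517.66,'Plush Animal'),
        (1,'2020-01-27',517.66,'Rug'),
        (1,'2020-01-27',517.66,'Couch'),
        (1,'2020-01-27',517.66,'Bar Stool'),
        (2,'2020-01-28',59.09,'Painting'),
        (2,'2020-01-28',59.09,'Rug')
        v(trans_unique_key, fiscal_dt, sale_amt, prdct_desc)
)
SELECT trans_unique_key, fiscal_dt, prdct_desc
    -- the first row of each transaction carries the full amount, all others carry 0
    ,IFF(ROW_NUMBER() OVER (PARTITION BY trans_unique_key ORDER BY prdct_desc) = 1,
         sale_amt,
         0) AS sale_amt_once
FROM data_table
ORDER BY 2, 1, 3;
```

Summing sale_amt_once then counts each transaction's amount exactly once, while every product row remains available for product-level filtering or counting.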
Answered 2020-01-30T21:30:15.653