
I have the following query.

SELECT
    customer_id,
    NTILE(5) OVER (ORDER BY MAX(oms_order_date)) AS r_score
FROM 
    mdwh.us_raw.l_dmw_order_report
WHERE 
    quantity_ordered > 0
    AND customer_id IS NOT NULL
    AND customer_id != ('')
    AND UPPER(line_status) NOT IN ('','RETURN', 'CANCELLED')
    AND UPPER(item_description_1) NOT IN ('','FREIGHT', 'RETURN LABEL FEE', 'VISIBLE STITCH')
    AND (quantity_ordered * unit_price_amount) > 0
    AND extended_amount < 1000 --NO BULK ORDERS
    AND oms_order_date BETWEEN '2020-01-01' AND '2020-01-01'
    AND SUBSTRING(upc,1,6) IN (SELECT item_code FROM item_master_zs WHERE new_division BETWEEN '11' AND '39')
GROUP BY
    customer_id
ORDER BY
    customer_id

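All I'm doing here is getting unique customer IDs under certain conditions, then splitting their most recent purchase dates into quintiles and returning the score in the second column. But every time I run the query, the r_score values keep changing. What am I doing wrong? Here is a snippet of the table (again, the r_score values keep changing):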

[image: snippet of the query results]


1 Answer


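The issue is that ntile() makes sure the groups are exactly the same size, which it does by putting rows with the same value into different groups. Which of the tied rows lands in which group is not determined by the order by, so it can differ from one run to the next.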
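
To make the effect concrete, here is a minimal Python sketch that imitates how NTILE(5) hands out buckets over rows that are already sorted. The `ntile` helper and the five sample customers are hypothetical, not taken from the original table; the only assumption is that all five share the same latest order date, so any row order is a valid result of the sort.

```python
def ntile(sorted_rows, buckets):
    """Assign 1-based NTILE bucket numbers to rows that are already sorted.

    Bucket sizes differ by at most one, and the larger buckets come first.
    """
    n = len(sorted_rows)
    base, extra = divmod(n, buckets)
    result = {}
    i = 0
    for b in range(1, buckets + 1):
        size = base + (1 if b <= extra else 0)
        for customer_id, _ in sorted_rows[i:i + size]:
            result[customer_id] = b
        i += size
    return result


# Hypothetical data: (customer_id, latest order date). Every date ties.
rows = [
    ("A", "2020-01-01"), ("B", "2020-01-01"), ("C", "2020-01-01"),
    ("D", "2020-01-01"), ("E", "2020-01-01"),
]

# Sorting by date alone leaves tied rows in whatever order they arrived,
# so these two "runs" are both legitimate orderings of the same data.
run_1 = ntile(sorted(rows, key=lambda r: r[1]), 5)
run_2 = ntile(sorted(reversed(rows), key=lambda r: r[1]), 5)

print(run_1)
print(run_2)
```

Customer A ends up in bucket 1 in one run and bucket 5 in the other, even though the data never changed. A database is free to return tied rows in either order, which is consistent with the changing r_score values described in the question.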

For this reason, I usually do the calculation manually using rank():

ceil(rank() over (order by max(oms_order_date)) * 5.0 /
     count(*) over ()
    ) as r_score
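
The same arithmetic can be traced by hand. The following Python sketch mirrors the expression above on made-up data; `rank_scores` and the sample dates are hypothetical, and RANK() is modelled as one plus the number of rows with a strictly smaller value.

```python
import math


def rank_scores(values, buckets=5):
    """Return ceil(rank * buckets / count) per key, with rank as in SQL RANK()."""
    n = len(values)
    ordered = sorted(values.values())
    # RANK(): position of the first row holding each value (1-based).
    first_pos = {}
    for pos, v in enumerate(ordered, start=1):
        first_pos.setdefault(v, pos)
    return {k: math.ceil(first_pos[v] * buckets / n) for k, v in values.items()}


# Hypothetical data: customer_id -> latest order date, with two pairs of ties.
latest = {
    "A": "2020-01-02", "B": "2020-01-02", "C": "2020-01-05",
    "D": "2020-01-07", "E": "2020-01-07",
}
scores = rank_scores(latest)
print(scores)
```

Tied customers always receive the same score, so the result no longer depends on row order. The trade-off is that the groups are no longer equal in size, and because rank() gives tied rows the lowest position among them, a tie at the top (D and E here) can leave the highest score unused.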

If you used row_number() instead, you would get the equivalent of ntile().

If you want to use ntile(), you can add an additional order by key so that the sort key is unique.
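
As a rough illustration of the tie-breaker idea, the Python sketch below sorts on the pair (date, customer_id), which corresponds to something like `ORDER BY MAX(oms_order_date), customer_id` inside the window. The `ntile_by` helper and the sample rows are hypothetical; using customer_id as the second key is an assumption that relies on it being unique per output row, which the GROUP BY in the question suggests.

```python
def ntile_by(rows, key, buckets=5):
    """Sort rows by key, then assign 1-based NTILE buckets (larger buckets first)."""
    ordered = sorted(rows, key=key)
    base, extra = divmod(len(ordered), buckets)
    out, i = {}, 0
    for b in range(1, buckets + 1):
        size = base + (1 if b <= extra else 0)
        for customer_id, _ in ordered[i:i + size]:
            out[customer_id] = b
        i += size
    return out


# Hypothetical data arriving in an arbitrary order; every date ties.
rows = [
    ("C", "2020-01-01"), ("A", "2020-01-01"), ("E", "2020-01-01"),
    ("B", "2020-01-01"), ("D", "2020-01-01"),
]


def unique_key(row):
    # Date first, then customer_id as the tie-breaker, so no two rows compare equal.
    return (row[1], row[0])


run_1 = ntile_by(rows, unique_key)
run_2 = ntile_by(list(reversed(rows)), unique_key)
print(run_1 == run_2)
```

With a unique sort key the buckets come out the same regardless of the input order. Tied dates are still split across groups, but the split is now repeatable.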

====================

Edit, Feb 17, 2020, 5:18 PM

Here is the new code I'm using:

SELECT
    customer_id,
    CEIL(RANK() OVER (ORDER BY MAX(oms_order_date)) * 5 / COUNT(*) OVER ()) AS r_score,
    CEIL(RANK() OVER (ORDER BY COUNT(client_web_order_number)) * 5 / COUNT(*) OVER ()) AS f_score,
    CEIL(RANK() OVER (ORDER BY AVG(extended_amount)) * 5 / COUNT(*) OVER ()) AS m_score,
    (r_score || f_score || m_score) AS rfm_score
FROM 
    mdwh.us_raw.l_dmw_order_report t1
WHERE 
    quantity_ordered > 0
    AND customer_id IS NOT NULL
    AND customer_id != ('')
    AND oms_order_date IS NOT NULL
    AND UPPER(line_status) NOT IN ('','RETURN', 'CANCELLED')
    AND UPPER(item_description_1) NOT IN ('','FREIGHT', 'RETURN LABEL FEE', 'VISIBLE STITCH')
    AND (quantity_ordered * unit_price_amount) > 0
    AND extended_amount < 1000 --NO BULK ORDERS
    AND oms_order_date BETWEEN '2020-01-01' AND '2020-01-10'
    AND SUBSTRING(upc,1,6) IN (SELECT item_code FROM item_master_zs WHERE new_division BETWEEN '11' AND '39')
GROUP BY
    customer_id
ORDER BY
    customer_id

The problem now is that I get some rows with a blank r_score, and the maximum value is 4 instead of 5.

Answered 2020-02-17T19:56:05.197