So, first, we need to find the top 20% of spenders in 2010:

select top (20)percent o.BillEmail,SUM(o.total) as TotalSpent,
    count(o.OrderID) as TotalOrders
from dbo.tblOrder o with (nolock)
where o.DomainProjectID=13
    and o.BillEmail not like ''
    and o.OrderDate >= '2010-01-01'
    and o.OrderDate < '2011-01-01'
group by o.BillEmail
order by TotalSpent desc

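From this, I need to find the retention rate of those top 20% spenders over the following two years.

That is, of the top 20% in 2010, which ones stayed in the top 20% in 2011 and 2012? Note: I need a count of how many there were in 2010, then in 2011, then in 2012.

I know it would be much easier if I could create another table, or pull from an Excel sheet that lists only the top buyers. However, I do not have write access to our database, so I have to do everything in nested queries, or whatever else you all might suggest. I am still a beginner, so I don't know the best approach.

Thanks!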

5 Answers

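You have an interesting question. Fundamentally, it is about migration between spending quintiles from one year to the next. I would approach it by looking at all the quintiles for all three years and seeing where people move.

The first step is to summarize the data by year and email. The key function is ntile(). To be honest, I often do the calculation myself using row_number() and count(), which is why they are in the CTE (but then not used):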

with YearSummary as (
      select year(OrderDate) as yr, o.BillEmail, SUM(o.total) as TotalSpent,
             count(o.OrderID) as TotalOrders,
             row_number() over (partition by year(OrderDate) order by sum(o.Total) desc) as seqnum,
             count(*) over (partition by year(OrderDate)) as NumInYear,
             ntile(5) over (partition by year(OrderDate) order by sum(o.Total) desc) as Quintile
      from dbo.tblOrder o with (nolock)
      where o.DomainProjectID=13 and o.BillEmail not like ''
      group by o.BillEmail, year(OrderDate)
     )
select q2010, q2011, q2012,
       count(*) as NumEmails,
       min(BillEmail), max(BillEmail)
from (select BillEmail,
             max(case when yr = 2010 then Quintile end) as q2010,
             max(case when yr = 2011 then Quintile end) as q2011,
             max(case when yr = 2012 then Quintile end) as q2012
      from YearSummary
      group by BillEmail
     ) ys
group by q2010, q2011, q2012
order by 1, 2, 3;

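The final step takes the multiple rows for each email and combines them into counts. Note that some emails will have no spending in some years, so their corresponding Quintile will be NULL (so this should produce more like 180 rows - 5*6*6 - rather than 125 rows - 5*5*5; because nothing restricts the result to emails that ordered in 2010, q2010 can be NULL too, which raises the ceiling to 6*6*6 = 216).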

I also include sample emails (min() and max()) in the final result, so you can look at an example from each group.
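The row_number() and count() alternative mentioned above can be sketched as follows. This is only an illustration against the same dbo.tblOrder table: it keeps the rows whose rank falls within the first 20% of each year, and it uses ceiling() on the assumption that you want to mirror how TOP (20) PERCENT rounds a fractional row count up.

```sql
-- Sketch: pick each year's top 20% by hand instead of with ntile(5).
with YearSummary as (
      select year(o.OrderDate) as yr, o.BillEmail, sum(o.Total) as TotalSpent,
             -- rank of this email within its year, highest spender first
             row_number() over (partition by year(o.OrderDate) order by sum(o.Total) desc) as seqnum,
             -- number of distinct emails that spent anything in that year
             count(*) over (partition by year(o.OrderDate)) as NumInYear
      from dbo.tblOrder o
      where o.DomainProjectID = 13 and o.BillEmail not like ''
      group by o.BillEmail, year(o.OrderDate)
     )
select yr, BillEmail, TotalSpent
from YearSummary
where seqnum <= ceiling(0.2 * NumInYear)   -- first 20% of the year, rounded up
order by yr, TotalSpent desc;
```

The cut-off can differ by a row or so from ntile(5), which spreads any remainder across the leading buckets, so pick one definition of "top 20%" and use it for every year.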

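Note: for the retention rate, compute the ratio between the count for (1, 1, 1) (top quintile in all three years) and the total number of emails in the top quintile for 2010.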
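As a rough sketch of that ratio, the per-email quintiles can be aggregated directly instead of reading the counts off the grid. The output column names (Top2010, StillTop2011, StillTop2012, RetentionRate) and the 2010 to 2012 date filter are illustrative choices, not part of the query above.

```sql
-- Sketch: retention of the 2010 top quintile through 2011 and 2012.
with YearSummary as (
      select year(o.OrderDate) as yr, o.BillEmail,
             ntile(5) over (partition by year(o.OrderDate) order by sum(o.Total) desc) as Quintile
      from dbo.tblOrder o
      where o.DomainProjectID = 13 and o.BillEmail not like ''
        and o.OrderDate >= '2010-01-01' and o.OrderDate < '2013-01-01'
      group by o.BillEmail, year(o.OrderDate)
     ),
     PerEmail as (
      -- one row per email; a year with no spending leaves that quintile NULL
      select BillEmail,
             max(case when yr = 2010 then Quintile end) as q2010,
             max(case when yr = 2011 then Quintile end) as q2011,
             max(case when yr = 2012 then Quintile end) as q2012
      from YearSummary
      group by BillEmail
     )
select count(*) as Top2010,
       sum(case when q2011 = 1 then 1 else 0 end) as StillTop2011,
       sum(case when q2011 = 1 and q2012 = 1 then 1 else 0 end) as StillTop2012,
       -- 1.0 forces decimal division; nullif avoids dividing by zero
       1.0 * sum(case when q2011 = 1 and q2012 = 1 then 1 else 0 end)
           / nullif(count(*), 0) as RetentionRate
from PerEmail
where q2010 = 1;
```

Here StillTop2012 counts only emails that were in the top quintile in all three years; drop the q2011 condition if you want everyone from the 2010 group who was back on top in 2012, regardless of 2011.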

answered 2013-08-05T15:32:40.527

Try this:

;WITH top_2010 AS 
(
    select top (20)percent o.BillEmail,SUM(o.total) as TotalSpent,
        count(o.OrderID) as TotalOrders
    from dbo.tblOrder o with (nolock)
    where o.DomainProjectID=13
        and o.BillEmail not like ''
        and o.OrderDate >= '2010-01-01'
        and o.OrderDate < '2011-01-01'
    group by o.BillEmail
    order by TotalSpent desc  -- without ORDER BY, TOP returns an arbitrary 20%
), 
top_2011 AS 
(
    select top (20)percent o.BillEmail,SUM(o.total) as TotalSpent,
        count(o.OrderID) as TotalOrders
    from dbo.tblOrder o with (nolock)
    where o.DomainProjectID=13
        and o.BillEmail not like ''
        and o.OrderDate >= '2011-01-01'
        and o.OrderDate < '2012-01-01'
    group by o.BillEmail
    order by TotalSpent desc  -- without ORDER BY, TOP returns an arbitrary 20%
), 
top_2012 AS 
(
    select top (20)percent o.BillEmail,SUM(o.total) as TotalSpent,
        count(o.OrderID) as TotalOrders
    from dbo.tblOrder o with (nolock)
    where o.DomainProjectID=13
        and o.BillEmail not like ''
        and o.OrderDate >= '2012-01-01'
        and o.OrderDate < '2013-01-01'
    group by o.BillEmail
    order by TotalSpent desc  -- without ORDER BY, TOP returns an arbitrary 20%
)
SELECT top_2010.*, 
    ISNULL(top_2011.TotalSpent, 0) AS [TotalSpent_2011],ISNULL(top_2011.TotalOrders, 0) AS [TotalOrders_2011] ,
    ISNULL(top_2012.TotalSpent, 0) AS [TotalSpent_2012],ISNULL(top_2012.TotalOrders, 0) AS [TotalOrders_2012]
FROM top_2010
LEFT JOIN top_2011 ON top_2010.BillEmail = top_2011.BillEmail 
LEFT JOIN top_2012 ON top_2010.BillEmail = top_2012.BillEmail 
WHERE top_2011.BillEmail IS NOT NULL OR top_2012.BillEmail IS NOT NULL
order by top_2010.TotalSpent desc

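Note that I am using LEFT JOIN, so you can see everyone from the 2010 group who was also in the top 20% in 2011 or 2012.

If you need those who were in the top 20% in both 2011 and 2012, you can change to INNER JOIN.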
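If you need counts rather than a list of emails, one possible sketch is to keep the LEFT JOINs, drop the final WHERE filter, and count the non-NULL matches. The output column names are illustrative, and the CTEs are trimmed to the one column the counts need.

```sql
-- Sketch: how many of the 2010 top 20% were also in the top 20% later.
;with top_2010 as (
    select top (20) percent o.BillEmail
    from dbo.tblOrder o
    where o.DomainProjectID = 13 and o.BillEmail not like ''
      and o.OrderDate >= '2010-01-01' and o.OrderDate < '2011-01-01'
    group by o.BillEmail
    order by sum(o.total) desc
),
top_2011 as (
    select top (20) percent o.BillEmail
    from dbo.tblOrder o
    where o.DomainProjectID = 13 and o.BillEmail not like ''
      and o.OrderDate >= '2011-01-01' and o.OrderDate < '2012-01-01'
    group by o.BillEmail
    order by sum(o.total) desc
),
top_2012 as (
    select top (20) percent o.BillEmail
    from dbo.tblOrder o
    where o.DomainProjectID = 13 and o.BillEmail not like ''
      and o.OrderDate >= '2012-01-01' and o.OrderDate < '2013-01-01'
    group by o.BillEmail
    order by sum(o.total) desc
)
select count(*)                  as Top2010,
       count(top_2011.BillEmail) as AlsoTop2011,   -- count() skips NULLs, i.e. unmatched rows
       count(top_2012.BillEmail) as AlsoTop2012,
       count(case when top_2011.BillEmail is not null
                   and top_2012.BillEmail is not null then 1 end) as TopAllThreeYears
from top_2010
left join top_2011 on top_2011.BillEmail = top_2010.BillEmail
left join top_2012 on top_2012.BillEmail = top_2010.BillEmail;
```

Each CTE returns one row per email, so the joins cannot multiply rows and count(*) stays equal to the size of the 2010 group.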

answered 2013-08-05T15:19:44.507

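You can do this using common table expressions. Create a list of the top 20% for each year, then join the lists to find out which customers were in the top fifth in all three years. Because you only want records that are present in all three years, you should not use a left join.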

WITH Top2010 AS (
select top (20)percent o.BillEmail,SUM(o.total) as TotalSpent,
    count(o.OrderID) as TotalOrders
from dbo.tblOrder o with (nolock)
where o.DomainProjectID=13
    and o.BillEmail not like ''
    and o.OrderDate >= '2010-01-01'
    and o.OrderDate < '2011-01-01'
group by o.BillEmail
order by TotalSpent desc
),
Top2011 AS (
select top (20)percent o.BillEmail,SUM(o.total) as TotalSpent,
    count(o.OrderID) as TotalOrders
from dbo.tblOrder o with (nolock)
where o.DomainProjectID=13
    and o.BillEmail not like ''
    and o.OrderDate >= '2011-01-01'
    and o.OrderDate < '2012-01-01'
group by o.BillEmail
order by TotalSpent desc
),
Top2012 AS (
select top (20)percent o.BillEmail,SUM(o.total) as TotalSpent,
    count(o.OrderID) as TotalOrders
from dbo.tblOrder o with (nolock)
where o.DomainProjectID=13
    and o.BillEmail not like ''
    and o.OrderDate >= '2012-01-01'
    and o.OrderDate < '2013-01-01'
group by o.BillEmail
order by TotalSpent desc
)

SELECT Top2010.BillEmail -- plus whatever other columns you want
FROM Top2010
INNER JOIN Top2011 ON Top2010.BillEmail = Top2011.BillEmail
INNER JOIN Top2012 ON Top2012.BillEmail = Top2011.BillEmail

answered 2013-08-05T15:19:57.573

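Personally, I would use several CTEs, one per year. I would also name things more generically (rather than embedding the year in every name). Once we have our result sets, we can use EXISTS to check who appears in all 3 periods.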

-- Get the 1st Jan in the  current year
DECLARE @current_year date = DateAdd(yy, DateDiff(yy, 0, Current_Timestamp), 0);

; WITH highest_spenders_2_years_ago AS (
  <your_query>
  WHERE  o.orderDate >= DateAdd(yy, -2, @current_year)
  AND    o.orderDate <  DateAdd(yy, -1, @current_year)
)
, highest_spenders_last_year AS (
  <your_query>
  WHERE  o.orderDate >= DateAdd(yy, -1, @current_year)
  AND    o.orderDate <  DateAdd(yy,  0, @current_year)
)
, highest_spenders_this_year AS (
  <your_query>
  WHERE  o.orderDate >= DateAdd(yy,  0, @current_year)
  AND    o.orderDate <  DateAdd(yy,  1, @current_year)
)
SELECT *
FROM   highest_spenders_this_year
WHERE  EXISTS (
         SELECT *
         FROM   highest_spenders_last_year
         WHERE  BillEmail = highest_spenders_this_year.BillEmail
       )
AND    EXISTS (
         SELECT *
         FROM   highest_spenders_2_years_ago
         WHERE  BillEmail = highest_spenders_this_year.BillEmail
       )

answered 2013-08-05T15:20:54.547

SELECT Base2010.BillEmail, 
-- an unmatched LEFT JOIN row yields NULL, not '', so test with IS NULL
CASE WHEN retention2011.BillEmail IS NULL THEN 'Not retained' ELSE 'Retained' END AS retained2011, 
CASE WHEN retention2012.BillEmail IS NULL THEN 'Not retained' ELSE 'Retained' END AS retained2012
FROM 
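    (select top (20) percent o.BillEmail, SUM(o.total) as TotalSpent,
        count(o.OrderID) as TotalOrders  -- BillEmail is selected once; a duplicate column name is an error in a derived table
    from dbo.tblOrder o with (nolock)
    where o.DomainProjectID=13
        and o.BillEmail not like ''
        and o.OrderDate >= '2010-01-01'
        and o.OrderDate < '2011-01-01'
    group by o.BillEmail
    order by TotalSpent desc  -- TOP needs ORDER BY to pick the highest spenders
    ) AS Base2010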

LEFT JOIN
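    (select top (20) percent o.BillEmail, SUM(o.total) as TotalSpent,
        count(o.OrderID) as TotalOrders  -- BillEmail is selected once; a duplicate column name is an error in a derived table
    from dbo.tblOrder o with (nolock)
    where o.DomainProjectID=13
        and o.BillEmail not like ''
        and o.OrderDate >= '2011-01-01'
        and o.OrderDate < '2012-01-01'
    group by o.BillEmail
    order by TotalSpent desc  -- TOP needs ORDER BY to pick the highest spenders
    ) AS retention2011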
ON Base2010.BillEmail = retention2011.BillEmail

LEFT JOIN
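    (select top (20) percent o.BillEmail, SUM(o.total) as TotalSpent,
        count(o.OrderID) as TotalOrders  -- BillEmail is selected once; a duplicate column name is an error in a derived table
    from dbo.tblOrder o with (nolock)
    where o.DomainProjectID=13
        and o.BillEmail not like ''
        and o.OrderDate >= '2012-01-01'
        and o.OrderDate < '2013-01-01'
    group by o.BillEmail
    order by TotalSpent desc  -- TOP needs ORDER BY to pick the highest spenders
    ) AS retention2012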
ON Base2010.BillEmail = retention2012.BillEmail

answered 2013-08-05T15:23:35.770