I have already seen this post (http://stackoverflow.com/questions/1398113/sql-select-one-row-randomly-but-taking-into-account-a-weight), but I could not get it to work. Where do I put my "stuff" table? And why don't they use NEWID() instead of RAND()?

Table: stuff

id     item       weight       location
1      ball       1            Wyoming
2      cup        2            Alaska
3      sock       1            Idaho
4      car        3            Miami
5      hot girl   5            Brazil

According to the post referenced above, I should do this:

SELECT      TOP 1 t.*
FROM        @Table t
INNER JOIN (SELECT t.id, sum(tt.weight) AS cum_weight
            FROM        @Table t
            INNER JOIN  @Table tt ON  tt.id <= t.id
            GROUP BY    t.id) tc
        ON  tc.id = t.id,
           (SELECT  SUM(weight) AS total_weight FROM @Table) tt,
           (SELECT  RAND() AS rnd) r
WHERE       r.rnd * tt.total_weight <= tc.cum_weight
ORDER BY    t.id ASC

I want to do the same thing, but written this way:

SELECT TOP (1) from stuff WHERE blahblahblah AND (location='Brazil' OR location='Wyoming' OR location='Brazil') AND (weight <= cum_weight) ORDER BY NEWID()

I am only guessing that I can use NEWID() rather than being forced to use RAND().

1 Answer

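You can do this by sampling from the cumulative sum rather than from the records. The idea is to take the cumulative sum of the weights, draw a random value between 0 and the total weight, and then find the record whose cumulative sum brackets that random value. The SQL looks like: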

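select top 1 t.*
from (select t.*,
             cumulative_sum(weight) as cumweight,  -- placeholder, not a built-in function: stands for a running total of weight
             sum(weight) over (partition by NULL) as totalweight
      from t
     ) t
where rand()*(totalweight+1) < cumweight
order by cumweight  -- ascending: keep the first row whose running total exceeds the random value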

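This builds a cumulative weight for each record and then draws a random value between 0 and the sum of the weights. The record selected is the first one whose cumulative weight exceeds that random value, so each record is chosen with probability proportional to its weight. The "+1" is only meant to ensure that any record can be selected, even the last one. Note that RAND() returns a value strictly below 1, so rand()*totalweight is already below the last cumulative weight; with the "+1" the query will occasionally return no row.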
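To make the cumulative weights concrete, here is a minimal sketch that loads the sample rows from the question into a table variable and computes the running total with a self-join. The variable name `@stuff` and the aliases `s` and `p` are illustrative assumptions, not part of the original post.

```sql
-- Sample data from the question, loaded into a table variable (name is illustrative).
DECLARE @stuff TABLE (id int, item varchar(20), weight int, location varchar(20));

INSERT INTO @stuff (id, item, weight, location) VALUES
    (1, 'ball',     1, 'Wyoming'),
    (2, 'cup',      2, 'Alaska'),
    (3, 'sock',     1, 'Idaho'),
    (4, 'car',      3, 'Miami'),
    (5, 'hot girl', 5, 'Brazil');

-- Running total per row: sum the weights of all rows with an id up to this one.
SELECT s.id, s.item, s.weight, SUM(p.weight) AS cumweight
FROM @stuff s
INNER JOIN @stuff p ON p.id <= s.id
GROUP BY s.id, s.item, s.weight
ORDER BY s.id;
-- cumweight by id: 1, 3, 4, 7, 12 (total weight 12)
```

With these values, a random draw r in [0, 12) selects ball when r < 1, cup when 1 <= r < 3, sock when 3 <= r < 4, car when 4 <= r < 7, and the last row when 7 <= r < 12, which matches the weights 1, 2, 1, 3, 5.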

In SQL Server 2012, you can compute the cumulative sum with SUM() OVER (PARTITION BY NULL ORDER BY ...).

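In SQL Server 2012, you can use:

select top 1 t.*
from (select t.*,
             -- order by a unique column (id) so rows with equal weight get distinct running totals
             sum(weight) over (partition by NULL order by id) as cumweight,
             sum(weight) over (partition by NULL) as totalweight
      from t
     ) t
where rand()*(totalweight+1) < cumweight
order by cumweight  -- ascending: keep the first row whose running total exceeds the random value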

Unfortunately, SQL Server 2008 does not support this syntax. On that version you need a self-join, which is what the query you took from the original post does.
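As a rough sketch of where the "stuff" table and the location filter could go on SQL Server 2008, the query below filters first and then applies the self-join to the filtered rows only, so that both the running total and the total weight cover just the candidates. It assumes a table `stuff(id, item, weight, location)` as in the question; the names `candidates`, `running`, `rt`, `tot`, and `@r` are made up for illustration. On NEWID(): RAND() without a seed gives one value for the whole query, which is what this method needs, whereas ORDER BY NEWID() on its own shuffles the rows uniformly and ignores the weights.

```sql
-- Assumes a table stuff(id, item, weight, location) as in the question.
DECLARE @r float = RAND();  -- a single draw in [0, 1) for the whole query

WITH candidates AS (
    -- Apply the filter first so the weights below cover only these rows.
    SELECT id, item, weight, location
    FROM stuff
    WHERE location IN ('Brazil', 'Wyoming')
),
running AS (
    -- Running total via self-join (no windowed ORDER BY needed on 2008).
    SELECT c.id, SUM(p.weight) AS cumweight
    FROM candidates c
    INNER JOIN candidates p ON p.id <= c.id
    GROUP BY c.id
)
SELECT TOP (1) c.*
FROM candidates c
INNER JOIN running rt ON rt.id = c.id
CROSS JOIN (SELECT SUM(weight) AS totalweight FROM candidates) tot
-- No "+1" here: @r < 1, so @r * totalweight is always below the last cumweight.
WHERE @r * tot.totalweight < rt.cumweight
ORDER BY rt.cumweight;  -- first row whose running total exceeds the draw
```

With the sample data, the candidates are ball (weight 1) and the Brazil row (weight 5), so the running totals are 1 and 6, and the two rows should come back with probability 1/6 and 5/6 respectively.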

Answered 2012-05-20T03:50:03.810