sql - Counting item frequency in SQL using multiple/dependent columns?

Question

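I'm completely new to SQL. I have read StackOverflow posts and other sources to try to solve this, but I can't work out how to do it in SQL. Here goes...

I have a table with 3 columns and thousands of rows, with data in the first 2 columns. The third column is currently empty, and I need to populate it based on the data already in the first and second columns.

Say I have states in the first column and fruit entries in the second column. I need to write an SQL statement that counts the number of distinct states each fruit comes from, then inserts that popularity number into the third column of every row. A popularity number of 1 in a row means the fruit comes from only one state; a popularity number of 4 means the fruit comes from 4 states. So my table currently looks like this: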

state     fruit     popularity

hawaii    apple     
hawaii    apple     
hawaii    banana       
hawaii    kiwi      
hawaii    kiwi      
hawaii    mango        
florida   apple      
florida   apple        
florida   apple        
florida   orange      
michigan  apple     
michigan  apple     
michigan  apricot   
michigan  orange    
michigan  pear      
michigan  pear      
michigan  pear      
texas     apple     
texas     banana    
texas     banana    
texas     banana    
texas     grape

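I need to figure out how to calculate and then update the third column, named popularity, which is the number of states that export that fruit. The goal is to produce (sorry for the pun) the table below. Based on the table above, "apple" appears in all 4 states, orange and banana appear in 2 states, and kiwi, mango, apricot, pear, and grape appear in only 1 state, hence their corresponding popularity numbers.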

state     fruit     popularity

hawaii    apple     4
hawaii    apple     4
hawaii    banana    2   
hawaii    kiwi      1
hawaii    kiwi      1
hawaii    mango     1   
florida   apple     4 
florida   apple     4   
florida   apple     4   
florida   orange    2  
michigan  apple     4
michigan  apple     4
michigan  apricot   1
michigan  orange    2
michigan  pear      1
michigan  pear      1
michigan  pear      1
texas     apple     4
texas     banana    2
texas     banana    2
texas     banana    2
texas     grape     1

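My small programmer brain says to find a way to loop through the data in some kind of script, but after reading up a little on SQL and databases, it seems you don't write long, slow looping scripts in SQL. I'm not even sure you can. Instead, there seem to be better and faster ways to do this in SQL.

Does anyone know how to calculate and update the third column of every row, here called popularity, in an SQL statement, so that it holds the number of states each fruit comes from? Thanks for reading; any help is much appreciated.

So far I have tried the SQL statements below, which produce output but don't quite give me what I need: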

--outputs those fruits appearing multiple times in the table
SELECT fruit, COUNT(*)
  FROM table 
 GROUP BY fruit
HAVING COUNT(*) > 1
 ORDER BY COUNT(*) DESC

--outputs those fruits appearing only once in the table
SELECT fruit, COUNT(*)
  FROM table 
 GROUP BY fruit
HAVING COUNT(*) = 1

--outputs the number of distinct fruits in the table
SELECT COUNT(DISTINCT fruit)
  FROM table

score 4 · Accepted Answer

If you simply want to update your table with the popularity, it would look like this:

update my_table x
   set popularity = ( select count(distinct state) 
                        from my_table
                       where fruit = x.fruit )
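
To see the correlated-subquery update at work on the question's data, here is a minimal sketch that runs it from Python against an in-memory SQLite database. The choice of SQLite and Python is an assumption made for illustration (the answer's own syntax looks Oracle-flavoured), and the helper name `popularity_after_update` is hypothetical. The update target is left unaliased and the inner copy of the table is aliased as `s` instead, which should behave the same way.

```python
import sqlite3

# (state, fruit) rows from the question, duplicates included.
ROWS = (
    [("hawaii", "apple")] * 2 + [("hawaii", "banana")]
    + [("hawaii", "kiwi")] * 2 + [("hawaii", "mango")]
    + [("florida", "apple")] * 3 + [("florida", "orange")]
    + [("michigan", "apple")] * 2 + [("michigan", "apricot")]
    + [("michigan", "orange")] + [("michigan", "pear")] * 3
    + [("texas", "apple")] + [("texas", "banana")] * 3
    + [("texas", "grape")]
)


def popularity_after_update():
    """Load the sample rows, run the correlated UPDATE, return {fruit: popularity}."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE my_table (state TEXT, fruit TEXT, popularity INTEGER)"
    )
    conn.executemany("INSERT INTO my_table (state, fruit) VALUES (?, ?)", ROWS)

    # For every row, count the distinct states that share the row's fruit.
    # The inner copy of the table is aliased as s (an assumption of this
    # sketch), so my_table.fruit refers to the row being updated.
    conn.execute(
        """
        UPDATE my_table
           SET popularity = (SELECT COUNT(DISTINCT s.state)
                               FROM my_table AS s
                              WHERE s.fruit = my_table.fruit)
        """
    )

    # Each fruit now carries a single popularity value on all of its rows.
    result = dict(conn.execute("SELECT DISTINCT fruit, popularity FROM my_table"))
    conn.close()
    return result


popularity = popularity_after_update()
print(popularity)
```

The subquery is evaluated per row, so on a large table an index on `fruit` would probably help; that is a general observation rather than something measured here.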

If you want to select the data instead, you can use an analytic query:

select state, fruit
     , count(distinct state) over ( partition by fruit ) as popularity
  from my_table

This gives the number of distinct states for each fruit.
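
Not every engine accepts `DISTINCT` inside a window aggregate; SQLite, for one, rejects it as far as I know. Where that is the case, one commonly used workaround is to add an ascending and a descending `DENSE_RANK()` over the same partition and subtract 1. The sketch below tries this on the question's data using Python and an in-memory SQLite database. Both are assumptions for illustration, and it needs a SQLite build with window functions (3.25 or later). This is a swapped-in technique, not the answer's own query.

```python
import sqlite3

# (state, fruit) rows from the question, duplicates included.
ROWS = (
    [("hawaii", "apple")] * 2 + [("hawaii", "banana")]
    + [("hawaii", "kiwi")] * 2 + [("hawaii", "mango")]
    + [("florida", "apple")] * 3 + [("florida", "orange")]
    + [("michigan", "apple")] * 2 + [("michigan", "apricot")]
    + [("michigan", "orange")] + [("michigan", "pear")] * 3
    + [("texas", "apple")] + [("texas", "banana")] * 3
    + [("texas", "grape")]
)

# Within one fruit, a state's ascending dense rank plus its descending
# dense rank always equals (number of distinct states + 1), so subtracting
# 1 yields the distinct count on every row of the partition.
QUERY = """
SELECT state,
       fruit,
       DENSE_RANK() OVER (PARTITION BY fruit ORDER BY state)
     + DENSE_RANK() OVER (PARTITION BY fruit ORDER BY state DESC)
     - 1 AS popularity
  FROM my_table
 ORDER BY fruit, state
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (state TEXT, fruit TEXT)")
conn.executemany("INSERT INTO my_table (state, fruit) VALUES (?, ?)", ROWS)
rows = conn.execute(QUERY).fetchall()
conn.close()

for state, fruit, pop in rows:
    print(f"{state:<10}{fruit:<10}{pop}")
```

One caveat: unlike `COUNT(DISTINCT state)`, the rank-based version counts a NULL state as a value of its own, so it only matches when `state` is never NULL.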

score 1 · Accepted Answer

I ran this and got what (I think) you want:

WITH t
  AS (SELECT 'hawaii' as STATE, 'apple' as fruit FROM dual
      UNION ALL
      SELECT 'hawaii' as STATE, 'apple' as fruit FROM dual
      UNION ALL
      SELECT 'hawaii' as STATE, 'banana' as fruit FROM dual
      UNION ALL
      SELECT 'hawaii' as STATE, 'kiwi' as fruit FROM dual
      UNION ALL
      SELECT 'hawaii' as STATE, 'kiwi' as fruit FROM dual
      UNION ALL
      SELECT 'hawaii' as STATE, 'mango' as fruit FROM dual
      UNION ALL
      SELECT 'florida' as STATE, 'apple' as fruit FROM dual
      UNION ALL
      SELECT 'florida' as STATE, 'apple' as fruit FROM dual
      UNION ALL
      SELECT 'florida' as STATE, 'apple' as fruit FROM dual
      UNION ALL
      SELECT 'florida' as STATE, 'orange' as fruit FROM dual
      UNION ALL
      SELECT 'michigan' as STATE, 'apple' as fruit FROM dual
      UNION ALL
      SELECT 'michigan' as STATE, 'apple' as fruit FROM dual
      UNION ALL
      SELECT 'michigan' as STATE, 'apricot' as fruit FROM dual
      UNION ALL
      SELECT 'michigan' as STATE, 'orange' as fruit FROM dual
      UNION ALL
      SELECT 'michigan' as STATE, 'pear' as fruit FROM dual
      UNION ALL
      SELECT 'michigan' as STATE, 'pear' as fruit FROM dual
      UNION ALL
      SELECT 'michigan' as STATE, 'pear' as fruit FROM dual
      UNION ALL
      SELECT 'texas' as STATE, 'apple' as fruit FROM dual
      UNION ALL
      SELECT 'texas' as STATE, 'banana' as fruit FROM dual
      UNION ALL
      SELECT 'texas' as STATE, 'banana' as fruit FROM dual
      UNION ALL
      SELECT 'texas' as STATE, 'banana' as fruit FROM dual
      UNION ALL
      SELECT 'texas' as STATE, 'grape' as fruit FROM dual)
SELECT state,
       fruit,
       count(DISTINCT state) OVER (PARTITION BY fruit) AS popularity
  FROM t;

Returns:

florida     apple   4
florida     apple   4
florida     apple   4
hawaii      apple   4
hawaii      apple   4
michigan    apple   4
michigan    apple   4
texas       apple   4
michigan    apricot 1
hawaii      banana  2
texas       banana  2
texas       banana  2
texas       banana  2
texas       grape   1
hawaii      kiwi    1
hawaii      kiwi    1
hawaii      mango   1
florida     orange  2
michigan    orange  2
michigan    pear    1
michigan    pear    1

Obviously, you would only need to run:

SELECT state,
       fruit,
       count(DISTINCT state) OVER (PARTITION BY fruit) AS popularity
  FROM table_name;

Hope this helps...

score 0 · Accepted Answer

Another option:

SELECT fruit
,      COUNT(*)
FROM
(
SELECT state
,      fruit
,      ROW_NUMBER() OVER (PARTITION BY state, fruit ORDER BY NULL) rn
FROM   t
)
WHERE rn = 1
GROUP BY fruit
ORDER BY fruit;
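
The query above keeps one row per (state, fruit) pair and then counts the surviving rows per fruit. As a rough cross-check, the same idea can be written with `SELECT DISTINCT` in the derived table. The sketch below runs that variant from Python on an in-memory SQLite database; both are assumptions for illustration, with the table named `t` as in the answer.

```python
import sqlite3

# (state, fruit) rows from the question, duplicates included.
ROWS = (
    [("hawaii", "apple")] * 2 + [("hawaii", "banana")]
    + [("hawaii", "kiwi")] * 2 + [("hawaii", "mango")]
    + [("florida", "apple")] * 3 + [("florida", "orange")]
    + [("michigan", "apple")] * 2 + [("michigan", "apricot")]
    + [("michigan", "orange")] + [("michigan", "pear")] * 3
    + [("texas", "apple")] + [("texas", "banana")] * 3
    + [("texas", "grape")]
)

# Step 1 (inner query): collapse duplicates to one row per (state, fruit).
# Step 2 (outer query): count those rows per fruit, which equals the
# number of distinct states the fruit appears in.
QUERY = """
SELECT fruit, COUNT(*) AS popularity
  FROM (SELECT DISTINCT state, fruit FROM t) AS pairs
 GROUP BY fruit
 ORDER BY fruit
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (state TEXT, fruit TEXT)")
conn.executemany("INSERT INTO t (state, fruit) VALUES (?, ?)", ROWS)
counts = conn.execute(QUERY).fetchall()
conn.close()

for fruit, pop in counts:
    print(fruit, pop)
```

Note that this produces one row per fruit rather than one per original row, so it would still need a join back to the table to fill the popularity column.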

score 0 · Accepted Answer

Try this:

create table states([state] varchar(10),fruit varchar(10),popularity int)
INSERT INTO states([state],fruit) 
VALUES('hawaii','apple'),
('hawaii','apple'),     
('hawaii','banana'),       
('hawaii','kiwi'),      
('hawaii','kiwi'),      
('hawaii','mango'),        
('florida','apple'),      
('florida','apple'),        
('florida','apple'),        
('florida','orange'),      
('michigan','apple'),     
('michigan','apple'),     
('michigan','apricot'),   
('michigan','orange'),    
('michigan','pear'),      
('michigan','pear'),      
('michigan','pear'),      
('texas','apple'),     
('texas','banana'),    
('texas','banana'),    
('texas','banana'),
('texas','grape')

update t set t.popularity=a.cnt
from states t inner join
(SELECT fruit,count(distinct [state]) as cnt
  FROM states
  group by fruit) a
on t.fruit =a.fruit

score 0 · Accepted Answer

If your table is #fruit...

Count the distinct states for each fruit:

select fruit, COUNT(distinct state) statecount from #fruit group by fruit

Then update the table with those values:

update #fruit
set popularity
    = statecount
from
 #fruit
    inner join 
      (select fruit, COUNT(distinct state) statecount from #fruit group by fruit) sc
        on #fruit.fruit = sc.fruit

score 0 · Accepted Answer

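This should get you most of the way there. Basically, you want to get the count of distinct states each fruit appears in, then use that to join back to the original table.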

-- "table" is a placeholder for your table name; the update goes through alias t
update t
set popularity = cnt
from 
  (
    select fruit, count(distinct state) as cnt 
    from table
    group by fruit) cnts
  inner join table t
    on cnts.fruit = t.fruit

score 0 · Accepted Answer

Try this:

select a.*,b.total
from [table] as a
left join 
(
SELECT fruit,count(distinct [state]) as total
  FROM [table]
  group by fruit
) as b
on a.fruit = b.fruit

Note that this is SQL Server code; adapt it as needed for your database.
