4

Stack Overflow to the rescue! I need to find the medians of five columns at once, in a single query call.

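The median calculations below each work for a single column, but when I combine them, the repeated use of @rownum breaks the query. How can I update this to work for multiple columns? Thank you. It's for a web tool where nonprofits can compare their financial metrics against a user-defined peer group.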

SELECT t1_wages.totalwages_pctoftotexp AS median_totalwages_pctoftotexp
FROM (

SELECT @rownum := @rownum +1 AS  `row_number` , d_wages.totalwages_pctoftotexp
FROM data_990_c3 d_wages, (

SELECT @rownum :=0
)r_wages
WHERE totalwages_pctoftotexp >0
ORDER BY d_wages.totalwages_pctoftotexp
) AS t1_wages, (

SELECT COUNT( * ) AS total_rows
FROM data_990_c3 d_wages
WHERE totalwages_pctoftotexp >0
) AS t2_wages
WHERE 1 
AND t1_wages.row_number = FLOOR( total_rows /2 ) +1

--- [that was one median, below is another] ---

SELECT t1_solvent.solvent_days AS median_solvent_days
FROM (

SELECT @rownum := @rownum +1 AS  `row_number` , d_solvent.solvent_days
FROM data_990_c3 d_solvent, (

SELECT @rownum :=0
)r_solvent
WHERE solvent_days >0
ORDER BY d_solvent.solvent_days
) AS t1_solvent, (

SELECT COUNT( * ) AS total_rows
FROM data_990_c3 d_solvent
WHERE solvent_days >0
) AS t2_solvent
WHERE 1 
AND t1_solvent.row_number = FLOOR( total_rows /2 ) +1

[That was two. In total there are five medians that I ultimately need to find at once.]

2 Answers

2

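This kind of thing is a big pain in the neck in MySQL. If you do a lot of statistical ranking work, it may be wise to use the free Oracle Express Edition or PostgreSQL. Both have a MEDIAN(value) aggregate function, either built in or available as an extension. Here's a little sqlfiddle demonstrating that. http://sqlfiddle.com/#!4/53de8/6/0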

But you didn't ask that.

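In MySQL, your basic problem is the scope of variables like @rownum. You also have a pivoting problem: that is, you need to turn rows of your query into columns.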

Let's tackle the pivot problem first. What you do is build a union of several big fat queries. For example:

SELECT 'median_wages' AS tag, wages AS value
  FROM (big fat query making median wages) A
 UNION
SELECT 'median_volunteer_hours' AS tag, hours AS value
  FROM (big fat query making median volunteer hours) B
 UNION
SELECT 'median_solvent_days' AS tag, days AS value
  FROM (big fat query making median solvency days) C

So, that's your result as a table of tag/value pairs. You can pivot that table like this, to get one row with a value in each column.

SELECT SUM( CASE tag WHEN 'median_wages' THEN value ELSE 0 END 
          ) AS median_wages, 
       SUM( CASE tag WHEN 'median_volunteer_hours' THEN value ELSE 0 END
          ) AS median_volunteer_hours, 
       SUM( CASE tag WHEN 'median_solvent_days' THEN value ELSE 0 END 
          ) AS median_solvent_days
FROM (
    /* the above gigantic UNION query */
 ) Q

That's how you pivot rows (in this case, from the UNION query) up into columns. Here's a tutorial on the topic. http://www.artfulsoftware.com/infotree/qrytip.php?id=523
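To make the pivot step concrete, here is a minimal sketch that runs the same SUM( CASE ... ) pattern against an in-memory SQLite database through Python's sqlite3 module. SQLite stands in for MySQL only because it needs no server. The table name q and the three values are made-up placeholders for the output of the UNION query, not real medians.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tag/value rows standing in for the result of the UNION query.
conn.execute("CREATE TABLE q (tag TEXT, value REAL)")
conn.executemany(
    "INSERT INTO q (tag, value) VALUES (?, ?)",
    [
        ("median_wages", 0.42),
        ("median_volunteer_hours", 120.0),
        ("median_solvent_days", 95.0),
    ],
)

# One SUM( CASE ... ) per output column; each keeps only the row whose tag
# matches and adds 0 for every other row.
pivot_sql = """
SELECT SUM(CASE tag WHEN 'median_wages' THEN value ELSE 0 END)
           AS median_wages,
       SUM(CASE tag WHEN 'median_volunteer_hours' THEN value ELSE 0 END)
           AS median_volunteer_hours,
       SUM(CASE tag WHEN 'median_solvent_days' THEN value ELSE 0 END)
           AS median_solvent_days
FROM q
"""

pivoted_row = conn.execute(pivot_sql).fetchone()
print(pivoted_row)
```

Because each tag appears exactly once, the SUM is only a way to collapse the three rows into one; it does not actually add medians together.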

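Now we need to deal with the median-computing subqueries. The code in your question looks pretty good. I don't have your data, so it's hard for me to evaluate it.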
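For readers who want to check the row-selection rule in the question's subqueries, row_number = FLOOR( total_rows /2 ) +1, here is a small Python sketch of the same logic. The function name median_by_row_number is hypothetical, and the sketch only mirrors the arithmetic; it does not touch MySQL.

```python
def median_by_row_number(values):
    """Mimic the question's subquery: keep values > 0, sort them, number the
    rows from 1, and return the row numbered FLOOR(total_rows / 2) + 1."""
    ordered = sorted(v for v in values if v > 0)  # WHERE col > 0 ORDER BY col
    total_rows = len(ordered)                     # SELECT COUNT(*) ...
    if total_rows == 0:
        return None                               # the query would return no row
    target_row = total_rows // 2 + 1              # FLOOR(total_rows / 2) + 1
    return ordered[target_row - 1]                # row numbers are 1-based


odd_case = median_by_row_number([5, 1, 3])      # 3 rows, picks row 2
even_case = median_by_row_number([4, 1, 3, 2])  # 4 rows, picks row 3
```

With an odd number of rows this picks the true middle value. With an even number it picks the upper of the two middle values rather than averaging them, which may or may not matter for the comparison tool.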

But you need to avoid reusing the @rownum variable. Call it @rownum1 in one of your queries, @rownum2 in the next one, and so on. Here's a tiny sql fiddle doing just one of these. http://sqlfiddle.com/#!2/2f770/1/0

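Now let's build it up a little and do two different medians. Here's the fiddle, http://sqlfiddle.com/#!2/2f770/2/0, and here's the UNION query. Notice that the second half of the union query uses @rownum2 instead of @rownum.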

Finally, here's the full query with the pivoting. http://sqlfiddle.com/#!2/2f770/13/0

 SELECT SUM( CASE tag WHEN 'Boston' THEN value ELSE 0 END ) AS Boston,
           SUM( CASE tag WHEN 'Bronx' THEN value ELSE 0 END ) AS Bronx   
   FROM (
 SELECT 'Boston' AS tag, pop AS VALUE
  FROM (
        SELECT @rownum := @rownum +1 AS  `row_number` , pop
          FROM pops, 
        (SELECT @rownum :=0)r
          WHERE pop >0 AND city = 'Boston'
          ORDER BY pop
        ) AS ordered_rows, 
        ( 
         SELECT COUNT( * ) AS total_rows
           FROM pops
          WHERE pop >0 AND city = 'Boston'
        ) AS rowcount
  WHERE ordered_rows.row_number = FLOOR( total_rows /2 ) +1
  UNION ALL
 SELECT 'Bronx' AS tag, pop AS VALUE
  FROM (
        SELECT @rownum2 := @rownum2 +1 AS  `row_number` , pop
          FROM pops, 
        (SELECT @rownum2 :=0)r
          WHERE pop >0 AND city = 'Bronx'
          ORDER BY pop
        ) AS ordered_rows, 
        ( 
         SELECT COUNT( * ) AS total_rows
           FROM pops
          WHERE pop >0 AND city = 'Bronx'
        ) AS rowcount
  WHERE ordered_rows.row_number = FLOOR( total_rows /2 ) +1
) D

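That's just two medians. You need five. I think this makes it easy to see that this kind of median computation is very hard to do in a single query in MySQL.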

answered 2013-07-06T19:10:57.430
0

Suppose you have a table with three columns, e.g. table(key, value1, value2).

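This query (PostgreSQL syntax: it relies on array_agg and the :: cast) gives you the median of each of the two value columns for every key: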

SELECT key,
 ((array_agg(value1 order by value1 asc) )[floor( (count(*)+1)::float/2)] + (array_agg(value1 order by value1 asc) )[ceiling( (count(*)+1)::float/2) ] )/2,
 ((array_agg(value2 order by value2 asc) )[floor( (count(*)+1)::float/2)] + (array_agg(value2 order by value2 asc) )[ceiling( (count(*)+1)::float/2) ] )/2    
FROM table 
GROUP BY key
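
The index arithmetic in that query can be checked with a short Python sketch. It sorts the values, takes the elements at the 1-based positions floor((n+1)/2) and ceiling((n+1)/2), and averages them. The function name median_floor_ceil is hypothetical, and the sketch covers only one group of values, not the GROUP BY.

```python
import math


def median_floor_ceil(values):
    """Average the elements at 1-based positions floor((n+1)/2) and
    ceil((n+1)/2) of the sorted values, as the array_agg query does."""
    ordered = sorted(values)               # array_agg(value ORDER BY value ASC)
    n = len(ordered)                       # count(*)
    if n == 0:
        return None
    low_pos = math.floor((n + 1) / 2)      # floor((count(*) + 1)::float / 2)
    high_pos = math.ceil((n + 1) / 2)      # ceiling((count(*) + 1)::float / 2)
    # PostgreSQL arrays are 1-based by default; Python lists are 0-based.
    return (ordered[low_pos - 1] + ordered[high_pos - 1]) / 2


odd_median = median_floor_ceil([1, 3, 5])       # both positions are 2
even_median = median_floor_ceil([1, 2, 3, 4])   # positions 2 and 3
```

For an odd count both positions coincide, so the middle value is returned unchanged; for an even count the two middle values are averaged. One difference to keep in mind: Python's / always yields a float, whereas in PostgreSQL dividing an integer sum by 2 uses integer division, so integer columns may need a cast to get 2.5 rather than 2.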
answered 2016-04-28T00:30:26.623