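Context:
I'm trying to take a set of market transactions and determine how much value actually moved for each item type. This is pretty much my first attempt at MySQL, so the query is ugly, but the following almost works: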
SELECT types.typename,
       averages.type,
       averages.price,
       movement.sold,
       ( averages.price * movement.sold ) AS value
FROM (SELECT type,
             Round(Avg(price)) AS price
      FROM orders
      GROUP BY type) AS averages
     INNER JOIN (SELECT type,
                        ( startingvolume - currentvolume ) AS sold
                 FROM (SELECT type,
                              Sum(volume) AS currentVolume,
                              Sum(volumeentered) startingVolume
                       FROM orders
                       GROUP BY type) AS movement
                 WHERE ( startingvolume - currentvolume ) > 10000
                 ORDER BY sold) AS movement
             ON averages.type = movement.type
     INNER JOIN invtypes AS types
             ON types.typeid = averages.type
ORDER BY value DESC
LIMIT 10;
+------------------------------------+-------+---------+------------+------------------+
| typeName                           | type  | price   | sold       | value            |
+------------------------------------+-------+---------+------------+------------------+
| Dirt                               |    34 | 1904767 | 2670581874 | 5086836224393358 |
| Light Wood                         |  2629 |   42999 |    2756595 |     118530828405 |
| Dark Wood                          | 24509 |   47344 |    1107771 |      52446310224 |
| Stone                              | 21922 |   18386 |    1505884 |      27687183224 |
| Grass                              |   238 |    5643 |    4554470 |      25700874210 |
| Paper                              |  3814 |   25635 |     861006 |      22071888810 |
| Iron                               |  3699 |  320270 |      58833 |      18842444910 |
| Ink                                | 16275 |    8552 |    2200545 |      18819060840 |
| Loam                               |  2679 |    5759 |    2608771 |      15023912189 |
| Copper                             |   672 |  904612 |      14989 |      13559229268 |
+------------------------------------+-------+---------+------------+------------------+
The problem with the data above is that the raw market data is inevitably corrupted by outliers, like this one:
select type, price from orders where type = 34 order by price desc limit 10;
+------+-----------+
| type | price     |
+------+-----------+
|   34 | 200000000 |
|   34 |     15.99 |
|   34 |     12.06 |
|   34 |        10 |
|   34 |      7.67 |
|   34 |       7.5 |
|   34 |       7.3 |
|   34 |      7.17 |
|   34 |       7.1 |
|   34 |      7.06 |
+------+-----------+
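Core question:
99% of the market data is clean, but the outliers wreck the averages, and MySQL doesn't appear to have a median function. I've found several examples of how to find the median of an entire column, but I need the median per item type.
How can I determine the median per item type instead of the average per item type, or otherwise efficiently clean the outliers out of the data before running the main query?
Note: I tried excluding results by standard deviation (std), but item prices range from $17 to $10B, and the deviation remains relatively low regardless of the price range.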
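To make the goal concrete, here is a minimal sketch of a per-type median. It is one possible approach under stated assumptions, not a definitive answer. It assumes window functions are available (MySQL 8.0 or later; older versions would need a different technique). So that it runs self-contained, it uses Python's built-in `sqlite3` module with an in-memory stand-in for the `orders` table. The ten type-34 prices are the ones listed above; the three type-238 prices are invented for illustration, and `median_by_type`, `SAMPLE_ROWS`, and `MEDIAN_SQL` are hypothetical names.

```python
import sqlite3

# Sample rows: the ten type-34 prices come from the listing above; the three
# type-238 prices are invented purely so the example has an odd-sized group.
SAMPLE_ROWS = [
    (34, 200000000), (34, 15.99), (34, 12.06), (34, 10), (34, 7.67),
    (34, 7.5), (34, 7.3), (34, 7.17), (34, 7.1), (34, 7.06),
    (238, 5600), (238, 5643), (238, 5700),
]

# Rank each price inside its type, then keep only the middle row (odd count)
# or the two middle rows (even count) and average them.
# Requires window functions: SQLite 3.25+ here, MySQL 8.0+ for a real table.
MEDIAN_SQL = """
SELECT type, AVG(price) AS median_price
FROM (
    SELECT type,
           price,
           ROW_NUMBER() OVER (PARTITION BY type ORDER BY price) AS rn,
           COUNT(*)     OVER (PARTITION BY type)                AS cnt
    FROM orders
) AS ranked
WHERE 2 * rn BETWEEN cnt AND cnt + 2
GROUP BY type
"""


def median_by_type(rows):
    """Load (type, price) rows into an in-memory table; return {type: median}."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE orders (type INTEGER, price REAL)")
        conn.executemany("INSERT INTO orders (type, price) VALUES (?, ?)", rows)
        return dict(conn.execute(MEDIAN_SQL).fetchall())
    finally:
        conn.close()


if __name__ == "__main__":
    for item_type, median in sorted(median_by_type(SAMPLE_ROWS).items()):
        print(item_type, median)
```

On this sample the single 200000000 row pushes the type-34 mean above 20 million, while the median stays near 7.6. In principle, the inner SELECT could take the place of the `averages` derived table in the main query, though that has not been tested against the real data.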