
Table structure:

 CREATE TABLE MyTable (
   ID INT AUTO_INCREMENT,
   Num1 INT,
   Num2 INT,
   Num3 INT,
   Num4 INT,
   PRIMARY KEY(ID)
 )engine=InnoDB;

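I currently have about 20-30k records. Num1, Num2, Num3 and Num4 are just random numbers. I am trying to select 2-number and 3-number combinations from this table. For example, say I have the following rows in the table: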

ID    Num1   Num2   Num3   Num4
1     20     11     9      150
2     30     11     20     19
3     40     45     11     20

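I want to select the most frequently occurring 2-number combinations, and then the 3-number combinations. Notice that 20 and 11 appear together in 3 rows of the table, so the combination 20,11 (or 11,20; the order does not matter) has a count of 3, and likewise for the other combinations.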

I want to retrieve this data into a PHP array so that I can do some calculations or display it on screen.

What I have tried so far:

 SELECT *
 FROM MyTable
 WHERE (Num1 = :num1 AND Num2 = :num2) OR (Num1 = :num1 AND Num3 = :num2) OR 
       (Num1 = :num1 AND Num4 = :num2) OR (Num2 = :num1 AND Num1 = :num2) OR 
       (Num2 = :num1 AND Num3 = :num2) OR (Num2 = :num1 AND Num4 = :num2) OR 
       ***
       ***

And so on, for all the combinations. This gets really tedious if I try to use it for 3-number combinations.

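  1. Is there a better, more efficient way to do this?
  2. Do I need to restructure the table to make this easier?
  3. Would the restructured table be normalized? (I think the current one is normalized; please tell me if it is not.)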

2 Answers


Case: 2-number combinations

I think you should consider storing the information in a big matrix like this:

num  times_appearing_with_number_1 times_appearing_with_number_2 ...

For rows like these

 1 8 2 3
 1 7 23 24

it would look like:

 num 1 2 3 4 5 6 ...
 1   - 1 1 0 0 0 ...
 2   1 - 1 0 0 0 ...

Then you check which cells hold the largest counts. The row and column indexes tell you which pair of numbers a count belongs to.

Case: 3-number combinations

The same goes for a 3D matrix.

To populate these tables, you just fetch the information from MySQL and then loop over it.
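As a rough sketch of the same counting idea, the matrix can also be kept in a sparse, long form (one row per pair that actually occurs) and filled by MySQL itself rather than by an application loop. This swaps the dense matrix and the loop for a self-join, so treat it as an alternative, not as the approach described above. The helper table `MyTableNums` and the column aliases are hypothetical names introduced only for this example.

```sql
-- Hypothetical helper table (not part of the original schema):
-- one row per (row ID, number), so each MyTable row yields up to four rows.
CREATE TABLE MyTableNums (
  ID  INT NOT NULL,
  Num INT NOT NULL,
  PRIMARY KEY (ID, Num)
) ENGINE=InnoDB;

-- UNION removes duplicates, so a number repeated within one row is stored once.
INSERT INTO MyTableNums (ID, Num)
SELECT ID, Num1 FROM MyTable WHERE Num1 IS NOT NULL
UNION
SELECT ID, Num2 FROM MyTable WHERE Num2 IS NOT NULL
UNION
SELECT ID, Num3 FROM MyTable WHERE Num3 IS NOT NULL
UNION
SELECT ID, Num4 FROM MyTable WHERE Num4 IS NOT NULL;

-- 2-number combinations: join each row's numbers to each other.
-- a.Num < b.Num keeps one ordering per pair, so 20,11 and 11,20 count as the same pair.
SELECT a.Num AS NumA, b.Num AS NumB, COUNT(*) AS times_together
FROM MyTableNums AS a
JOIN MyTableNums AS b ON b.ID = a.ID AND a.Num < b.Num
GROUP BY a.Num, b.Num
ORDER BY times_together DESC
LIMIT 10;

-- 3-number combinations: the sparse counterpart of the 3D matrix.
SELECT a.Num AS NumA, b.Num AS NumB, c.Num AS NumC, COUNT(*) AS times_together
FROM MyTableNums AS a
JOIN MyTableNums AS b ON b.ID = a.ID AND a.Num < b.Num
JOIN MyTableNums AS c ON c.ID = b.ID AND b.Num < c.Num
GROUP BY a.Num, b.Num, c.Num
ORDER BY times_together DESC
LIMIT 10;
```

With the three sample rows from the question, the first query puts the pair 11, 20 at the top with a count of 3. The `LIMIT 10` is arbitrary; each result row maps directly onto one entry of a PHP array.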

answered 2013-03-30T22:28:54.243

Since the order of the values doesn't matter, there are only six combinations when picking two out of four columns (c1-c2, c1-c3, c1-c4, c2-c3, c2-c4 and c3-c4), and only four combinations when picking three (c1-c2-c3, c1-c2-c4, c1-c3-c4, c2-c3-c4).

One approach would be to create a temporary table that contains the id of each row together with all six (or four, for three columns) combinations of its values, each written in a fixed order. You could use a query like this:

 SELECT id, CASE WHEN Num1 <= Num2 THEN CONCAT(Num1, '-', Num2) ELSE CONCAT(Num2, '-', Num1) END AS pair FROM MyTable
 UNION
 SELECT id, CASE WHEN Num1 <= Num3 THEN CONCAT(Num1, '-', Num3) ELSE CONCAT(Num3, '-', Num1) END FROM MyTable
 ...

All that's left then is counting the number of matching rows. Note that the above query could either be run separately to fill the temporary table or be used as a subquery of the counting query.
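A minimal sketch of the subquery variant for 2-number combinations, with all six column pairs written out, might look like the following. It uses `LEAST` and `GREATEST` in place of the `CASE` expression, which gives the same key for non-NULL values; the aliases `pair`, `pairs` and `times_seen` are hypothetical names chosen for this example.

```sql
-- Count how many rows contain each unordered pair of numbers.
-- UNION (not UNION ALL) drops duplicate (ID, pair) rows, so a pair is
-- counted at most once per row even if a number repeats within that row.
SELECT pair, COUNT(*) AS times_seen
FROM (
  SELECT ID, CONCAT(LEAST(Num1, Num2), '-', GREATEST(Num1, Num2)) AS pair FROM MyTable
  UNION
  SELECT ID, CONCAT(LEAST(Num1, Num3), '-', GREATEST(Num1, Num3)) FROM MyTable
  UNION
  SELECT ID, CONCAT(LEAST(Num1, Num4), '-', GREATEST(Num1, Num4)) FROM MyTable
  UNION
  SELECT ID, CONCAT(LEAST(Num2, Num3), '-', GREATEST(Num2, Num3)) FROM MyTable
  UNION
  SELECT ID, CONCAT(LEAST(Num2, Num4), '-', GREATEST(Num2, Num4)) FROM MyTable
  UNION
  SELECT ID, CONCAT(LEAST(Num3, Num4), '-', GREATEST(Num3, Num4)) FROM MyTable
) AS pairs
WHERE pair IS NOT NULL        -- a NULL column makes the whole key NULL
GROUP BY pair
ORDER BY times_seen DESC
LIMIT 10;
```

With the three sample rows from the question, the top result is `11-20` with a count of 3. The `LIMIT 10` is arbitrary.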

Edit: Something to fiddle with.
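The same idea should carry over to 3-number combinations, using the four column triples listed above. The sketch below builds a sorted key from three values: the smallest, then the middle one (computed as the sum minus the smallest and the largest), then the largest. The aliases `combo`, `triples` and `times_seen` are hypothetical names for this example.

```sql
-- Count how many rows contain each unordered triple of numbers.
SELECT combo, COUNT(*) AS times_seen
FROM (
  SELECT ID, CONCAT(LEAST(Num1, Num2, Num3), '-',
                    Num1 + Num2 + Num3 - LEAST(Num1, Num2, Num3) - GREATEST(Num1, Num2, Num3), '-',
                    GREATEST(Num1, Num2, Num3)) AS combo
  FROM MyTable
  UNION
  SELECT ID, CONCAT(LEAST(Num1, Num2, Num4), '-',
                    Num1 + Num2 + Num4 - LEAST(Num1, Num2, Num4) - GREATEST(Num1, Num2, Num4), '-',
                    GREATEST(Num1, Num2, Num4))
  FROM MyTable
  UNION
  SELECT ID, CONCAT(LEAST(Num1, Num3, Num4), '-',
                    Num1 + Num3 + Num4 - LEAST(Num1, Num3, Num4) - GREATEST(Num1, Num3, Num4), '-',
                    GREATEST(Num1, Num3, Num4))
  FROM MyTable
  UNION
  SELECT ID, CONCAT(LEAST(Num2, Num3, Num4), '-',
                    Num2 + Num3 + Num4 - LEAST(Num2, Num3, Num4) - GREATEST(Num2, Num3, Num4), '-',
                    GREATEST(Num2, Num3, Num4))
  FROM MyTable
) AS triples
WHERE combo IS NOT NULL       -- a NULL column makes the whole key NULL
GROUP BY combo
ORDER BY times_seen DESC
LIMIT 10;
```

For the question's first sample row (20, 11, 9, 150), the first branch yields the key `9-11-20`.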

answered 2013-03-30T22:45:28.860