
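I'm trying to compare 2 MySQL tables that have the same schema and columns. Each table contains IP addresses, the ports they communicate on, the protocol, whether the traffic is inbound or outbound, and the number of connections over 1 month. Below is a small sample table with made-up numbers (the real tables have about 100k rows).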

+---------------+------+----------+-------------+-----------+
| ip_address    | port | protocol | connections | direction |
+---------------+------+----------+-------------+-----------+
| 123.17.19.6   | 123  | 17       | 31972       | IN        |
| 123.17.19.6   | 22   | 6        | 4           | IN        |
| 123.17.19.6   | 25   | 6        | 206969      | IN        |
| 123.17.19.10  | 135  | 6        | 2997        | OUT       |
| 123.17.19.10  | 389  | 17       | 4965        | OUT       |
| 123.17.19.10  | 389  | 6        | 7089        | OUT       |
| 123.17.19.11  | 139  | 6        | 1           | OUT       |
| 123.17.19.10  | 135  | 6        | 1102        | OUT       |
| 123.17.19.11  | 389  | 17       | 2993        | OUT       |
| 123.17.19.11  | 389  | 6        | 1629        | OUT       |
| 123.17.19.11  | 443  | 6        | 28          | OUT       |
| 123.17.19.11  | 445  | 6        | 4267        | OUT       |
| 123.17.19.11  | 53   | 17       | 5230        | OUT       |
| 123.17.19.11  | 53   | 6        | 10          | OUT       |
| 123.17.19.11  | 80   | 6        | 11          | OUT       |
| 123.17.19.12  | 135  | 6        | 1640        | OUT       |
| 123.17.19.12  | 22   | 6        | 2           | OUT       |
| 123.17.19.10  | 22   | 6        | 6           | OUT       |
| 123.17.19.12  | 389  | 17       | 2539        | OUT       |
+---------------+------+----------+-------------+-----------+

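What I'd like to do is compare 2 months to see which combinations of IP, port, protocol, and direction are new or no longer present, and, for any that match, see how the connection count changed.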

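My initial thought was to simply loop over every row and run a query against the other table to see whether that connection exists, but that means running hundreds of thousands of queries. I just feel there has to be a simpler way to do this. (Example below.)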

use strict;
use warnings;
use DBI;

my $db1 = "Jan";
my $db2 = "Feb";

# login() (defined elsewhere) returns a connected DBI database handle.
my $dbh = login();

my $db2_list = $dbh->prepare(qq(select ip_address, port, protocol, connections, direction from $db2));
$db2_list->execute;

# Prepare the lookup against the other table once; placeholders avoid quoting problems.
my $db1_list = $dbh->prepare(qq(select connections from $db1 where ip_address = ? and port = ? and protocol = ? and direction = ?));

while (my ($ip, $port, $protocol, $connections, $direction) = $db2_list->fetchrow_array) {
  $db1_list->execute($ip, $port, $protocol, $direction);

  if (my ($old_connections) = $db1_list->fetchrow_array) {
    # The combination exists in both months: print the change in connections.
    my $diff = $connections - $old_connections;
    print "$ip, $port, $protocol, $direction changed by $diff\n";
  }
  else {
    print "$ip, $port, $protocol, $connections, $direction was seen in $db2 and not in $db1\n";
  }
}

2 Answers


MySQL can do this in a single query:

SELECT *
FROM Feb
WHERE NOT EXISTS (SELECT 1 FROM Jan
    WHERE Feb.ip_address = Jan.ip_address
    AND Feb.port = Jan.port
    AND Feb.protocol = Jan.protocol
    AND Feb.direction = Jan.direction
)

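This gives you every row in Feb that has no match in Jan (so it is "new" in Feb). Swapping Feb and Jan in the query gives the rows that are no longer present.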
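
To try the NOT EXISTS approach without a MySQL server, here is a minimal sketch that runs the same kind of query against an in-memory SQLite database from Python. SQLite is only a stand-in for MySQL here, the sample rows are made up, and the names `build_db` and `new_in_feb` are hypothetical helpers, not part of any library.

```python
import sqlite3

# Made-up sample rows: (ip_address, port, protocol, connections, direction)
JAN = [
    ("123.17.19.6", 22, 6, 4, "IN"),
    ("123.17.19.11", 443, 6, 28, "OUT"),
]
FEB = [
    ("123.17.19.6", 22, 6, 9, "IN"),
    ("123.17.19.12", 135, 6, 1640, "OUT"),
]

COLUMNS = ("ip_address TEXT, port INTEGER, protocol INTEGER, "
           "connections INTEGER, direction TEXT")

# Rows of Feb whose (ip_address, port, protocol, direction) is absent from Jan.
NEW_IN_FEB = """
SELECT ip_address, port, protocol, connections, direction
FROM Feb
WHERE NOT EXISTS (
    SELECT 1 FROM Jan
    WHERE Feb.ip_address = Jan.ip_address
      AND Feb.port = Jan.port
      AND Feb.protocol = Jan.protocol
      AND Feb.direction = Jan.direction
)
"""


def build_db():
    """Create an in-memory database holding the two sample tables."""
    conn = sqlite3.connect(":memory:")
    for table, rows in (("Jan", JAN), ("Feb", FEB)):
        conn.execute(f"CREATE TABLE {table} ({COLUMNS})")
        conn.executemany(f"INSERT INTO {table} VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def new_in_feb(conn):
    """Return the Feb rows that have no matching key in Jan."""
    return conn.execute(NEW_IN_FEB).fetchall()


if __name__ == "__main__":
    for row in new_in_feb(build_db()):
        print(row)
```

On the real tables, an index covering the four compared columns would probably help the correlated subquery, though that depends on the data and the MySQL version.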

answered 2012-05-02T21:11:08.047

Question 1: getting the matching rows.

This returns the rows that have the same values in all of the compared columns:

select * 
from table1
inner join table2
using (... column names ...);

or

select * 
from table1
inner join table2
on table1.<field> = table2.<field> and ...;
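
For the tables in the question, the inner join can also report how the connection count changed. The sketch below assumes the join should match on ip_address, port, protocol, and direction only (not on connections, which is the value being compared), and it uses an in-memory SQLite database from Python as a stand-in for MySQL. The function name `connection_changes` and the `delta` column alias are hypothetical.

```python
import sqlite3

COLUMNS = ("ip_address TEXT, port INTEGER, protocol INTEGER, "
           "connections INTEGER, direction TEXT")

# Key combinations present in both months, with the change in connections.
CHANGES = """
SELECT Feb.ip_address, Feb.port, Feb.protocol, Feb.direction,
       Jan.connections AS jan_connections,
       Feb.connections AS feb_connections,
       Feb.connections - Jan.connections AS delta
FROM Feb
INNER JOIN Jan
   ON Feb.ip_address = Jan.ip_address
  AND Feb.port = Jan.port
  AND Feb.protocol = Jan.protocol
  AND Feb.direction = Jan.direction
ORDER BY Feb.ip_address, Feb.port
"""


def connection_changes(jan_rows, feb_rows):
    """Load both months into SQLite and return the matched rows with deltas."""
    conn = sqlite3.connect(":memory:")
    for table, rows in (("Jan", jan_rows), ("Feb", feb_rows)):
        conn.execute(f"CREATE TABLE {table} ({COLUMNS})")
        conn.executemany(f"INSERT INTO {table} VALUES (?, ?, ?, ?, ?)", rows)
    return conn.execute(CHANGES).fetchall()


if __name__ == "__main__":
    # Made-up sample rows: (ip_address, port, protocol, connections, direction)
    jan = [("123.17.19.6", 22, 6, 4, "IN"), ("123.17.19.11", 443, 6, 28, "OUT")]
    feb = [("123.17.19.6", 22, 6, 9, "IN"), ("123.17.19.12", 135, 6, 1640, "OUT")]
    for row in connection_changes(jan, feb):
        print(row)
```

If a key combination appears more than once in a month, the join returns one row per pair of matches, so duplicates may need to be aggregated first.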

Question 2:

You can use part 1 as a subquery to implement relational subtraction, which answers the "which rows are new/missing" question:

select * 
from table1
left join ( ... subquery ...) as sq
on ... join condition ...
where ... <some fields in the subquery are null>;

This works because rows in table1 that have no match in the subquery will have NULLs in the subquery's columns.
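
As a rough illustration of the LEFT JOIN ... IS NULL pattern, the sketch below runs it in both directions on made-up data. It is a simplification: it joins directly against the other month's table rather than against the part 1 subquery, and it uses an in-memory SQLite database from Python as a stand-in for MySQL. The helper names `load` and `only_in` are hypothetical.

```python
import sqlite3

COLUMNS = ("ip_address TEXT, port INTEGER, protocol INTEGER, "
           "connections INTEGER, direction TEXT")
KEYS = ("ip_address", "port", "protocol", "direction")


def load(jan_rows, feb_rows):
    """Create an in-memory database with one table per month."""
    conn = sqlite3.connect(":memory:")
    for table, rows in (("Jan", jan_rows), ("Feb", feb_rows)):
        conn.execute(f"CREATE TABLE {table} ({COLUMNS})")
        conn.executemany(f"INSERT INTO {table} VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def only_in(conn, left, right):
    """Return rows of `left` whose key columns have no match in `right`.

    `left` and `right` are trusted table names (they are interpolated
    into the SQL text), so never pass user input here.
    """
    on = " AND ".join(f"{left}.{key} = {right}.{key}" for key in KEYS)
    sql = (
        f"SELECT {left}.* FROM {left} "
        f"LEFT JOIN {right} ON {on} "
        # Unmatched rows carry NULL in every column of the right-hand table.
        f"WHERE {right}.ip_address IS NULL "
        f"ORDER BY {left}.ip_address, {left}.port"
    )
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    # Made-up sample rows: (ip_address, port, protocol, connections, direction)
    jan = [("123.17.19.6", 22, 6, 4, "IN"), ("123.17.19.11", 443, 6, 28, "OUT")]
    feb = [("123.17.19.6", 22, 6, 9, "IN"), ("123.17.19.12", 135, 6, 1640, "OUT")]
    conn = load(jan, feb)
    print("new in Feb:", only_in(conn, "Feb", "Jan"))
    print("gone since Jan:", only_in(conn, "Jan", "Feb"))
```

The NULL test assumes ip_address is never NULL in the real data; otherwise a genuinely NULL value would be mistaken for a missing match.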

answered 2012-05-02T21:18:10.783