问题:为什么并行执行时连接几乎空表的查询的 MySQL 性能下降?
以下是对我所面临问题的更详细说明。我在 MySQL 中有两个表
CREATE TABLE first (
num int(10) NOT NULL,
UNIQUE KEY key_num (num)
) ENGINE=InnoDB
CREATE TABLE second (
num int(10) NOT NULL,
num2 int(10) NOT NULL,
UNIQUE KEY key_num (num, num2)
) ENGINE=InnoDB
第一个包含大约一千条记录。第二个是空的或包含很少的记录。它还包含与问题有关的双索引:单索引问题就消失了。现在我正在尝试对这些表并行进行许多相同的查询。每个查询如下所示:
SELECT first.num
FROM first
LEFT JOIN second AS second_1 ON second_1.num = -1 # non-existent key
LEFT JOIN second AS second_2 ON second_2.num = -2 # non-existent key
LEFT JOIN second AS second_3 ON second_3.num = -3 # non-existent key
LEFT JOIN second AS second_4 ON second_4.num = -4 # non-existent key
LEFT JOIN second AS second_5 ON second_5.num = -5 # non-existent key
LEFT JOIN second AS second_6 ON second_6.num = -6 # non-existent key
WHERE second_1.num IS NULL
AND second_2.num IS NULL
AND second_3.num IS NULL
AND second_4.num IS NULL
AND second_5.num IS NULL
AND second_6.num IS NULL
我遇到的问题是,在 8 核机器上性能几乎没有线性提升,实际上我有下降。即有一个进程,我每秒的典型请求数约为 200。有两个进程而不是预期的每秒增加 300 - 400 个查询,我实际上下降到 150 个。对于 10 个进程,我只有 70 个查询每秒。我用于测试的 Perl 代码如下所示:
#!/usr/bin/perl
use strict;
use warnings;
use DBI;
use Parallel::Benchmark;
use SQL::Abstract;
use SQL::Abstract::Plugin::InsertMulti;
my $children_dbh;
foreach my $second_table_row_count (0, 1, 1000) {
print '#' x 80, "\nsecond_table_row_count = $second_table_row_count\n";
create_and_fill_tables(1000, $second_table_row_count);
foreach my $concurrency (1, 2, 3, 4, 6, 8, 10, 20) {
my $bm = Parallel::Benchmark->new(
'benchmark' => sub {
_run_sql();
return 1;
},
'concurrency' => $concurrency,
'time' => 3,
);
my $result = $bm->run();
}
}
sub create_and_fill_tables {
my ($first_table_row_count, $second_table_row_count) = @_;
my $dbh = dbi_connect();
{
$dbh->do(q{DROP TABLE IF EXISTS first});
$dbh->do(q{
CREATE TABLE first (
num int(10) NOT NULL,
UNIQUE KEY key_num (num)
) ENGINE=InnoDB
});
if ($first_table_row_count) {
my ($stmt, @bind) = SQL::Abstract->new()->insert_multi(
'first',
['num'],
[map {[$_]} 1 .. $first_table_row_count],
);
$dbh->do($stmt, undef, @bind);
}
}
{
$dbh->do(q{DROP TABLE IF EXISTS second});
$dbh->do(q{
CREATE TABLE second (
num int(10) NOT NULL,
num2 int(10) NOT NULL,
UNIQUE KEY key_num (num, num2)
) ENGINE=InnoDB
});
if ($second_table_row_count) {
my ($stmt, @bind) = SQL::Abstract->new()->insert_multi(
'second',
['num'],
[map {[$_]} 1 .. $second_table_row_count],
);
$dbh->do($stmt, undef, @bind);
}
}
}
sub _run_sql {
$children_dbh ||= dbi_connect();
$children_dbh->selectall_arrayref(q{
SELECT first.num
FROM first
LEFT JOIN second AS second_1 ON second_1.num = -1
LEFT JOIN second AS second_2 ON second_2.num = -2
LEFT JOIN second AS second_3 ON second_3.num = -3
LEFT JOIN second AS second_4 ON second_4.num = -4
LEFT JOIN second AS second_5 ON second_5.num = -5
LEFT JOIN second AS second_6 ON second_6.num = -6
WHERE second_1.num IS NULL
AND second_2.num IS NULL
AND second_3.num IS NULL
AND second_4.num IS NULL
AND second_5.num IS NULL
AND second_6.num IS NULL
});
}
sub dbi_connect {
return DBI->connect(
'dbi:mysql:'
. 'database=tmp'
. ';host=localhost'
. ';port=3306',
'root',
'',
);
}
对于比较像这样的查询并发执行以提高性能:
SELECT first.num
FROM first
LEFT JOIN second AS second_1 ON second_1.num = 1 # existent key
LEFT JOIN second AS second_2 ON second_2.num = 2 # existent key
LEFT JOIN second AS second_3 ON second_3.num = 3 # existent key
LEFT JOIN second AS second_4 ON second_4.num = 4 # existent key
LEFT JOIN second AS second_5 ON second_5.num = 5 # existent key
LEFT JOIN second AS second_6 ON second_6.num = 6 # existent key
WHERE second_1.num IS NOT NULL
AND second_2.num IS NOT NULL
AND second_3.num IS NOT NULL
AND second_4.num IS NOT NULL
AND second_5.num IS NOT NULL
AND second_6.num IS NOT NULL
测试结果、cpu 和磁盘使用测量在这里:
* 表 `first` 有 1000 行 * 表 `second` 有 6 行:`[1,1],[2,2],..[6,6]` 查询: SELECT first.num 从第一 LEFT JOIN second AS second_1 ON second_1.num = -1 # 不存在的键 LEFT JOIN second AS second_2 ON second_2.num = -2 # 不存在的键 LEFT JOIN second AS second_3 ON second_3.num = -3 # 不存在的键 LEFT JOIN second AS second_4 ON second_4.num = -4 # 不存在的键 LEFT JOIN second AS second_5 ON second_5.num = -5 # 不存在的键 LEFT JOIN second AS second_6 ON second_6.num = -6 # 不存在的键 在哪里 second_1.num 为空 AND second_2.num 为空 AND second_3.num 为空 AND second_4.num 为空 AND second_5.num 为空 AND second_6.num 为空 结果: 并发:1,速度:162.910/秒 并发:2,速度:137.818/秒 并发:3,速度:130.728/秒 并发:4,速度:107.387/秒 并发:6,速度:90.513/秒 并发:8,速度:80.445/秒 并发:10,速度:80.381/秒 并发:20,速度:84.069/秒 在 6 个进程中运行查询的最后 60 分钟后的系统使用情况: $ iostat -cdkx 60 平均 CPU:%user %nice %system %iowait %steal %idle 74.82 0.00 0.08 0.00 0.08 25.02 设备:rrqm/s wrqm/sr/sw/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util sda1 0.00 0.00 0.00 0.12 0.00 0.80 13.71 0.00 1.43 1.43 0.02 sdf10 0.00 0.00 0.00 0.03 0.00 1.07 64.00 0.00 10.00 5.00 0.02 sdf4 0.00 0.00 0.00 0.03 0.00 1.07 64.00 0.00 30.00 15.00 0.05 sdm 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 sdf8 0.00 0.00 0.00 0.37 0.00 1.24 6.77 0.00 5.00 3.18 0.12 sdf6 0.00 0.00 0.00 0.03 0.00 1.07 64.00 0.00 10.00 5.00 0.02 sdf9 0.00 0.00 0.00 0.03 0.00 1.07 64.00 0.00 0.00 0.00 0.00 自卫队 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 sdf3 0.00 0.00 0.00 0.08 0.00 1.33 32.00 0.00 4.00 4.00 0.03 sdf2 0.00 0.00 0.00 0.17 0.00 1.37 16.50 0.00 3.00 3.00 0.05 sdf15 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 sdf14 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 sdf1 0.00 0.00 0.00 0.05 0.00 0.40 16.00 0.00 0.00 0.00 0.00 sdf13 0.00 0.00 0.00 0.03 0.00 1.07 64.00 0.00 10.00 5.00 0.02 sdf5 0.00 0.00 0.00 0.03 0.00 1.07 64.00 0.00 50.00 25.00 0.08 sdm2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 sdm1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 sdf12 0.00 0.00 0.00 0.03 0.00 1.07 64.00 0.00 10.00 5.00 0.02 sdf11 0.00 0.00 0.00 0.03 0.00 1.07 64.00 0.00 10.00 5.00 0.02 sdf7 0.00 0.00 0.00 0.03 0.00 1.07 64.00 0.00 10.00 5.00 0.02 md0 0.00 0.00 0.00 0.97 0.00 13.95 28.86 0.00 0.00 0.00 0.00 ################################################# ############################## 查询: SELECT first.num 从第一 LEFT JOIN second AS second_1 ON second_1.num = 1 # 存在键 LEFT JOIN second AS second_2 ON second_2.num = 2 # 存在键 LEFT JOIN second AS second_3 ON second_3.num = 3 # 存在键 LEFT JOIN second AS second_4 ON second_4.num = 4 # 存在键 LEFT JOIN second AS second_5 ON second_5.num = 5 # 存在键 LEFT JOIN second AS second_6 ON second_6.num = 6 # 存在键 在哪里 second_1.num 不为空 AND second_2.num 不为空 AND second_3.num 不为空 并且 second_4.num 不为空 并且 second_5.num 不为空 并且 second_6.num 不为空 结果: 并发:1,速度:875.973/秒 并发:2,速度:944.986/秒 并发:3,速度:1256.072/秒 并发:4,速度:1401.657/秒 并发:6,速度:1354.351/秒 并发:8,速度:1110.100/秒 并发:10,速度:1145.251/秒 并发:20,速度:1142.514/秒 在 6 个进程中运行查询的最后 60 分钟后的系统使用情况: $ iostat -cdkx 60 平均 CPU:%user %nice %system %iowait %steal %idle 74.40 0.00 0.53 0.00 0.06 25.01 设备:rrqm/s wrqm/sr/sw/s rkB/s wkB/s avgrq-sz avgqu-sz await svctm %util sda1 0.00 0.00 0.00 0.02 0.00 0.13 16.00 0.00 0.00 0.00 0.00 sdf10 0.00 0.00 0.00 0.03 0.00 1.07 64.00 0.00 10.00 5.00 0.02 sdf4 0.00 0.00 0.00 0.03 0.00 1.07 64.00 0.00 10.00 5.00 0.02 sdm 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 sdf8 0.00 0.00 0.00 0.03 0.00 1.07 64.00 0.00 10.00 5.00 0.02 sdf6 0.00 0.00 0.00 0.03 0.00 1.07 64.00 0.00 0.00 0.00 0.00 sdf9 0.00 0.00 0.00 0.03 0.00 1.07 64.00 0.00 10.00 5.00 0.02 自卫队 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 sdf3 0.00 0.00 0.00 0.13 0.00 2.67 40.00 0.00 3.75 2.50 0.03 sdf2 0.00 0.00 0.00 0.23 0.00 2.72 23.29 0.00 2.14 1.43 0.03 sdf15 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 sdf14 0.00 0.00 0.00 0.98 0.00 0.54 1.10 0.00 2.71 2.71 0.27 sdf1 0.00 0.00 0.00 0.08 0.00 1.47 35.20 0.00 8.00 6.00 0.05 sdf13 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 sdf5 0.00 0.00 0.00 0.03 0.00 1.07 64.00 0.00 10.00 5.00 0.02 sdm2 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 sdm1 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 sdf12 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 0.00 sdf11 0.00 0.00 0.00 0.03 0.00 1.07 64.00 0.00 0.00 0.00 0.00 sdf7 0.00 0.00 0.00 0.03 0.00 1.07 64.00 0.00 10.00 5.00 0.02 md0 0.00 0.00 0.00 1.70 0.00 15.92 18.74 0.00 0.00 0.00 0.00 ################################################# ############################## 这个服务器有很多可用内存。顶部示例: top - 19:02:59 up 4:23, 4 users, load average: 4.43, 3.03, 2.01 任务:共 218 个,运行 1 个,睡眠 217 个,停止 0 个,僵尸 0 个 CPU:72.8%us、0.7%sy、0.0%ni、26.3%id、0.0%wa、0.0%hi、0.0%si、0.1%st 内存:总计 71701416k,已使用 22183980k,可用 49517436k,284k 缓冲区 交换:总共 0k,使用 0k,免费 0k,缓存 1282768k PID 用户 PR NI VIRT RES SHR S %CPU %MEM TIME+ 命令 2506 mysql 20 0 51.7g 17g 5920 S 590 25.8 213:15.12 mysqld 9348 topadver 20 0 72256 11m 1428 S 2 0.0 0:01.45 perl 9349 topadver 20 0 72256 11m 1428 S 2 0.0 0:01.44 perl 9350 topadver 20 0 72256 11m 1428 S 2 0.0 0:01.45 perl 9351 topadver 20 0 72256 11m 1428 S 1 0.0 0:01.44 perl 9352 topadver 20 0 72256 11m 1428 S 1 0.0 0:01.44 perl 9353 topadver 20 0 72256 11m 1428 S 1 0.0 0:01.44 perl 9346 顶部 20 0 19340 1504 1064 R 0 0.0 0:01.89 顶部
有谁知道为什么使用不存在的键进行查询会降低性能?