
I have a massive query that performs a UNION ALL across many tables (each with thousands of rows) and then outputs to a temporary table before returning.

Old form:

SELECT *
FROM   (SELECT `a` AS `Human readable A`,
               `b` AS `Human readable B`,
               `c` AS `Human readable C`
        FROM   `table1`
        UNION ALL
        SELECT `a` AS `Human readable A`,
               `b` AS `Human readable B`,
               `c` AS `Human readable C`
        FROM   `table2`
        UNION ALL
        SELECT `a` AS `Human readable A`,
               `b` AS `Human readable B`,
               `c` AS `Human readable C`
        FROM   `table3`
) AS temp_table

This query almost kills the database (it takes anywhere between 20 and 61 minutes), and the CPU is completely maxed out for the duration.

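I found that running an individual SELECT statement for each table takes a few seconds at most, so I decided to merge the results at the application level. As a bonus, the application runs on a different physical server (pseudocode below).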

    $result1 =  SELECT
                      `a` AS `Human readable A`,
                      `b` AS `Human readable B`,
                      `c` AS `Human readable C`
                FROM `table1`

    $result2 =  SELECT
                      `a` AS `Human readable A`,
                      `b` AS `Human readable B`,
                      `c` AS `Human readable C`
                FROM `table2`

    $result3 =  SELECT
                      `a` AS `Human readable A`,
                      `b` AS `Human readable B`,
                      `c` AS `Human readable C`
                FROM `table3`

    $result4 = merge($result1, $result2, $result3)

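However, this feels somewhat unsafe, because another query could update the data between these individual SELECT queries. Is there a way to improve my set of SELECT statements so that they are treated as a single transaction (no writes needed), with all the data held under a shared read lock until it has been returned?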

Additional information

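I suspect the original form took so much longer because it spent a lot of CPU time recreating/sorting the indexes in the combined table, which is something I don't need (I only need the results appended together).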

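  • All tables have exactly the same structure.
  • Note that there are about 34 `a` AS `Human readable A` columns per table; the data is split into different tables because the tables refer to different items.
  • There are 20 unions (21 tables) in this particular query.
  • The data is stored in InnoDB tables. I know InnoDB is more CPU-intensive than MyISAM, but after reading about MyISAM's various drawbacks I am reluctant to switch storage engines.
  • There are no WHERE clauses (the data is already "pre-grouped" by having been split into tables).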

2 Answers


Given your constraints, your best bet is to explicitly lock the tables before issuing the successive SELECT statements:

SET autocommit=0; -- optional, but this is where and how you must start the transaction if you need one
LOCK TABLES t1 READ, t2 READ, t3 READ;
SELECT a FROM t1;
SELECT a FROM t2;
SELECT a FROM t3;
UNLOCK TABLES; -- beware: implicit COMMIT

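Unless there is some kind of legal requirement to keep this data in multiple tables, you really should push to get all these tables merged into a single one.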
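
As a rough sketch of what such a merge could look like: the table name `all_items`, the `source_table` column, and the column types below are all assumptions made for illustration, since the real schema is not shown in the question.

```sql
-- Sketch only: `all_items` and `source_table` are hypothetical names, and the
-- column types are placeholders because the real schema is not shown.
CREATE TABLE `all_items` (
    `source_table` VARCHAR(64) NOT NULL,  -- which original table the row came from
    `a` INT,
    `b` INT,
    `c` INT,
    KEY `idx_source_table` (`source_table`)
) ENGINE=InnoDB;

-- Copy each existing table in, tagging every row with its origin.
INSERT INTO `all_items` (`source_table`, `a`, `b`, `c`)
SELECT 'table1', `a`, `b`, `c` FROM `table1`;

INSERT INTO `all_items` (`source_table`, `a`, `b`, `c`)
SELECT 'table2', `a`, `b`, `c` FROM `table2`;

INSERT INTO `all_items` (`source_table`, `a`, `b`, `c`)
SELECT 'table3', `a`, `b`, `c` FROM `table3`;

-- The multi-table UNION ALL then becomes one plain SELECT.
SELECT `a` AS `Human readable A`,
       `b` AS `Human readable B`,
       `c` AS `Human readable C`
FROM   `all_items`;
```

On InnoDB a single SELECT like the last one reads from one consistent snapshot, so the concern about data changing between separate queries would no longer apply.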

answered 2013-11-08T11:42:41.017

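I thought I would provide two possible solutions, along with their respective benefits, by way of a code example. One of the solutions is "stolen" from RandomSeed's answer: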

if ($READING_ONLY_INNODB_TABLES)
{
    /**
     * - These tables are InnoDB, so we can rely on its REPEATABLE READ
     *   isolation level (the InnoDB default).
     * - Locking the tables outright is slightly faster, but this approach gives
     *   us a consistent view of the database without taking ANY locks
     *   (which would block other processes).
     * - For consistency purposes you can think of it as if the tables were
     *   locked, because the result is the same (InnoDB even prevents phantom
     *   reads for consistent reads at this level).
     * - MyISAM is not transactional and has no REPEATABLE READ, so this
     *   cannot be used with it.
     */
    $query =
        'START TRANSACTION WITH CONSISTENT SNAPSHOT;' . # implicitly disables autocommit until COMMIT
        $queries_string . # This is the series of SELECTs
        'COMMIT;';
}
else
{
    /**
     * A less resource-intensive, 'generic' way (it also works with MyISAM):
     * wait until every table can be read-locked before reading. This should
     * 'force' a 'repeatable read'/consistent view.
     */
    $query = 
        'SET autocommit=0;'. # starts the transaction
        'LOCK TABLES ' . $lock_tables_string . ';' . # Automatically commits anything before this
        $queries_string . # This is the series of selects from the tables we just locked
        'UNLOCK TABLES;'; # commits a transaction if any tables currently have been locked with LOCK TABLES    
}
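
For reference, with three tables the InnoDB branch would send something like the following to the server. This is only a sketch: it assumes `$queries_string` simply holds the SELECT statements from the question, and that the session uses the default REPEATABLE READ isolation level (MySQL ignores WITH CONSISTENT SNAPSHOT at other isolation levels).

```sql
-- The snapshot is established here, not at the first SELECT.
START TRANSACTION WITH CONSISTENT SNAPSHOT;

-- Every SELECT below reads from that same snapshot, so changes committed
-- by other sessions in the meantime are not visible to these reads.
SELECT `a` AS `Human readable A`,
       `b` AS `Human readable B`,
       `c` AS `Human readable C`
FROM   `table1`;

SELECT `a` AS `Human readable A`,
       `b` AS `Human readable B`,
       `c` AS `Human readable C`
FROM   `table2`;

SELECT `a` AS `Human readable A`,
       `b` AS `Human readable B`,
       `c` AS `Human readable C`
FROM   `table3`;

-- Nothing was written; COMMIT just ends the transaction and releases the snapshot.
COMMIT;
```

The application then appends the three result sets together, as in the pseudocode from the question.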
answered 2013-11-08T15:58:51.777