
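I have a very large result set containing nearly 2 GB of product data, spread across several tables with about 500,000 records each. I need to process every record in order to export it to a set of files.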

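The code below crashes when the server tries to hold the result set, so I had to switch to one query that fetches only the primary id of each matching record, followed by a second query per primary id to fetch that individual product. Because of all these secondary queries, this is very inefficient and database intensive.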

Here are the query and the code that crash. How can I avoid this?

$query =
    "SELECT SQL_NO_CACHE SQL_BIG_RESULT
        products.*,
        inventory.*,
        pricing.*,
        markets.*
    FROM
        products,
        categories,
        markets,
        pricing,
        inventory
    WHERE
        products.catid = categories.id AND
        markets.id = products.marketid AND
        pricing.productid = products.id AND
        inventory.productid = products.id AND
        inventory.all_stock > 0 AND
        products.sale = 'Y' AND
        categories.active = 'Y' AND
        inventory.last_update > UNIX_TIMESTAMP(NOW() - INTERVAL 1 DAY)
    GROUP BY
        products.id";

$Db = new DbConnector();

$r = $Db->query($query); // !Never gets past this point!

while ($product = $r->fetch(PDO::FETCH_ASSOC)) {
    // Stuff gets done here.
}

2 Answers


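Can't you put the id fields into a temporary table, then "hydrate" and process the full rows in batches?

First, the temporary table holding only the ids: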

CREATE TEMPORARY TABLE tempy
SELECT SQL_NO_CACHE SQL_BIG_RESULT
    products.id  AS product_id,
    inventory.id AS inventory_id,
    pricing.id   AS pricing_id,
    markets.id   AS markets_id
FROM
    products,
    categories,
    markets,
    pricing,
    inventory
WHERE
    products.catid = categories.id AND
    markets.id = products.marketid AND
    pricing.productid = products.id AND
    inventory.productid = products.id AND
    inventory.all_stock > 0 AND
    products.sale = 'Y' AND
    categories.active = 'Y' AND
    inventory.last_update > UNIX_TIMESTAMP(NOW() - INTERVAL 1 DAY)
GROUP BY
    products.id

Repeat this query until everything has been processed, increasing the OFFSET value at each step:

SELECT SQL_NO_CACHE SQL_BIG_RESULT
    products.*,
    inventory.*,
    pricing.*,
    markets.*
FROM
    ( SELECT *
      FROM tempy
      ORDER BY product_id -- or any other stable ordering
      LIMIT  1000         -- slice size
      OFFSET 123000       -- slice size * slice number; MySQL needs a literal here, not an expression
    ) AS t,
    products,
    inventory,
    pricing,
    markets
WHERE
    products.id  = t.product_id AND
    inventory.id = t.inventory_id AND
    pricing.id   = t.pricing_id AND
    markets.id   = t.markets_id
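
The "repeat with an increasing OFFSET" step could be driven from PHP roughly as follows. This is only a minimal sketch under some assumptions: it uses a plain PDO connection rather than the `DbConnector` wrapper from the question, the helper name `fetchInSlices` is hypothetical, and only `products` is joined to keep it short (the other tables would be joined on their ids in the same way).

```php
<?php
/**
 * Yield hydrated rows slice by slice from the id-only temporary table.
 * Hypothetical helper: assumes $pdo is a PDO connection and that the
 * `tempy` table already exists on that same connection.
 */
function fetchInSlices(PDO $pdo, int $sliceSize = 1000): Generator
{
    // Only `products` is joined here; inventory, pricing and markets
    // would be joined on t.inventory_id, t.pricing_id, t.markets_id.
    $sql = "SELECT products.*
            FROM ( SELECT product_id
                   FROM tempy
                   ORDER BY product_id
                   LIMIT :limit OFFSET :offset
                 ) AS t
            JOIN products ON products.id = t.product_id";
    $stmt = $pdo->prepare($sql);

    for ($offset = 0; ; $offset += $sliceSize) {
        // Bind as integers so the values are not quoted as strings.
        $stmt->bindValue(':limit', $sliceSize, PDO::PARAM_INT);
        $stmt->bindValue(':offset', $offset, PDO::PARAM_INT);
        $stmt->execute();

        $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);
        if ($rows === []) {
            return; // an empty slice means everything has been processed
        }
        foreach ($rows as $row) {
            yield $row; // only one slice is held in PHP memory at a time
        }
    }
}
```

With this in place, the processing loop from the question would become something like `foreach (fetchInSlices($pdo) as $product) { ... }`.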
answered 2013-10-03T16:30:01.080

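Does the query run successfully on its own on the database server? If so, the bottleneck is most likely on your web server and in its communication with the database server. If you are pulling a lot of data, or are forced to run a large number of queries (for example, an extra query for each id retrieved), I would suggest using STORED PROCEDURES (MySQL calls them "routines"). You can start here: http://net.tutsplus.com/tutorials/an-introduction-to-stored-procedures/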

answered 2013-10-03T15:52:16.843