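I'm developing a sales-rank widget for our website that will display products ordered by their current position in a "top charts" or "bestsellers" list, so to speak.

After some reading, a good way to achieve this seems to be a rolling sales average in which more recent sales carry a higher weight.

Example: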

$rolling_avg = ((4*$d1)+(3*$d2)+(2*$d3)+$d4+$d5+$d6+$d7)/13;

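Where:

  • $d1 = number of sales in the last 24 hours.
  • $d2 = number of sales 24-48 hours ago.
  • $d3 = number of sales 48-72 hours ago.
  • $d4 = number of sales 72-96 hours ago.

And so on, up to $d7 (144-168 hours ago).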
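To make the weighting concrete, here is a minimal PHP sketch of the formula on its own, with no database involved. The function name weighted_rolling_avg is made up for illustration, and it assumes the daily counts are passed most recent day first:

```php
<?php
// Hypothetical helper, for illustration only.
// $daily_counts[0] is $d1 (last 24 hours), $daily_counts[6] is $d7.
function weighted_rolling_avg(array $daily_counts): float
{
    // Weights for days 1..7, most recent first; they sum to 13.
    $weights = [4, 3, 2, 1, 1, 1, 1];

    $total = 0;
    foreach ($weights as $i => $weight) {
        // A missing day is treated as zero sales.
        $total += $weight * ($daily_counts[$i] ?? 0);
    }

    // Dividing by the sum of the weights keeps the result on the same
    // scale as a plain daily average.
    return $total / array_sum($weights);
}
```

For instance, 13 sales in the last 24 hours and none before that give (4*13)/13 = 4, while one sale on each of the seven days gives exactly 1.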

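Currently, I'm trying to run this over a product dataset of roughly 500k records, writing the computed rank into the products table so it can be queried later. If possible, I'd like to end up with a script that recalculates the ranks and runs from cron every 12 or 24 hours.

Current implementation:

My current implementation takes a very long time to execute, and I feel that more of the processing should be done at the SQL level (with far fewer SELECT queries), but I'm not sure how to get started on that.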

$products = mysql_query("SELECT * FROM products ORDER BY id DESC"); // <-- Est 450-500k rows.

while($product = mysql_fetch_array($products)) {
    $product_id = $product['id'];

    $d1 = mysql_query("SELECT COUNT(*) FROM orders WHERE (product_id = '$product_id') AND (sale_completed BETWEEN (NOW()-INTERVAL 24 HOUR) AND NOW())") or die(mysql_error());
    $d1 = mysql_fetch_array($d1);

    $d2 = mysql_query("SELECT COUNT(*) FROM orders WHERE (product_id = '$product_id') AND (sale_completed BETWEEN (NOW()-INTERVAL 48 HOUR) AND (NOW()-INTERVAL 24 HOUR))");
    $d2 = mysql_fetch_array($d2);

    $d3 = mysql_query("SELECT COUNT(*) FROM orders WHERE (product_id = '$product_id') AND (sale_completed BETWEEN (NOW()-INTERVAL 72 HOUR) AND (NOW()-INTERVAL 48 HOUR))");
    $d3 = mysql_fetch_array($d3);

    $d4 = mysql_query("SELECT COUNT(*) FROM orders WHERE (product_id = '$product_id') AND (sale_completed BETWEEN (NOW()-INTERVAL 96 HOUR) AND (NOW()-INTERVAL 72 HOUR))");
    $d4 = mysql_fetch_array($d4);

    $d5 = mysql_query("SELECT COUNT(*) FROM orders WHERE (product_id = '$product_id') AND (sale_completed BETWEEN (NOW()-INTERVAL 120 HOUR) AND (NOW()-INTERVAL 96 HOUR))");
    $d5 = mysql_fetch_array($d5);

    $d6 = mysql_query("SELECT COUNT(*) FROM orders WHERE (product_id = '$product_id') AND (sale_completed BETWEEN (NOW()-INTERVAL 144 HOUR) AND (NOW()-INTERVAL 120 HOUR))");
    $d6 = mysql_fetch_array($d6);

    $d7 = mysql_query("SELECT COUNT(*) FROM orders WHERE (product_id = '$product_id') AND (sale_completed BETWEEN (NOW()-INTERVAL 168 HOUR) AND (NOW()-INTERVAL 144 HOUR))");
    $d7 = mysql_fetch_array($d7);

    $d1 = $d1[0];
    $d2 = $d2[0];
    $d3 = $d3[0];
    $d4 = $d4[0];
    $d5 = $d5[0];
    $d6 = $d6[0];
    $d7 = $d7[0];       

    $rolling_avg = ((4*$d1)+(3*$d2)+(2*$d3)+$d4+$d5+$d6+$d7)/13;

    mysql_query("UPDATE products SET rolling_sales = '$rolling_avg' WHERE id = '$product_id'");
}

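I'm not sure how to optimize or improve on this from here, but it definitely needs a lot of work.

Before anyone mentions it: I'm aware that the mysql_* functions are deprecated, and I will move to PDO before this goes into production.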

1 Answer

Here is a function that calculates the rolling sales with a single query.

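function get_rolling_sales($product_id) {
    // Cast to int so the value is safe to interpolate into the query below.
    $product_id = (int) $product_id;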
    $query = <<<EOF
SELECT (
    4 * (

   SELECT COUNT(*) FROM orders WHERE (product_id = $product_id) 
   AND (sale_completed BETWEEN (NOW()-INTERVAL 24 HOUR) AND NOW())

) + 3 * (

  SELECT COUNT(*) FROM orders WHERE (product_id = $product_id) 
  AND (sale_completed BETWEEN (NOW()-INTERVAL 48 HOUR) AND (NOW()-INTERVAL 24 HOUR))

) + 2 * (

  SELECT COUNT(*) FROM orders WHERE (product_id = $product_id) 
  AND (sale_completed BETWEEN (NOW()-INTERVAL 72 HOUR) AND (NOW()-INTERVAL 48 HOUR))

) + (

  SELECT COUNT(*) FROM orders WHERE (product_id = $product_id) 
  AND (sale_completed BETWEEN (NOW()-INTERVAL 96 HOUR) AND (NOW()-INTERVAL 72 HOUR))

) + (

  SELECT COUNT(*) FROM orders WHERE (product_id = $product_id) 
  AND (sale_completed BETWEEN (NOW()-INTERVAL 120 HOUR) AND (NOW()-INTERVAL 96 HOUR))

) + (

  SELECT COUNT(*) FROM orders WHERE (product_id = $product_id) 
  AND (sale_completed BETWEEN (NOW()-INTERVAL 144 HOUR) AND (NOW()-INTERVAL 120 HOUR))

) + (

  SELECT COUNT(*) FROM orders WHERE (product_id = $product_id) 
  AND (sale_completed BETWEEN (NOW()-INTERVAL 168 HOUR) AND (NOW()-INTERVAL 144 HOUR))

)

) / 13 AS rolling_sales
EOF;

    $result = mysql_query($query);
    $row = mysql_fetch_assoc($result);
    return $row['rolling_sales'];
}

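However, iterating over all 500,000 product records will still take a long time. Do you really need all of this information at once (for example, for further calculations), or do you plan to show it in a paginated table view? If you only want to display the data, you could calculate rolling_sales on demand.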
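As a rough sketch of the on-demand idea, the products on one page could be scored with a single grouped query instead of one call per product. This is an untested outline under a few assumptions: it uses PDO (which the question plans to move to), both function names are hypothetical, and the table and column names are taken from the question. Each order is counted once with the weight of its day, so orders that fall exactly on a 24-hour boundary are not counted twice as they are with the inclusive BETWEEN ranges.

```php
<?php
// Sketch only. Table and column names follow the question; both function
// names are hypothetical.

/** Build one grouped query that scores $id_count products in a single pass. */
function build_rolling_sales_sql(int $id_count): string
{
    if ($id_count < 1) {
        throw new InvalidArgumentException('Need at least one product ID.');
    }
    // One "?" placeholder per product ID.
    $placeholders = implode(', ', array_fill(0, $id_count, '?'));

    // Each order contributes the weight of its day (4, 3, 2, then 1), so the
    // SUM is the weighted total over the last 7 days; dividing by 13 matches
    // the formula from the question.
    return <<<SQL
SELECT product_id,
       SUM(CASE
             WHEN sale_completed >= NOW() - INTERVAL 24 HOUR THEN 4
             WHEN sale_completed >= NOW() - INTERVAL 48 HOUR THEN 3
             WHEN sale_completed >= NOW() - INTERVAL 72 HOUR THEN 2
             ELSE 1
           END) / 13 AS rolling_sales
FROM orders
WHERE product_id IN ($placeholders)
  AND sale_completed >= NOW() - INTERVAL 168 HOUR
  AND sale_completed <= NOW()
GROUP BY product_id
SQL;
}

/** Return [product_id => rolling_sales] for the given page of product IDs. */
function get_rolling_sales_for_page(PDO $pdo, array $product_ids): array
{
    if ($product_ids === []) {
        return [];
    }
    $stmt = $pdo->prepare(build_rolling_sales_sql(count($product_ids)));
    $stmt->execute(array_values($product_ids));

    // Products without sales in the window are absent from the result set,
    // so start every requested ID at 0.
    $scores = array_fill_keys($product_ids, 0.0);
    foreach ($stmt->fetchAll(PDO::FETCH_ASSOC) as $row) {
        $scores[$row['product_id']] = (float) $row['rolling_sales'];
    }
    return $scores;
}
```

A composite index on orders (product_id, sale_completed) would probably help this query, though I have not measured it on your data. If the stored column is still needed for the cron job, the same SELECT without the IN filter could likely feed an UPDATE joined against products, but that is also something to verify on a copy of the table first.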

Answered 2012-12-18T05:19:36.403