ruby-on-rails - Looking for a faster ActiveRecord query (Ruby on Rails)

Question

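I have a model that is sorted in a specific order. My goal is to find the record at which the sum of a particular column over all preceding records reaches a certain number. The example below gets what I need, but it is slow, especially on a fairly large table. Is there a faster way to find the product.id at which the sum of all previous products' points = 100000?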

 total_points = 0
 find_point_level = 100000
 @products = Product.order("id").all
 @products.each do |product|
    total_points = product.points + total_points
    @find_product = product.id
    break if total_points >= find_point_level
 end

Update

Here are some timings for a few of the solutions. Each run goes through about 60,000 records. The times are for ActiveRecord.

Original example (above):
2685.0ms
1238.8ms
1428.0ms

Original example using find_each:
799.6ms
799.4ms
797.8ms

Creating a new column with the sum:
181.3ms
170.7ms
172.2ms

score 6 · Accepted Answer

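You can try denormalizing the database and storing the partial sums directly in the products table. A simple query with where and limit will then return the correct answer immediately.

You would need to create additional filters that update a single record whenever a product is added, and update all products whenever a product is deleted or its points field is changed.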
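A minimal sketch of the bookkeeping this implies, using a plain in-memory stand-in instead of a real table so it runs without a database. The `running_total` column name and the `ProductTable` class are hypothetical, chosen only for illustration; ids are assumed to be assigned sequentially.

```ruby
# In-memory stand-in for a products table with a denormalized running total.
# `running_total` is a hypothetical column name, not from the original post.
Row = Struct.new(:id, :points, :running_total)

class ProductTable
  attr_reader :rows

  def initialize
    @rows = []
  end

  # Adding a product touches a single record: its total is the previous
  # row's total plus its own points.
  def add(points)
    previous_total = @rows.empty? ? 0 : @rows.last.running_total
    row = Row.new(@rows.size + 1, points, previous_total + points)
    @rows << row
    row
  end

  # Changing a product's points forces that row and every later row
  # to have its total recomputed.
  def update_points(id, points)
    index = @rows.index { |row| row.id == id }
    return if index.nil?

    @rows[index].points = points
    total = index.zero? ? 0 : @rows[index - 1].running_total
    @rows[index..-1].each do |row|
      total += row.points
      row.running_total = total
    end
  end

  # Mirrors: WHERE running_total >= level ORDER BY id LIMIT 1
  def first_id_reaching(level)
    row = @rows.find { |r| r.running_total >= level }
    row && row.id
  end
end
```

With a real model, the lookup would presumably look something like `Product.where("running_total >= ?", level).order(:id).first`, assuming such a column exists and is kept up to date.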

score 1 · Accepted Answer

It turns out there actually is a way to do this in SQL. First, let's set up a test environment:

rails new foobar
cd foobar
rails g model Product name:string points:integer
rake db:migrate
rails console

In the Rails console, add some records to the database:

Product.new(name: 'Foo',  points: 1).save!
Product.new(name: 'Bar',  points: 2).save!
Product.new(name: 'Baz',  points: 3).save!
Product.new(name: 'Baf',  points: 4).save!
Product.new(name: 'Quux', points: 5).save!

Now I found a way of getting running totals in SQL in this post. It works like this:

query = <<-SQL
  SELECT *, (
    SELECT SUM(points)
    FROM products
    WHERE id <= p.id
  ) AS total_points
  FROM products p
SQL

Running this query against the test DB gives us:

Product.find_by_sql(query).each do |p|
  puts p.name.ljust(5) + p.points.to_s.rjust(2) + p.total_points.to_s.rjust(3)
end

# Foo   1  1
# Bar   2  3
# Baz   3  6
# Baf   4 10
# Quux  5 15

So we can now use a HAVING clause (and a GROUP BY, because this is needed for HAVING) to fetch only the products that match the condition, and LIMIT the number of results to one:

query = <<-SQL
  SELECT *, (
    SELECT SUM(points)
    FROM products
    WHERE id <= p.id
  ) AS total_points
  FROM products p
  GROUP BY p.id
  HAVING total_points >= #{find_point_level}
  LIMIT 1
SQL

I'm really curious how this performs in your environment with many many records. Give it a try and tell me if it works for you, if you like.

score 0 · Accepted Answer

This does not really solve the problem, but you can use find_each instead of each to load products in batches rather than loading the whole table. See the guides.
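As a rough sketch of that idea: the accumulation loop can be written against any enumerable, so it works the same whether the records come from `each` or from the enumerator that `find_each` returns when called without a block. The `FakeProduct` struct below is a stand-in so the example runs without a database, and `first_id_reaching` is a hypothetical helper name. Unlike the loop in the question, it returns nil when the level is never reached.

```ruby
# Returns the id of the first record at which the cumulative points reach
# `level`, or nil if the total never gets there. `records` can be anything
# that yields objects responding to `id` and `points` in id order.
def first_id_reaching(records, level)
  total = 0
  records.each do |record|
    total += record.points
    # Returning here stops the iteration, so later records are never visited.
    return record.id if total >= level
  end
  nil
end

# Stand-in records (hypothetical) so the sketch runs without a database.
FakeProduct = Struct.new(:id, :points)
products = [1, 2, 3, 4, 5].each_with_index.map do |points, index|
  FakeProduct.new(index + 1, points)
end

first_id_reaching(products, 6)
```

In a Rails app the call would presumably be `first_id_reaching(Product.find_each, 100_000)`; since find_each iterates in primary key order, this matches the `order("id")` in the question.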

EDIT: ignore the following, I forgot that window functions are not permitted in WHERE and HAVING clauses

~~if you are willing to use a non db-agnostic solution, you can use this (not tested):~~

~~query = <<-SQL SELECT id, SUM(points) OVER (ORDER BY id) AS total_points FROM products HAVING total_points >= 100000 LIMIT 1 SQL @product = Product.find_all_by_sql( query )~~

This uses window functions, which are NOT supported by all RDBMS (PostgreSQL supports them). Beware: once you have retrieved the @product, it will be a read-only record with only two attributes accessible: id and total_points.
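One possible way around the restriction mentioned in the edit, offered only as an untested-against-a-database sketch: compute the window function in a derived table and filter on its result in the outer query. The helper name `running_total_query` and the `running` alias are made up for illustration, and the SQL assumes an RDBMS with window function support such as PostgreSQL.

```ruby
# Builds a query that computes the running total in a derived table and
# filters on it in the outer query, where a plain WHERE is allowed.
def running_total_query(level)
  # Integer() raises on non-numeric input, so only a number is interpolated.
  threshold = Integer(level)
  <<~SQL
    SELECT id, total_points
    FROM (
      SELECT id, SUM(points) OVER (ORDER BY id) AS total_points
      FROM products
    ) AS running
    WHERE total_points >= #{threshold}
    ORDER BY id
    LIMIT 1
  SQL
end
```

It could then be run with something like `Product.find_by_sql(running_total_query(100_000)).first`.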

score -2 · Accepted Answer

If the table is large, you can use a plain SQL query:

 find_point_level = 100000
 # Note: LIMIT here caps the number of rows summed, not the point total.
 Product.find_by_sql("SELECT SUM(points) FROM (SELECT points FROM products ORDER BY id LIMIT #{find_point_level}) AS subquery")

The column should also be indexed in the database.
