join - Optimizing the writing of join table records

Question

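The application is based on PostGIS and stores data using RGeo's simple_mercator_factory.

Polygon records are created, and their association with points is static (i.e. it never needs updating). To reduce the overhead of PostGIS computations, it makes sense to populate a join table with the points that belong to each polygon and to search that join table through a B-tree index (instead of an R-tree).

The problem is creating the join records efficiently. At present: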

@line_string1 = RGeo::Geographic.simple_mercator_factory.line_string([@point_a, @point_b, @point_c, @point_d])
@points = Point.all
@points_in ||= []
@points.each do |point|
  this_point = point.lonlat
  @this_poly = RGeo::Geographic.simple_mercator_factory.polygon(@line_string1)
  if this_point.intersects?(@this_poly)
    # One join record (and one INSERT) per matching point
    @add_point = Pointpolygon.new(:point_id => point.id, :polygon_id => @polygon.id)
    @add_point.save
  end
end

The query plan is acceptable:

EXPLAIN for: SELECT "point".* FROM "points"
                         QUERY PLAN
-------------------------------------------------------------
 Seq Scan on points  (cost=0.00..210.10 rows=8110 width=99)
(1 row)

However, the @add_point call clocks in at between 14 and 16 ms. For a set of 83 records, we are looking at roughly 1.6 seconds. But the total does not match:
Completed 302 Found in 7796.9ms (ActiveRecord: 358.5ms)
Running a separate method that performs the same query (same plan and timing) without writing the join records completes in
Completed 200 OK in 1317.5ms (Views: 49.8ms | ActiveRecord: 64.0ms)
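Two questions. The more mundane one: apart from development-mode conditions, why does the total balloon so much? I was expecting about 3 seconds (1.6 + 1.3).

But more importantly, is there a way to write the join table records more efficiently, perhaps in a separate thread (after_update?), considering that 1000 records may be written at a time?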

score 0 · Accepted Answer

As @Jakub correctly pointed out, a method that extracts all the valid points in one go:

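def valid_points
  # Join points against this polygon's row; assumes the table is named "polygons" (Rails convention)
  Point.order("id").joins("INNER JOIN polygons ON polygons.id=#{id} AND st_contains(polygons.poly, points.lonlat)").all
end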

which is then invoked by the controller

  @valid_points = @polygon.valid_points
  @valid_points.each do |point|
    @add_point = Pointpolygon.new(:point_id => point.id, :polygon_id => @polygon.id)
    @add_point.save
  end

yields much better response times. For test cases with up to 1000 matches, in development mode, each record takes between 1.2 and 1.4 ms to create.
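If the per-record saves are still too slow at the 1000-record scale, one possible next step is to let the database both find and write the rows in a single statement. The following is only a sketch, not something from the original post: the helper name join_insert_sql is hypothetical, and the table names pointpolygons, points and polygons are assumed from Rails naming conventions. The columns poly, lonlat, point_id and polygon_id are the ones used above.

```ruby
# Sketch: build one INSERT ... SELECT statement so PostGIS finds the
# contained points and the join rows are written in a single round trip.
# Assumptions (not confirmed by the post): tables are named
# "pointpolygons", "points" and "polygons"; join_insert_sql is a
# hypothetical helper name.
def join_insert_sql(polygon_id)
  # Integer() raises on non-numeric input, so the interpolation below
  # cannot carry arbitrary SQL.
  id = Integer(polygon_id)
  <<~SQL
    INSERT INTO pointpolygons (point_id, polygon_id)
    SELECT points.id, polygons.id
    FROM points
    INNER JOIN polygons
      ON polygons.id = #{id}
     AND ST_Contains(polygons.poly, points.lonlat)
  SQL
end
```

In a Rails app the resulting string could be passed to ActiveRecord::Base.connection.execute. This bypasses model validations and callbacks, and if the join table has non-null created_at/updated_at columns they would have to be added to the column list.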
