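Could really use some help here. I'm struggling to display a dashboard backed by a large amount of data.
With about 2k records, it averages about 2 seconds.
The query in the MySQL console takes less than 3.5 seconds to return 150k rows. The same query in Ruby takes over 4 minutes from the time the query is executed until all objects are ready.
Goal: optimize the data handling further before adding a cache server. Using Ruby 1.9.2, Rails 3.0, and MySQL (Mysql2 gem).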
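Questions:
- Does using hashes hurt performance?
- Should I first put everything into one primary hash and then manipulate the data I need from it?
- Is there anything else I can do to help performance?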
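One way to approach the first question is to time the hash-building step on its own, apart from the query. The snippet below is only a rough sketch: the rows are synthetic stand-ins (hypothetical makes and counts, not data from my tables), and `Benchmark.realtime` from the standard library measures just the aggregation loop.

```ruby
require 'benchmark'

# Synthetic rows (hypothetical data), roughly the size of the 6-month sample.
makes = %w[Ford Toyota Honda]
rows = Array.new(100_000) do |i|
  { 'make' => makes[i % makes.size], 'filled' => (i % 5) + 1 }
end

# Tally fill-ups per make; Hash.new(0) makes missing keys start at 0.
cars = Hash.new(0)
elapsed = Benchmark.realtime do
  rows.each { |row| cars[row['make']] += row['filled'] }
end

puts format('aggregated %d rows in %.3f s', rows.size, elapsed)
```

If this loop turns out to be a small fraction of the total time, the overhead is more likely in fetching and converting the rows than in the hashes themselves.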
Rows in the DB:
- GasStations and US Census have about 150,000 records
- People has about 100,000 records
- Cars has about 200,000 records
- FillUps has about 2.3 million records
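Required for the dashboard (queries are based on time periods such as the last 24 hours, the last week, etc.). All data is returned to the JS in JSON format.
- Gas stations, with gas station and US Census data (zip code, name, city, population)
- Top 20 cities with the most fill-ups
- Top 10 cars by fill-ups
- Cars grouped by the number of times they filled their tank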
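To make the "top N" items above concrete, here is a minimal sketch of ranking an already-tallied hash and serializing it for the JS side. It assumes per-city totals shaped like the `cities` hash in my code further down; the city names, numbers, and the `top_n` helper are hypothetical and only for illustration.

```ruby
require 'json'

# Hypothetical per-city totals (made-up values).
city_totals = {
  'Austin'  => { 'fill_ups' => 120, 'population' => 950_000 },
  'Dallas'  => { 'fill_ups' => 340, 'population' => 1_300_000 },
  'El Paso' => { 'fill_ups' => 75,  'population' => 680_000 }
}

# Hypothetical helper: the n entries with the most fill-ups, highest first.
def top_n(totals, n)
  totals.sort_by { |_name, data| -data['fill_ups'] }
        .first(n)
        .map { |name, data| data.merge('city' => name) }
end

top_cities = top_n(city_totals, 2)
puts JSON.generate(top_cities)
```

For the real dashboard `n` would be 20 for cities and 10 for cars. Sorting in Ruby is only reasonable once the data has been reduced to one entry per city; the same ranking could also be done in SQL with ORDER BY and LIMIT.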
Code (6-month sample; returns about 100k+ records):
# For simplicity, the select clause is omitted here. The real query drops columns I don't need (updated_at, gas_station.created_at, etc.) instead of returning every column of each table.
@primary_data = FillUp.includes(:car, :gas_station => :uscensus).where('fill_ups.created_at >= ?', 6.months.ago) # This takes about 4+ minutes
# Then tried
@primary_data = FillUp.find_by_sql('some long sql query...') # Took longer than before.
# Note for others: the SQL query did some preprocessing for me, which added attributes to the result. The query took < 4 seconds in the DB console. With these extra attributes the Ruby call took longer, as if Ruby were checking each row to map the attributes.
# Then tried the following, as seen at http://stackoverflow.com/questions/4456834/ruby-on-rails-storing-and-accessing-large-data-sets
MY_MAP = Hash[ActiveRecord::Base.connection.select_all('SELECT thingone, thingtwo from table').map{|one| [one['thingone'], one['thingtwo']]}]
# That took 23 seconds and also gave me the mapping of additional data that was previously processed later, so much faster.
# Currently using the code below, which takes about 10 seconds.
# Although this is faster, the query itself still only takes 3.5 seconds; parsing the rows into the hashes adds the overhead.
cars = {}
gasstations = {}
cities = {}
filled = {}
client = Mysql2::Client.new(:host => "localhost", :username => "root")
# Mysql2 returns each row as a hash (:as => :hash) or an array (:as => :array).
client.query("SELECT sum(fill_ups_grouped_by_car_id) as filled, fillups.car_id, cars.make as make, gasstations.name as name, ....", :stream => true, :as => :hash).each do |row|
  # Each row holds: fill-ups grouped by car, fill_ups.car_id, car make, gas station name, gas station zip, gas station city, city population
  # Total fill-ups per city, plus the city's population
  if cities[row['city']]
    cities[row['city']]['fill_ups'] = (cities[row['city']]['fill_ups'] + row['filled'])
  else
    cities[row['city']] = {'fill_ups' => row['filled'], 'population' => row['population']}
  end
  # Total fill-ups per gas station (assumes the query aliases the station zip code as zip)
  if gasstations[row['name']]
    gasstations[row['name']]['fill_ups'] = (gasstations[row['name']]['fill_ups'] + row['filled'])
  else
    gasstations[row['name']] = {'city' => row['city'], 'zip' => row['zip'], 'fill_ups' => row['filled']}
  end
  # Total fill-ups per car make
  if cars[row['make']]
    cars[row['make']] = (cars[row['make']] + row['filled'])
  else
    cars[row['make']] = row['filled']
  end
  # Number of cars per fill-up count
  if filled[row['filled']]
    filled[row['filled']] = (filled[row['filled']] + 1)
  else
    filled[row['filled']] = 1
  end
end
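The same tallying can be written more compactly with `Hash.new` defaults, which removes the if/else branches and does one lookup per hash per row. This is only a sketch: the rows are hypothetical stand-ins for what the streamed query yields, the `zip` key is an assumed column alias, and whether it is measurably faster on 100k+ rows would need benchmarking.

```ruby
# Hypothetical rows standing in for the streamed query result.
rows = [
  { 'city' => 'Austin', 'population' => 950_000, 'name' => 'Station A',
    'zip' => '78701', 'make' => 'Ford', 'filled' => 3 },
  { 'city' => 'Austin', 'population' => 950_000, 'name' => 'Station A',
    'zip' => '78701', 'make' => 'Toyota', 'filled' => 5 },
  { 'city' => 'Dallas', 'population' => 1_300_000, 'name' => 'Station B',
    'zip' => '75201', 'make' => 'Ford', 'filled' => 3 }
]

cars   = Hash.new(0) # missing makes start at 0
filled = Hash.new(0) # missing fill-up counts start at 0
# Missing cities/stations get a fresh entry with a zero counter.
cities      = Hash.new { |hash, city| hash[city] = { 'fill_ups' => 0 } }
gasstations = Hash.new { |hash, name| hash[name] = { 'fill_ups' => 0 } }

rows.each do |row|
  count = row['filled']

  city = cities[row['city']]
  city['population'] ||= row['population']
  city['fill_ups'] += count

  station = gasstations[row['name']]
  station['city'] ||= row['city']
  station['zip']  ||= row['zip']
  station['fill_ups'] += count

  cars[row['make']] += count
  filled[count] += 1
end
```

One caveat with default blocks: merely reading a missing key (for example `cities['Houston']`) creates an entry, so lookups after the loop should use `fetch` or `key?`.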
The models are as follows:
class Person < ActiveRecord::Base
  has_many :cars
end
class Car < ActiveRecord::Base
  belongs_to :person
  belongs_to :uscensus, :foreign_key => :zipcode, :primary_key => :zipcode
  has_many :fill_ups
  has_many :gas_stations, :through => :fill_ups
end
class GasStation < ActiveRecord::Base
  belongs_to :uscensus, :foreign_key => :zipcode, :primary_key => :zipcode
  has_many :fill_ups
  has_many :cars, :through => :fill_ups
end
class FillUp < ActiveRecord::Base
  # Log of every time a person fills up their gas
  belongs_to :car
  belongs_to :gas_station
end
class Uscensus < ActiveRecord::Base
  # Basic data about an area, based on zip code
end