ruby-on-rails - In Ruby on Rails, aggregating data from a file before saving it to the database

Question

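I have a conceptual question: I'm trying to write code that downloads logs from S3, parses them, and stores some of the data in a database inside a Rails application.

Since this is entirely internal, I have a single model that contains the code needed to download and parse the logs. My main parsing method opens a file and iterates over each line, parsing out the data I want to save to the database.

My goal is to aggregate all of the data in one file (which contains multiple log lines) and then save it to the database.

What I'm struggling to grasp is how to aggregate the data before saving it to the database in Rails.

For example, if I have the following logs: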

log/account/6 100
log/account/7 250
log/account/6 50
log/account/5 100

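My goal is to iterate over all the lines and save the total amount for each account ID, so for account 6 I want to save 150 as the sum. For some reason I can only picture one database entry per log line, rather than aggregating the log lines in the file and turning them into a single database entry.

Current parsing process: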

    def self.create_from_log_file(file)
      File.foreach(file) do |line|
        line_match = S3_LINE_REGEXP.match(line) # get the MatchData
        captures = Hash[line_match.names.zip(line_match.captures)] # convert the MatchData to a hash of key/value pairs (both strings)
        validate_log_file(captures["timestamp"]) # validate file is unique
        captures["http_status"] != "200" # figure out if API request was an HTTP 200 (captures are strings; the result is not used yet)
        current_account = extract_account_id(captures["request_path"]) # extract account id and find that account
        account_log = S3Log.new # instantiate a new S3Log instance
        account_log.account_id = Account.find_by_id(current_account) # assign the S3Log object its account id
        account_log.total_bytes = calculate_total_bytes_for_file(captures["bytes_sent"]) # assign the log bytes to that account's total for the file
        account_log.total_requests = calculate_total_requests_for_file(account_log.account_id) # calculate total requests for that account on the file
        account_log.date = Date.parse(captures["timestamp"])
        account_log.save! # saved inside the block, where account_log is in scope: one record per line
      end
    end

score 0 · Accepted Answer

Some high-level pointers. First, since your code is likely to be a long-running job, it may be worth running it as a background job using Resque or Sidekiq.

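Second, break your work down into small, well-defined functions, and write tests for those smaller functions. You can then combine them into larger pieces with confidence; in other words, practice functional decomposition. Alternatively, go the OO way: create one model to encapsulate the parsing logic, another to represent a line of interest, and possibly a third to represent a collection of lines on which aggregate methods can be performed.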
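As a rough illustration of that decomposition, here is a minimal plain-Ruby sketch that parses each line with one small function and sums the amounts per account with another, so that nothing is written to the database until the whole file has been read. The names `LINE_REGEXP`, `parse_line` and `aggregate` are hypothetical, and the line format is assumed from the simplified example in the question rather than from real S3 logs.

```ruby
# Hypothetical line format, assumed from the question's example:
# "<path>/<account id> <amount>"
LINE_REGEXP = %r{/(?<account_id>\d+)\s+(?<amount>\d+)\s*\z}

# Parse one line into [account_id, amount], or nil when the line does not match.
def parse_line(line)
  match = LINE_REGEXP.match(line)
  return nil unless match

  [match[:account_id].to_i, match[:amount].to_i]
end

# Sum the amounts per account across all lines. No database access happens here.
def aggregate(lines)
  totals = Hash.new(0) # missing keys start at 0
  lines.each do |line|
    account_id, amount = parse_line(line)
    next if account_id.nil? # skip lines that did not parse

    totals[account_id] += amount
  end
  totals
end

lines = [
  "log/account/6 100",
  "log/account/7 250",
  "log/account/6 50",
  "log/account/5 100"
]

totals = aggregate(lines)
# totals now maps account 6 to 150, account 7 to 250 and account 5 to 100.

# In the Rails app, one record per account could be written after the loop,
# assuming S3Log has account_id and total_bytes attributes as in the question:
# totals.each { |account_id, bytes| S3Log.create!(account_id: account_id, total_bytes: bytes) }
```

Because `parse_line` and `aggregate` do not depend on Rails or on a file handle, each can be tested on its own with a few strings, and `aggregate` accepts anything that responds to `each`, such as `File.foreach(path)`.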

Hope this helps.
