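I wrote a program that finds the mean and standard deviation of a large data set stored in a separate txt file. I want the program to work with any data set. I tested it with two simple data points (a year and month paired with a temperature):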

2009-11,20
2009-12,10

When I run it, it reports a mean of 20 and a standard deviation of 0, which is clearly wrong.

Here is my program:

data = File.open("test.txt", "r+")
contents = data.read

contents = contents.split("\r\n")

#split up array
contents.collect! do |x|
  x.split(',')
end

sum = 0

contents.each do |x|
  #make loop to find average
  sum = sum  + x[1].to_f
end
avg = sum / contents.length
puts "The average of your large data set is: #{ avg.round(3)} (Answer is rounded to nearest thousandth place)"
#puts average

#similar to finding average, this finds the standard deviation
variance = 0
contents.each do |x|
  variance = variance + (x[1].to_f - avg)**2
end

variance = variance / contents.length
variance = Math.sqrt(variance)
puts "The standard deviation of your large data set is:#{ variance.round(3)} (Answer is rounded to nearest thousandth place)"
1 Answer

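I think the problem comes from splitting the data on \r\n, which depends on the operating system: on Linux it should be contents.split("\n"). Note the double quotes, since '\n' in single quotes is a literal backslash followed by n. If the file has no \r\n in it, the split returns the whole file as a single record, so x[1] becomes "20\n2009-12", to_f reads that as 20.0, and contents.length is 1, which gives exactly a mean of 20 and a standard deviation of 0. Either way, you are better off using IO#each to iterate over each line of the file and letting Ruby handle the line-ending characters.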

# Read-only mode is enough here, since the file is never written to
data = File.open("test.txt", "r")

count = 0
sum = 0
variance = 0

data.each do |line|
  value = line.split(',')[1]
  sum = sum  + value.to_f
  count += 1
end

avg = sum / count
puts "The average of your large data set is: #{ avg.round(3)} (Answer is rounded to nearest thousandth place)"

# We need to get back to the top of the file
data.rewind

data.each do |line|
  value = line.split(',')[1]
  variance = variance + (value.to_f - avg)**2
end

variance = variance / count
variance = Math.sqrt(variance)
puts "The standard deviation of your large data set is: #{ variance.round(3)} (Answer is rounded to nearest thousandth place)"
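
To check that diagnosis without touching a file, here is a minimal sketch that runs both approaches on an in-memory string. It assumes the test file used Unix ("\n") line endings, which the question does not confirm; the `raw` string and the `mean_and_stddev` helper are illustrative names, not part of the original program.

```ruby
# Assumed sample text with Unix ("\n") line endings, mirroring the question's data.
raw = "2009-11,20\n2009-12,10\n"

# Illustrative helper: population mean and standard deviation of an array of floats.
def mean_and_stddev(values)
  mean = values.sum / values.length
  variance = values.sum { |v| (v - mean)**2 } / values.length
  [mean, Math.sqrt(variance)]
end

# "\r\n" never matches, so the whole text stays as one record and
# the second comma-separated field is "20\n2009-12", which to_f reads as 20.0.
broken = raw.split("\r\n").map { |row| row.split(',')[1].to_f }

# each_line lets Ruby find the line boundaries; to_f ignores the trailing newline.
fixed = raw.each_line.map { |line| line.split(',')[1].to_f }

puts "broken: #{mean_and_stddev(broken).inspect}" # prints broken: [20.0, 0.0]
puts "fixed: #{mean_and_stddev(fixed).inspect}"   # prints fixed: [15.0, 5.0]
```

The helper divides by the number of values, so it computes the population standard deviation, the same as the code above. For a sample standard deviation you would divide by the count minus one instead.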
Answered 2013-10-22T08:18:14.797