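I am writing a TF-IDF program. Most of it should be fine, but I have a small (or large...) problem getting hashes to work as expected.

To keep it short, the code at hand is: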

#Word matrix is an array that contains hashes (obviously)
#i've done some stuff before this and these are working as expected
puts word_matrix[3][:yahoo] # => 2
puts word_matrix[100][:yahoo] # => 0
puts $total_words_hash[:yahoo] #=> 0 

#Essentially, this block takes a hash of all the words (values = 0) and tries
#to run through them, adding only the values of the other hash to the temporary,
#and then setting the temp to the old hash position (so that there are 0 values
#and the values occurring in that document). Yet it assigns the same values to
#ALL of the hashes of word_matrix[]

#now we run this block and everything breaks down for some reason..
for i in 0...word_matrix.size
  tmp_complete_words_hash = $total_words_hash #all values should be zero...
  word_matrix[i].each do |key,val| #for each key in the hash we do this..
    tmp_complete_words_hash[key] = val
  end
  word_matrix[i] = tmp_complete_words_hash
end
puts word_matrix[3][:yahoo] # => 2
puts word_matrix[100][:yahoo] # => 2 -- THIS SHOULD BE 0 Still...

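Can anyone explain why the same values are being assigned to all of the hashes in the array? It is as if tmp_complete_words_hash is not being reset each time.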

3 Answers


You need to clone the hash.

tmp_complete_words_hash = $total_words_hash.clone

Otherwise, both variables point to the same hash, and you keep modifying that one hash.
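
A minimal sketch of the corrected loop, using a made-up two-document word_matrix and a local total_words hash in place of $total_words_hash (the helper name and the sample data are assumptions for illustration, not the asker's real code). Each document starts from a fresh copy of the template, so counts from one document cannot leak into the next:

```ruby
# Hypothetical helper: pads every document hash with the zero-valued
# vocabulary, copying the template once per document.
def pad_with_zeros(word_matrix, total_words)
  word_matrix.map do |doc_counts|
    padded = total_words.clone            # fresh copy for this document
    doc_counts.each { |word, count| padded[word] = count }
    padded
  end
end

total_words = { yahoo: 0, google: 0 }     # sample template, all zero
word_matrix = [{ yahoo: 2 }, { google: 1 }]

padded = pad_with_zeros(word_matrix, total_words)
puts padded[0][:yahoo]    # => 2
puts padded[1][:yahoo]    # => 0
puts total_words[:yahoo]  # => 0 (the template is untouched)
```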

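In fact, every variable in Ruby holds a reference to an object, so this applies to most objects, including strings, which are mutable. Only immutable values such as integers, symbols, nil, true and false appear to behave differently, because they cannot be modified in place.

Try this in IRB: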

class MyClass
    attr_accessor :value
end

x = MyClass.new
y = x
x.value = "OK"
puts y.value # => OK (x and y refer to the same object)
answered 2013-07-09T01:35:02.543

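Why are the same values being assigned to all of the hashes in the array?

There is only one hash. You are assigning the same hash (the one pointed to by $total_words_hash) to every element of the array: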

tmp_complete_words_hash = $total_words_hash

Here, you make tmp_complete_words_hash point to the same object as $total_words_hash.

word_matrix[i] = tmp_complete_words_hash

Here, you assign that hash to every element of the array.
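
To make this visible, here is a small reproduction of the loop from the question. The helper name and the two-element sample array are assumptions for illustration. Object#equal? checks object identity, so it shows that every element ends up being the very same hash:

```ruby
# Hypothetical reproduction of the loop from the question (bug left in).
def alias_all_rows(word_matrix, total_words)
  word_matrix.each_index do |i|
    tmp = total_words                     # no copy: tmp IS total_words
    word_matrix[i].each { |word, count| tmp[word] = count }
    word_matrix[i] = tmp
  end
  word_matrix
end

total_words = { yahoo: 0 }
rows = alias_all_rows([{ yahoo: 2 }, {}], total_words)

puts rows[0].equal?(rows[1])      # => true (same object)
puts rows[1].equal?(total_words)  # => true
puts rows[1][:yahoo]              # => 2, not 0
```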

answered 2013-07-09T01:36:35.190

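When you assign one hash variable to another, both variables reference the same memory location, so any change made through one is reflected in the other.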

total_words_hash = {}
tmp_complete_words_hash = total_words_hash
1.9.3 (main):0 > total_words_hash.object_id
=> 85149660
1.9.3 (main):0 > tmp_complete_words_hash.object_id
=> 85149660
total_words_hash[:test] = 0
1.9.3 (main):0 > tmp_complete_words_hash
=> {
    :test => 0
}
1.9.3 (main):0 > tmp_complete_words_hash[:test_reverse] = 1
=> 1
1.9.3 (main):0 > tmp_complete_words_hash
=> {
      :test => 0,
      :test_reverse => 1
}

So you can use the hash method dup to create a duplicate hash for this purpose:

1.9.3 (main):0 > tmp_complete_words_hash = total_words_hash.dup
1.9.3 (main):0 > total_words_hash.object_id
=> 85149660
1.9.3 (main):0 > tmp_complete_words_hash.object_id
=> 97244920

In your case, just use:

tmp_complete_words_hash = $total_words_hash.dup
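
As an alternative sketch (not taken from the answers here), Hash#merge returns a new hash instead of modifying its receiver, so the zero-filled template can be combined with each document without an explicit dup. The helper name and sample data below are assumptions:

```ruby
# Hypothetical helper built on Hash#merge, which leaves the receiver untouched.
def fill_missing_words(word_matrix, total_words)
  word_matrix.map { |doc_counts| total_words.merge(doc_counts) }
end

total_words = { yahoo: 0, google: 0 }
filled = fill_missing_words([{ yahoo: 2 }, {}], total_words)

puts filled[0][:yahoo]            # => 2
puts filled[1][:yahoo]            # => 0
puts filled[0].equal?(filled[1])  # => false (separate objects)
```

Note that dup, clone and merge all make shallow copies. That is enough here because the values are integers; nested arrays or hashes would still be shared between the copies.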
answered 2013-07-09T07:17:05.497