REMOVE_WORDS_ARRAY = ["llc", "co", "corp", "inc", "the"]

businesses_array = import_csv.import('businesses.csv')
print businesses_array
# [["the bakery", "10012"], ["law office inc", "10014"]]

businesses_hashes = []
our_hash = {}

businesses_array.each do |business|
  our_hash['BusinessName']  = business[0].strip unless business[0].nil?
  our_hash['BusinessZipCode'] = business[1].strip unless business[1].nil?

  our_hash.each {|key, value|
    our_hash[key] = value.downcase!
    our_hash[key] = (value.split(' ') - REMOVE_WORDS_ARRAY) # only this part doesn't get updated. why?
    our_hash[key] = value.gsub(' ', '+')
  }
  businesses_hashes << our_hash  
  our_hash = {}
end

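When I print our_hash, I can see that the name has been lowercased and the + has been added, but the words have not been removed. What am I missing?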


2 Answers


Well, it does get updated, but then the value is overwritten.

our_hash[key] = value.downcase! # destructive operation, value mutates in-place
our_hash[key] = (value.split(' ') - REMOVE_WORDS_ARRAY) # remove words and set to hash
our_hash[key] = value.gsub(' ', '+') # use downcased value from the first step, not from the second

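Comment out the third line and you'll see. Also, the second line returns an array. Did you forget to add .join(' ') at the end?

How to fix it? Do it all in one fluid motion :)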
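
To make the overwrite visible, here is a minimal sketch that runs the three steps one at a time and keeps each result. The sample string and the result hash are assumptions made up for illustration; only REMOVE_WORDS_ARRAY comes from the question.

```ruby
REMOVE_WORDS_ARRAY = ["llc", "co", "corp", "inc", "the"]

value  = "The Bakery".dup  # hypothetical sample input (dup so it is safe to mutate)
result = {}

# Step 1: downcase! mutates value in place, so value is now "the bakery".
# Note that downcase! returns nil when the string was already lowercase.
result[:step1] = value.downcase!

# Step 2: the words are removed, but the result is a new Array; value is untouched.
result[:step2] = value.split(' ') - REMOVE_WORDS_ARRAY

# Step 3: starts again from value ("the bakery"), not from the Array in step 2.
result[:step3] = value.gsub(' ', '+')

p result
```

Because all three lines assign to the same key in the question, only the last result survives, which is why the removed words seem to come back.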

our_hash[key] = (value.downcase.split(' ') - REMOVE_WORDS_ARRAY).join('+')
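
As a rough sketch of how that one-liner could be applied to the whole list: the helper name normalize and the use of map are assumptions for illustration, not part of the original code, and the sample rows are taken from the comment in the question.

```ruby
REMOVE_WORDS_ARRAY = ["llc", "co", "corp", "inc", "the"]

# Hypothetical helper wrapping the one-liner above.
# split(' ') also discards leading and trailing whitespace, so no strip is needed.
def normalize(value)
  (value.downcase.split(' ') - REMOVE_WORDS_ARRAY).join('+')
end

# Sample rows as shown in the question's comment
businesses_array = [["the bakery", "10012"], ["law office inc", "10014"]]

# to_s turns a nil cell into an empty string instead of raising
businesses_hashes = businesses_array.map do |name, zip|
  { 'BusinessName' => normalize(name.to_s), 'BusinessZipCode' => normalize(zip.to_s) }
end

p businesses_hashes
```

Building a fresh hash per row with map also avoids having to reset our_hash by hand at the end of each iteration.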
answered 2013-01-23T16:32:36.300

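The problems are:

  • You are replacing a string with an array, and then trying to do string operations on it.
  • (As Sergio points out,) you go back to the original value each time, so the previous operations become irrelevant.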

There are several other problems as well. Better code would be:

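RemoveWordsRegex = Regexp.union(REMOVE_WORDS_ARRAY.map{|s| /\b#{s}\b/})

businesses_array.each do |name, zip|
  # to_s turns nil into ""; dup keeps the input (or a frozen string) from being mutated
  hash = {"BusinessName" => name.to_s.dup, "BusinessZipCode" => zip.to_s.dup}
  hash.each_value{|value|
    value.downcase!
    value.gsub!(RemoveWordsRegex, "")  # remove the unwanted words first
    value.strip!                       # then trim the whitespace they leave behind
    value.gsub!(/\s+/, "+")
  }
  businesses_hashes << hash  # push the hash itself, not the array that each returns
end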
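
To illustrate what the word-boundary regex does, here is a small sketch. The sample string and the local variable names are assumptions for illustration; the pattern is built the same way as RemoveWordsRegex above.

```ruby
REMOVE_WORDS_ARRAY = ["llc", "co", "corp", "inc", "the"]

# One alternation that matches any listed word, but only as a whole word
remove_words_regex = Regexp.union(REMOVE_WORDS_ARRAY.map { |s| /\b#{s}\b/ })

sample   = "the costco co"                      # hypothetical input
stripped = sample.gsub(remove_words_regex, "")  # the "co" inside "costco" is kept
cleaned  = stripped.strip.gsub(/\s+/, "+")      # trim leftover spaces, then join with +

p stripped
p cleaned
```

Removing a word this way leaves its surrounding whitespace behind, so trimming after the removal rather than before it avoids a stray + at either end.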
answered 2013-01-23T18:11:39.347