arrays - How to push strings into a new array in Ruby

Question

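I want to search a given string for substrings. Each time a substring is contained in the input string, I append it to an array. In the end, I want to tally that array to count how many times each substring occurs.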

The problem is that, in my code, each substring from the dictionary is added to new_array only once.

For example:

dictionary = ["below", "down","go","going","horn","how","howdy","it","i","low","own","part","partner","sit"]

substrings("go going", dictionary)

should output:

{"go"=>2, "going"=>1, "i"=>1}

but I get

{"go"=>1, "going"=>1, "i"=>1}

Here is my code:

def substrings(word, array) 

  new_array = []

  array.each do |index| 

    if word.downcase.include? (index)

      new_array << index

    end
  end

  puts new_array.tally

end

 dictionary = ["below", "down","go","going","horn","how","howdy","it","i","low","own","part","partner","sit"]

 substrings("go going", dictionary)

score 1 · Accepted Answer

It depends on how large your dictionary is.

You can map each dictionary element to its number of occurrences whenever it is present in the word as a substring.

dictionary.map {|w| [w,word.scan(w).size] if word.include?(w)}.compact.to_h
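A minimal sketch that wraps the one-liner above in a method. The name substring_counts is hypothetical, and downcasing the input is an assumption carried over from the word.downcase call in the question. filter_map (Ruby 2.7+) stands in for map followed by compact.

```ruby
# Hypothetical helper; the method name and the downcasing are assumptions.
def substring_counts(word, dictionary)
  text = word.downcase
  dictionary.filter_map do |w|
    count = text.scan(w).size   # a String pattern is matched literally
    [w, count] if count.positive?
  end.to_h
end

dictionary = ["below", "down", "go", "going", "horn", "how", "howdy",
              "it", "i", "low", "own", "part", "partner", "sit"]

p substring_counts("go going", dictionary)
#=> {"go"=>2, "going"=>1, "i"=>1}
```

Counting once with scan and keeping only positive counts avoids running both include? and scan over the same string.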

score 0 · Accepted Answer

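As I understand it, we are given an array dictionary of words that contain no whitespace and a string str, and are to construct a hash whose keys are elements of dictionary and whose values are the number of non-overlapping¹ substrings of str that equal the key. The returned hash should exclude keys whose value is zero.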

This answer addresses the situation where, in the call

substrings(str, dictionary)

dictionary is large, str is not overly large (I elaborate on what that means later), and efficiency matters.

We begin by defining a helper method, whose purpose will become clear.

def substr_counts(str)
  str.split.each_with_object(Hash.new(0)) do |word,h|
    (1..word.size).each do |sub_len|
      (0..word.size-sub_len).each do |start_idx|
        h[word[start_idx,sub_len]] += 1
      end
    end
  end
end

For the example given in the question,

substr_counts("go going")
  #=> {"g"=>3, "o"=>2, "go"=>2, "i"=>1, "n"=>1, "oi"=>1, "in"=>1, "ng"=>1,
  #    "goi"=>1, "oin"=>1, "ing"=>1, "goin"=>1, "oing"=>1, "going"=>1}

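As can be seen, this method splits str into words, computes every substring of each word, and returns a hash whose keys are the substrings and whose values are the total number of non-overlapping occurrences of that substring across all words that contain it.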

The desired hash can now be constructed quickly.

def cover_count(str, dictionary)
  h = substr_counts(str)
  dictionary.each_with_object({}) do |word,g|
    g[word] = h[word] if h.key?(word)
  end
end

dictionary = ["below", "down", "go", "going", "horn", "how", "howdy", 
              "it", "i", "low", "own", "part", "partner", "sit"]

cover_count("go going", dictionary)
  #=> {"go"=>2, "going"=>1, "i"=>1}

Another example:

str = "lowner partnership lownliest"
cover_count(str, dictionary)
  #=> {"i"=>2, "low"=>2, "own"=>2, "part"=>1, "partner"=>1}

Here,

substr_counts(str)
  #=> {"l"=>3, "o"=>2, "w"=>2, "n"=>3, "e"=>3, "r"=>3, "lo"=>2,
  #    ...
  #    "wnliest"=>1, "lownlies"=>1, "ownliest"=>1, "lownliest"=>1} 
substr_counts(str).size
  #=> 109

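There is an obvious trade-off here. If str is long, and especially if it contains long words², building h will take too long to justify the savings from not having to determine, for each word in dictionary, whether that word is contained in each word of str. If building h is worthwhile, however, the overall time savings could be substantial.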
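One way to judge the trade-off for a particular input is to time both approaches. The sketch below uses Benchmark from the standard library; substr_counts and cover_count are copied from this answer, while scan_count and the generated big_dictionary are hypothetical names and test data introduced only for illustration. No timings are claimed here, since they depend on the input and the machine.

```ruby
require "benchmark"

def substr_counts(str)
  str.split.each_with_object(Hash.new(0)) do |word, h|
    (1..word.size).each do |sub_len|
      (0..word.size - sub_len).each do |start_idx|
        h[word[start_idx, sub_len]] += 1
      end
    end
  end
end

def cover_count(str, dictionary)
  h = substr_counts(str)
  dictionary.each_with_object({}) do |word, g|
    g[word] = h[word] if h.key?(word)
  end
end

# Hypothetical direct approach: scan str once per dictionary word.
def scan_count(str, dictionary)
  dictionary.each_with_object({}) do |word, g|
    n = str.scan(word).size
    g[word] = n if n.positive?
  end
end

# Hypothetical test data: all one- and two-letter strings plus a few words.
big_dictionary = (("a".."zz").to_a + %w[go going low own part partner]).uniq
str = "go going"

by_table = by_scan = nil
t_table = Benchmark.realtime { by_table = cover_count(str, big_dictionary) }
t_scan  = Benchmark.realtime { by_scan  = scan_count(str, big_dictionary) }

puts "same result: #{by_table == by_scan}"
puts format("table: %.6fs, scan: %.6fs", t_table, t_scan)
```

Varying the length of str and the size of big_dictionary should show where building the table starts or stops paying off.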

1. By "non-overlapping" I mean that if str equals 'bobobo' it contains one, not two, substrings 'bobo'.

2. substr_counts("antidisestablishmentarianism").size #=> 385, which is not too bad.
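The 'bobobo' case from footnote 1 can be checked directly. In this sketch, substr_counts is copied from the answer; run as written, its sliding window appears to count every start position, so overlapping occurrences are included, whereas String#scan resumes after each match and counts only non-overlapping ones. Which behavior is wanted is an assumption the caller has to settle.

```ruby
# substr_counts copied from the answer above.
def substr_counts(str)
  str.split.each_with_object(Hash.new(0)) do |word, h|
    (1..word.size).each do |sub_len|
      (0..word.size - sub_len).each do |start_idx|
        h[word[start_idx, sub_len]] += 1
      end
    end
  end
end

p substr_counts("bobobo")["bobo"]   #=> 2 (start positions 0 and 2)
p "bobobo".scan("bobo").size        #=> 1 (scan resumes after each match)
```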

score 0 · Accepted Answer

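Only "go", "going" and "i" from the dictionary are substrings of the phrase. Each of those words appears only once in the dictionary. So new_array contains exactly ["go", "going", "i"], which tallies to {"go"=>1, "going"=>1, "i"=>1}.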
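The difference can be seen in a short sketch (the variable name phrase is just for illustration): include? only answers yes or no, so each dictionary word is pushed at most once, while scan returns one element per occurrence.

```ruby
phrase = "go going"

p phrase.include?("go")    #=> true  (a yes/no answer, so "go" is pushed once)
p phrase.scan("go")        #=> ["go", "go"]
p phrase.scan("go").size   #=> 2
```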

I assume you expected go to appear twice because it occurs twice in your phrase. In that case, you can change the method to

def substrings(word, array) 
  new_array = []
  array.each do |index| 
    word.scan(/#{index}/).each { new_array << index }
  end
  puts new_array.tally
end

word.scan(/#{index}/) returns every occurrence of the substring in the phrase.
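One caveat with interpolating a dictionary entry into a regular expression: characters such as "." are treated as regex syntax. A small sketch, assuming a hypothetical entry "a.b" that is not in the question's dictionary; Regexp.escape, or passing the String itself to scan, matches the entry literally.

```ruby
text  = "a.b axb"
entry = "a.b"   # hypothetical dictionary entry containing a regex metacharacter

p text.scan(/#{entry}/)                  #=> ["a.b", "axb"]  ("." matches any character)
p text.scan(/#{Regexp.escape(entry)}/)   #=> ["a.b"]
p text.scan(entry)                       #=> ["a.b"]  (String patterns are literal)
```

For the purely alphabetic words in the question's dictionary, all three forms give the same result.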

score 0 · Accepted Answer

Another option is to use Array#product after splitting the word, so that you can use Enumerable#tally as you wanted:

word = "go going"
word.split.product(dictionary).select { |a, b| a.include? b }.map(&:last).tally

#=> {"go"=>2, "going"=>1, "i"=>1}

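The output is different when word = "gogoing", because it is split into a one-element array. So I can't say whether this is the behavior you are looking for.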
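To make that difference concrete, here is a sketch comparing the product approach with a scan-based count on "gogoing". The variable names by_product and by_scan are illustrative only.

```ruby
dictionary = ["below", "down", "go", "going", "horn", "how", "howdy",
              "it", "i", "low", "own", "part", "partner", "sit"]
word = "gogoing"

# One pair per (word piece, dictionary entry); each pair counts at most once.
by_product = word.split.product(dictionary)
                 .select { |a, b| a.include?(b) }
                 .map(&:last).tally

# scan counts every non-overlapping occurrence inside the piece.
by_scan = dictionary.to_h { |w| [w, word.scan(w).size] }
                    .reject { |_, n| n.zero? }

p by_product  #=> {"go"=>1, "going"=>1, "i"=>1}
p by_scan     #=> {"go"=>2, "going"=>1, "i"=>1}
```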

score 0 · Accepted Answer

You have to count how many times each dictionary entry appears in the string, so use scan:

def substrings(word, array) 

  hash = {}

  array.each do |index| 
    if word.downcase.include?(index)
      # Count on the downcased word so the check and the count agree.
      new_hash = {index => word.downcase.scan(/#{index}/).length}
      hash.merge!(new_hash) 
    end
  end

  puts hash 

end

score 0 · Accepted Answer

You can use scan to count how many times each substring occurs.

def substrings(word, array)
  output = {}
  array.each do |index|
     count_substring_appears = word.scan(index).size
     if count_substring_appears > 0
       output[index] = count_substring_appears
     end
  end

  output
end

score 0 · Accepted Answer

I would start with this:

dictionary = %w[down go going it i]
target = 'go going'

dictionary.flat_map { |w|
  target.scan(Regexp.new(w, Regexp::IGNORECASE))
}.reject(&:empty?).tally
# => {"go"=>2, "going"=>1, "i"=>1}
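With Regexp::IGNORECASE, scan returns the matched text in its original case, so mixed-case input can end up under separate keys. A sketch assuming a hypothetical mixed-case target; downcasing the matches before tally merges them.

```ruby
dictionary = %w[down go going it i]
target = 'Go going'   # assumed mixed-case input, not from the question

matches = dictionary.flat_map { |w|
  target.scan(Regexp.new(w, Regexp::IGNORECASE))
}

p matches.tally                   #=> {"Go"=>1, "go"=>1, "going"=>1, "i"=>1}
p matches.map(&:downcase).tally   #=> {"go"=>2, "going"=>1, "i"=>1}
```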
