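So, first of all, I think it's worth being explicit that what you're asking for is the largest phrases, for lack of a better word. The largest substrings I see in your example array are actually "carflam f" and " peanut butter".

If `ary` is a known quantity in whatever class you're using, feel free to change the parameters: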

    def get_array_of_phrases_larger_than(ary, min)
      all = []

      # Ugly, but this will span the range of possible phrases for each item in the
      # array, building them into a one-dimensional array if they meet the minimum
      # length requirements
      ary.each do |phrase|
        words = phrase.split
        last = words.length - 1
        (0..last).each do |from|
          (from..last).each do |to|
            candidate = words[from..to].join(" ")
            all << candidate if candidate.size > min
          end
        end
      end

      # Get a list of all repeated keys
      repeated = all.group_by(&:to_s).select { |_, v| v.size > 1 }
      keys = repeated.keys

      # Get a list of the longest keys, such that we exclude "peanut" and "butter"
      # if "peanut butter" exists
      longest = repeated.select do |key, _|
        keys.select { |k| k.include?(key) }.size == 1
      end

      # Sort in reverse order by length
      longest.keys.sort_by { |k| -k.size }
    end

    @ary = ["carflam fizz peanut butter", "fizz foo", "carflam foo peanut butter"]
    get_array_of_phrases_larger_than @ary, 3
    # => ["peanut butter", "carflam", "fizz"]

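Note that this doesn't take into account which string a phrase came from, so you can get false positives: `["butter butter", "foo", "baz"]` returns `["butter"]`, because "butter" is counted twice within a single string. I'll leave that as an exercise for the reader.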
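
For what it's worth, one possible way to handle that false positive is to count each phrase at most once per source string, so a phrase only qualifies as repeated when it shows up in at least two different strings. This is only a sketch under that assumption; the method name `get_phrases_shared_across_strings` is made up for illustration and isn't part of the code above.

```ruby
# Hypothetical variant: a phrase counts as repeated only if it occurs in at
# least two different source strings, not twice inside the same one.
def get_phrases_shared_across_strings(ary, min)
  all = []

  ary.each do |phrase|
    words = phrase.split
    last = words.length - 1
    candidates = []

    (0..last).each do |from|
      (from..last).each do |to|
        candidate = words[from..to].join(" ")
        candidates << candidate if candidate.size > min
      end
    end

    # Count each phrase at most once per source string
    all.concat(candidates.uniq)
  end

  # Same filtering as before: keep repeated phrases, then drop any phrase
  # that is contained in a longer repeated phrase
  repeated = all.group_by(&:itself).select { |_, v| v.size > 1 }
  keys = repeated.keys

  longest = repeated.select do |key, _|
    keys.select { |k| k.include?(key) }.size == 1
  end

  longest.keys.sort_by { |k| -k.size }
end

get_phrases_shared_across_strings ["butter butter", "foo", "baz"], 3
# => []
```

The only real change is the per-string `uniq` before the candidates are merged into `all`; everything after that is the same logic, so the original example array still gives the same result.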