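I have the following regular expression and code to extract emails into an array. It works, but it doesn't seem optimal to me. Any suggestions on how I could improve it?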

@emails = []
matches = @text_document.scan(/\+'(\S+@\S+|\{(?:\w+, *)+\w+\}@[\w.-]+)'/i)
matches.each {|m| m[0].split(',').each {|email| @emails << email  }  }

Specifically, I'm looking for something better than the nested each.

Cheers

EDIT: To be completely fair, since I liked both answers, I gave them both a fair run. Since concat is both faster and shorter, I'm marking it as the answer.

require 'benchmark'

CONSTANT = 1
BenchTimes = 1_000_000
EMAILS = "+'one.emaili@domain.com,another.email@domain.se'"

def email
end

def bm_concat
  emails = []
  EMAILS.scan(/\+'(\S+@\S+|\{(?:\w+, *)+\w+\}@[\w.-]+)'/i) do |matches|
    matches.each {|m| emails.concat(m.split(','))}
  end

end

def bm_inject
  emails = []
  EMAILS.scan(/\+'(\S+@\S+|\{(?:\w+, *)+\w+\}@[\w.-]+)'/i) do |matches|
    matches.inject([]) {|arr, mails| emails.concat(mails.split(',')) }
  end

end

Benchmark.bmbm do |bm|
  bm.report("inject:") { BenchTimes.times { bm_inject } }
  bm.report("concat:") { BenchTimes.times { bm_concat } }
end

This produces the following output:

Rehearsal -------------------------------------------
inject:  11.030000   0.060000  11.090000 ( 11.145898)
concat:   9.660000   0.050000   9.710000 (  9.761068)
--------------------------------- total: 20.800000sec

              user     system      total        real
inject:  11.620000   0.060000  11.680000 ( 11.795601)
concat:  10.510000   0.050000  10.560000 ( 10.678999)

2 Answers

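You can refactor the matches.each as follows. Array#concat appends every element of the split result in place, so the inner each is no longer needed: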

matches.each {|m| @emails.concat(m[0].split(','))}
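As a self-contained sketch of that refactoring: the helper name `extract_emails`, the constant `EMAIL_REGEX`, and the sample string are hypothetical stand-ins for the question's `@emails` and `@text_document`, not part of the original code.

```ruby
# The regular expression from the question, stored in a hypothetical constant.
EMAIL_REGEX = /\+'(\S+@\S+|\{(?:\w+, *)+\w+\}@[\w.-]+)'/i

# Hypothetical helper wrapping the refactored loop.
def extract_emails(text)
  emails = []
  # scan returns one array per match; m[0] is the captured,
  # comma-separated address string.
  text.scan(EMAIL_REGEX).each { |m| emails.concat(m[0].split(',')) }
  emails
end

# Made-up sample text standing in for @text_document.
sample = "Send to +'a@example.com,b@example.com' and +'c@example.org' today"
p extract_emails(sample)
```

With this sample the two quoted groups should yield three separate addresses in a single flat array.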
Answered 2012-07-31T14:26:59.433

Use inject - http://ruby-doc.org/core-1.9.3/Enumerable.html#method-i-inject

@emails = matches.inject([]) do |emails, input| 
  emails += input.first.split(',')
end

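FYI, about the variables passed to the block: emails is the accumulator, which starts out as the empty array passed to inject, and input is each element of matches in turn as it is iterated.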
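To make the accumulator behaviour concrete, here is a minimal sketch. The method name `collect_with_inject` and the literal `matches` array are assumptions for illustration; the array only mimics the shape that scan returns for a single capture group. The block's return value becomes the accumulator for the next iteration, which is why a plain `+` behaves the same as the `+=` used above.

```ruby
# Hypothetical helper: flattens scan-style matches using inject.
def collect_with_inject(matches)
  matches.inject([]) do |emails, input|
    # Whatever the block returns is passed in as `emails` next time.
    emails + input.first.split(',')
  end
end

# Made-up data shaped like the result of scan with one capture group.
matches = [["a@example.com,b@example.com"], ["c@example.org"]]
p collect_with_inject(matches)
```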

EDIT (how to use inject):

REGEX = /\+'(\S+@\S+|\{(?:\w+, *)+\w+\}@[\w.-]+)'/i
def bm_inject
  emails = EMAILS.scan(REGEX).inject([]) do |arr, mails| 
    arr.concat mails.first.split(',')
  end
end
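A runnable sketch for checking that this inject version returns the same addresses as the concat loop. It assumes the `EMAILS` string from the question's benchmark; `bm_concat` here is a variant of the question's method with an explicit return value added so the two results can be compared.

```ruby
REGEX  = /\+'(\S+@\S+|\{(?:\w+, *)+\w+\}@[\w.-]+)'/i
EMAILS = "+'one.emaili@domain.com,another.email@domain.se'"

# inject version: the accumulator `arr` carries the result.
# concat mutates `arr` in place and returns it, so no new array
# is allocated on each iteration.
def bm_inject
  EMAILS.scan(REGEX).inject([]) do |arr, mails|
    arr.concat(mails.first.split(','))
  end
end

# concat version, modified to return the collected array.
def bm_concat
  emails = []
  EMAILS.scan(REGEX) do |matches|
    matches.each { |m| emails.concat(m.split(',')) }
  end
  emails
end

p bm_inject
p bm_inject == bm_concat
```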
Answered 2012-07-31T16:22:57.173