Given the following code:

require 'optparse'

options = {}
optparse = OptionParser.new do |opts|
    opts.on('-t', '--thing [THING1,THING2]', Array, 'Set THING1, THING2') do |t|
      options[:things] = t
    end
end

If THING1 contains a comma, how can I prevent OptionParser from splitting on it?

Example: ./scrit.rb -t 'foo,bar',baz. In this case I want options[:things] to be ['foo,bar', 'baz'].

Is this even possible?

2 Answers

If you run:

./scrit.rb -t 'foo,bar',baz

then the ARGV handed over by the shell is:

["-t", "foo,bar,baz"]

The shell strips the quotes, turning 'foo,bar',baz into foo,bar,baz before the script ever sees it:

$ strace -e trace=execve ./scrit.rb -t 'foo,bar',baz
execve("./scrit.rb", ["./scrit.rb", "-t", "foo,bar,baz"], [/* 52 vars */]) = 0
execve("/home/scuawn/bin/ruby", ["ruby", "./scrit.rb", "-t", "foo,bar,baz"], [/* 52 vars */]) = 0

You could use a different delimiter:

  opts.on('-t', '--thing [THING1,THING2]', Array, 'Set THING1, THING2') do |t|
    options[:things] = t
    options[:things][0] = options[:things][0].split(":")
  end

$ ./scrit.rb -t foo:bar,baz
[["foo", "bar"], "baz"]

Or:

  opts.on('-t', '--thing [THING1,THING2]', Array, 'Set THING1, THING2') do |t|
    options[:things] = t
    options[:things] = options[:things].length == 3 ? [[options[:things][0],options[:things][1]],options[:things][2]] : options[:things]
  end

$ ./scrit.rb -t foo,bar,baz
[["foo", "bar"], "baz"]

answered 2011-07-20T02:17:43.957

First, the shell [1] yields the same final value for all of the following quoting variations:

./scrit.rb -t 'foo,bar',baz
./scrit.rb -t foo,'bar,baz'
./scrit.rb -t 'foo,bar,baz'
./scrit.rb -t foo,bar,baz
./scrit.rb -t fo"o,b"ar,baz
./scrit.rb -t foo,b\ar,baz
# obviously many more variations are possible

You can verify this like so:

ruby -e 'f=ARGV[0];ARGV.each_with_index{|a,i|puts "%u: %s <%s>\n" % [i,a==f,a]}'\
 'foo,bar',baz foo,'bar,baz' 'foo,bar,baz' foo,bar,baz fo"o,b"ar,baz foo,b\ar,baz

[1] I am assuming a Bourne-like shell (some sh variant such as zsh, bash, ksh, dash, et cetera).


If you want to switch to some other separator, you could do it like this:

split_on_semicolons = Object.new
OptionParser.accept split_on_semicolons do |s,|
  s.split ';'
end
⋮
opts.on('-t', '--thing [THING1;THING2]', split_on_semicolons, 'Set THING1, THING2 (semicolon must be quoted to protect it from the shell)') do |t|
  options[:things] = t
end

The shell gives the semicolon a special meaning, so it must be escaped or quoted; otherwise it acts as an unconditional command separator (e.g. echo foo; sleep 2; echo bar):

./scrit.rb -t foo,bar\;baz
./scrit.rb -t foo,bar';'baz
./scrit.rb -t 'foo,bar;baz'
# et cetera
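
As a rough end-to-end sketch, the semicolon converter can be exercised without involving the shell at all by handing OptionParser#parse an explicit argument array. The local names parser and argv are made up for this illustration, and the array stands in for what the script would receive after the shell has removed the quoting.

```ruby
require 'optparse'

# Marker object; it only serves as the lookup key for the conversion below.
split_on_semicolons = Object.new
OptionParser.accept(split_on_semicolons) do |s,|
  s.split(';')
end

options = {}
parser = OptionParser.new do |opts|
  opts.on('-t', '--thing [THING1;THING2]', split_on_semicolons,
          'Set THING1, THING2') do |t|
    options[:things] = t
  end
end

# Assumed argument list, i.e. what arrives for: ./scrit.rb -t 'foo,bar;baz'
argv = ['-t', 'foo,bar;baz']
parser.parse(argv)

p options[:things]  # expected: ["foo,bar", "baz"]
```

Because the comma is no longer a separator here, it passes through untouched and only the semicolon splits the value.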

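The “parsing” done when you specify Array is almost exactly a basic str.split(',') (it also drops empty string values), so there is no way to directly specify an escape character.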
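
A small sketch of that limitation, assuming the argument foo\,bar,baz has already made it past the shell intact: the Array conversion still splits at every comma and simply leaves the backslash attached to the first piece. The names parser and options are only for this illustration.

```ruby
require 'optparse'

options = {}
parser = OptionParser.new do |opts|
  opts.on('-t', '--thing [THING1,THING2]', Array, 'Set THING1, THING2') do |t|
    options[:things] = t
  end
end

# Single-quoted Ruby string, so the backslash is a literal character:
# the argument is  foo\,bar,baz
parser.parse(['-t', 'foo\,bar,baz'])

# The backslash did not protect the comma; three values come back.
options[:things].each { |thing| puts thing }
```

The first value ends in a stray backslash, which is the only trace left of the intended escape.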

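If you want to stick with commas but introduce an “escape character”, then you can post-process the values in your OptionParser#on block to stitch certain values back together: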

# use backslash as an after-the-fact escape character
# in a sequence of string values,
#   if a value ends with an odd number of backslashes, then
#     the last backslash should be replaced with
#     a comma concatenated with the next value
#   a backslash before any other single character is removed
# 
# basic unsplit: (note doubled backslashes due to writing these as Ruby values)
#     %w[foo\\ bar baz] => %w[foo,bar baz]
#
# escaped, trailing backslash is not an unsplit:
#     %w[foo\\\\ bar baz] => %w[foo\\ bar baz]
#
# escaping [other, backslash, split], also consecutive unsplits
#     %w[f\\o\\\\o\\ \\\\\\bar\\\\\\ baz] => %w[fo\\o,\\bar\\,baz]

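def unsplit_and_unescape(orig_values)
  values = []
  incomplete_value = nil
  orig_values.each do |val|
    # an odd number of trailing backslashes means the comma after this value was escaped
    incomplete = /\\*$/.match(val)[0].length.odd?
    # remove the backslash in front of any other character (without mutating the input)
    val = val.gsub(/\\(.)/, '\1')
    val = incomplete_value + ',' + val if incomplete_value
    if incomplete
      # drop the trailing backslash and wait for the next value
      incomplete_value = val[0..-2]
    else
      values << val
      incomplete_value = nil
    end
  end
  raise ArgumentError, 'Incomplete final value' if incomplete_value
  values
end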
⋮
opts.on('-t', '--thing [THING1,THING2]', Array, 'Set THING1, THING2 (use \\, to include a comma)') do |t|
  options[:things] = unsplit_and_unescape(t)
end

Then you can run it from the shell like this (the backslash is also special to the shell, so it must be escaped or quoted [2]):

./scrit.rb -t foo\\,bar,baz
./scrit.rb -t 'foo\,bar,baz'
./scrit.rb -t foo'\,'bar,baz
./scrit.rb -t "foo\\,bar,baz"
./scrit.rb -t fo"o\\,ba"r,baz
# et cetera

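[2] Unlike in Ruby, the shell's single quotes are completely literal (e.g. backslashes are not interpreted), so they are often a good choice when you need to embed other shell-special characters (like backslashes and double quotes).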
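
One way to sanity-check those five spellings without a shell is Ruby's Shellwords module, which splits a string using Bourne-shell-like quoting rules. This is only an approximation of a real shell, and the names command_lines and things_args are made up for the sketch.

```ruby
require 'shellwords'

# The five command lines from above, kept in a quoted heredoc so that
# Ruby itself leaves the backslashes and quotes untouched.
command_lines = <<'EOS'.lines.map(&:chomp)
./scrit.rb -t foo\\,bar,baz
./scrit.rb -t 'foo\,bar,baz'
./scrit.rb -t foo'\,'bar,baz
./scrit.rb -t "foo\\,bar,baz"
./scrit.rb -t fo"o\\,ba"r,baz
EOS

# Take the last word of each line, i.e. the argument that follows -t.
things_args = command_lines.map { |line| Shellwords.split(line).last }
things_args.each { |arg| puts arg }
```

Every line should print the same argument, foo\,bar,baz, with exactly one backslash left for the post-processing step to interpret.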

answered 2011-07-20T10:29:41.257