3

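I am running a command-line program called Primer 3. It takes an input file and writes its results to standard output. I am trying to write a Ruby script that reads that output and puts the entries into a hash.

The output is shown below. I would like to split each line on the '=' sign so that I end up with something like this: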

{:SEQUENCE_ID => "example", :SEQUENCE_TEMPLATE => "GTAGTCAGTAGACNAT..etc", :SEQUENCE_TARGET => "37,21" etc }

I would also like to lowercase the keys, i.e.:

 {:sequence_id => "example", :sequence_template => "GTAGTCAGTAGACNAT..etc", :sequence_target => "37,21" etc }

This is my current script:

#!/usr/bin/ruby
puts 'Primer 3 hash'

primer3 = {}
while line = gets do
  name, height = line.split(/\=/)
  primer3[name] = height.to_i
end

puts primer3

It returns:

Primer 3 hash
{"SEQUENCE_ID"=>0, "SEQUENCE_TEMPLATE"=>0, "SEQUENCE_TARGET"=>37, "PRIMER_TASK"=>0,     "PRIMER_PICK_LEFT_PRIMER"=>1, "PRIMER_PICK_INTERNAL_OLIGO"=>1,  "PRIMER_PICK_RIGHT_PRIMER"=>1, "PRIMER_OPT_SIZE"=>18, "PRIMER_MIN_SIZE"=>15, "PRIMER_MAX_SIZE"=>21, "PRIMER_MAX_NS_ACCEPTED"=>1, "PRIMER_PRODUCT_SIZE_RANGE"=>75, "P3_FILE_FLAG"=>1, "SEQUENCE_INTERNAL_EXCLUDED_REGION"=>37, "PRIMER_EXPLAIN_FLAG"=>1, "PRIMER_THERMODYNAMIC_PARAMETERS_PATH"=>0, "PRIMER_LEFT_EXPLAIN"=>0, "PRIMER_RIGHT_EXPLAIN"=>0, "PRIMER_INTERNAL_EXPLAIN"=>0, "PRIMER_PAIR_EXPLAIN"=>0, "PRIMER_LEFT_NUM_RETURNED"=>0, "PRIMER_RIGHT_NUM_RETURNED"=>0, "PRIMER_INTERNAL_NUM_RETURNED"=>0, "PRIMER_PAIR_NUM_RETURNED"=>0, ""=>0}

Data source:

SEQUENCE_ID=example
SEQUENCE_TEMPLATE=GTAGTCAGTAGACNATGACNACTGACGATGCAGACNACACACACACACACAGCACACAGGTATTAGTGGGCCATTCGATCCCGACCCAAATCGATAGCTACGATGACG
SEQUENCE_TARGET=37,21
PRIMER_TASK=pick_detection_primers
PRIMER_PICK_LEFT_PRIMER=1
PRIMER_PICK_INTERNAL_OLIGO=1
PRIMER_PICK_RIGHT_PRIMER=1
PRIMER_OPT_SIZE=18
PRIMER_MIN_SIZE=15
PRIMER_MAX_SIZE=21
PRIMER_MAX_NS_ACCEPTED=1
PRIMER_PRODUCT_SIZE_RANGE=75-100
P3_FILE_FLAG=1
SEQUENCE_INTERNAL_EXCLUDED_REGION=37,21
PRIMER_EXPLAIN_FLAG=1
PRIMER_THERMODYNAMIC_PARAMETERS_PATH=/usr/local/Cellar/primer3/2.3.4/bin/primer3_config/
PRIMER_LEFT_EXPLAIN=considered 65, too many Ns 17, low tm 48, ok 0
PRIMER_RIGHT_EXPLAIN=considered 228, low tm 159, high tm 12, high hairpin stability 22, ok 35
PRIMER_INTERNAL_EXPLAIN=considered 0, ok 0
PRIMER_PAIR_EXPLAIN=considered 0, ok 0
PRIMER_LEFT_NUM_RETURNED=0
PRIMER_RIGHT_NUM_RETURNED=0
PRIMER_INTERNAL_NUM_RETURNED=0
PRIMER_PAIR_NUM_RETURNED=0
=

$ primer3_core < example2 | ruby /Users/sean/Dropbox/bin/rb/read_primer3.rb
4

3 Answers

4
#!/usr/bin/ruby
puts 'Primer 3 hash'

primer3 = {}
while line = gets do
  key, value = line.split(/=/, 2)
  primer3[key.downcase.to_sym] = value.chomp
end

puts primer3
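
A note on the loop above: the last line of the data is a lone `=`, so it ends up stored under an empty key (the `""=>0` entry in the question's output shows the same thing). Here is a minimal sketch of one way to skip it, and to skip any line that has no `=` at all. The helper name `parse_key_value_lines` and the sample strings are made up for illustration; they are not part of Primer 3 or of the answer.

```ruby
require 'stringio'

# Hypothetical helper: parse "KEY=value" lines from any IO-like object
# into a hash with lowercase symbol keys.
def parse_key_value_lines(io)
  result = {}
  io.each_line do |line|
    # The limit of 2 keeps any further '=' characters inside the value.
    key, value = line.chomp.split('=', 2)
    # Skip blank lines, lines without '=', and the lone "=" terminator line.
    next if key.nil? || key.empty? || value.nil?
    result[key.downcase.to_sym] = value
  end
  result
end

# Made-up sample input; StringIO stands in for STDIN here.
sample = StringIO.new("SEQUENCE_ID=example\nSEQUENCE_TARGET=37,21\n=\n")
p parse_key_value_lines(sample)
```

Whether dropping the terminator line is what you want depends on how the hash is used afterwards; if the input can contain several records, it may be better to treat `=` as a record boundary instead.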
Answered 2013-06-15T11:33:48.220
4

Just for fun, here are a couple of purely functional solutions. Both assume that you have already read the data from the file, e.g.

my_data = ARGF.read # read the file passed on the command line

This one feels a bit gross, but it is a (long) one-liner :)

hash = Hash[ my_data.lines.map{ |line|
  line.chomp.split('=',2).map.with_index{ |s,i| i==0 ? s.downcase.to_sym : s }
} ]

This one is two lines, but it feels cleaner than using with_index:

keys,values = my_data.lines.map{ |line| line.chomp.split('=',2) }.transpose
hash = Hash[ keys.map(&:downcase).map(&:to_sym).zip(values) ]

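Compared with the answer you have already accepted, both of these are probably less efficient and will certainly use more memory; iterating over the lines and gradually mutating your hash is the best approach. These non-mutating variations are just a mental exercise.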
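
In the same spirit, on Ruby 2.6 or later the non-mutating idea can also be sketched with `Array#to_h` and a block, which builds the hash directly from key/value pairs. The `my_data` string below is a made-up stand-in for `ARGF.read`, and the sketch assumes every line contains an `=`.

```ruby
# Stand-in for ARGF.read; sample values only.
my_data = "SEQUENCE_ID=example\nSEQUENCE_TARGET=37,21\n"

# The block returns a [key, value] pair for each line,
# and to_h collects those pairs into a hash.
hash = my_data.lines.to_h do |line|
  key, value = line.chomp.split('=', 2)
  [key.downcase.to_sym, value]
end

p hash
```

Like the two variants above, this reads all of the data into memory first, so it is still an exercise rather than a recommendation.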


Your final answer should use ARGF so that it accepts either a filename on the command line or data via STDIN. I would write it like this:

#!/usr/bin/ruby

module Primer3
  def self.parse( file )
    {}.tap do |primer3|
      # Process one line at a time, without reading it all into memory first
      file.each_line do |line|  
        key, value = line.chomp.split('=', 2)
        primer3[key.downcase.to_sym] = value
      end
    end
  end
end

Primer3.parse( ARGF ) if __FILE__==$0

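This way you can either invoke the file from the command line, with or without STDIN, or you can require this file and use the module function it defines in other code.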
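
As a rough illustration of the second use, here is a sketch of calling the module function from other code. The module is repeated so the snippet runs on its own; in practice it would be loaded with `require` or `require_relative`. The `StringIO` input and its values are made up for the example.

```ruby
require 'stringio'

# Same module as above, repeated only so this sketch is self-contained.
module Primer3
  def self.parse(file)
    {}.tap do |primer3|
      file.each_line do |line|
        key, value = line.chomp.split('=', 2)
        primer3[key.downcase.to_sym] = value
      end
    end
  end
end

# StringIO stands in for an open file or for ARGF; anything with each_line works.
io = StringIO.new("SEQUENCE_ID=example\nPRIMER_OPT_SIZE=18\n")
settings = Primer3.parse(io)

puts settings[:sequence_id]
# Values are kept as strings, so convert explicitly where a number is needed.
puts settings[:primer_opt_size].to_i + 1
```

Keeping every value as a string avoids the problem in the question, where calling `to_i` on everything turned the non-numeric values into 0.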

Answered 2013-06-15T15:45:06.233
-1

OK, I have it (almost). The only problem is that it adds a \n to the end of each value.

puts 'Primer 3 hash'

primer3 = {}
while line = gets do
  key, value = line.split(/\=/)
  puts key
  puts value
  primer3[key.downcase] = value
end

puts primer3

{"sequence_id"=>"example\n",  "sequence_template"=>"GTAGTCAGTAGACNATGACNACTGACGATGCAGACNACACACACACACACAGCACACAGGTATTAGTGGGCCATTCGATCCCGACCCAAATCGATAGCTACGATGACG\n", "sequence_target"=>"37,21\n", "primer_task"=>"pick_detection_primers\n", "primer_pick_left_primer"=>"1\n", "primer_pick_internal_oligo"=>"1\n", "primer_pick_right_primer"=>"1\n", "primer_opt_size"=>"18\n", "primer_min_size"=>"15\n", "primer_max_size"=>"21\n", "primer_max_ns_accepted"=>"1\n", "primer_product_size_range"=>"75-100\n", "p3_file_flag"=>"1\n", "sequence_internal_excluded_region"=>"37,21\n", "primer_explain_flag"=>"1\n", "primer_thermodynamic_parameters_path"=>"/usr/local/Cellar/primer3/2.3.4/bin/primer3_config/\n", "primer_left_explain"=>"considered 65, too many Ns 17, low tm 48, ok 0\n", "primer_right_explain"=>"considered 228, low tm 159, high tm 12, high hairpin stability 22, ok 35\n", "primer_internal_explain"=>"considered 0, ok 0\n", "primer_pair_explain"=>"considered 0, ok 0\n", "primer_left_num_returned"=>"0\n", "primer_right_num_returned"=>"0\n", "primer_internal_num_returned"=>"0\n", "primer_pair_num_returned"=>"0\n", ""=>"\n"}
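
The trailing "\n" is there because `gets` returns each line with its line terminator still attached. A small sketch of the two usual ways to remove it, using made-up sample strings:

```ruby
raw = "considered 0, ok 0\n"

chomped  = raw.chomp  # removes only the trailing line terminator
stripped = raw.strip  # removes leading and trailing whitespace as well

# The difference shows up when a value has surrounding spaces.
padded = "  37,21 \n"
p padded.chomp
p padded.strip
```

`chomp` is the more conservative choice if spaces inside a value might be significant.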
Answered 2013-06-15T11:32:16.880