
I'm reading in a file's contents and creating hashes from them. I've been able to make a hash from each individual line, but how can I create a single hash from everything up to the next blank line? Below is what I have so far.

Input:

Title   49th parallel
URL     http://artsweb.bham.ac.uk/
Domain  artsweb.bham.ac.uk

Title   ABAA booknet
URL     http://abaa.org/
Domain  abaa.org

Code:

File.readlines('A.cfg').each do |line|
  unless line.strip.empty?
    hash = Hash[*line.strip.split("\t")]
    puts hash
  end
  puts "\n" if line.strip.empty?
end

Outputs:

{"Title"=>"49th parallel"}
{"URL"=>"http://artsweb.bham.ac.uk/"}
{"Domain"=>"artsweb.bham.ac.uk"}

{"Title"=>"ABAA booknet"}
{"URL"=>"http://abaa.org/"}
{"Domain"=>"abaa.org"}

Desired Output:

{"Title"=>"49th parallel", "URL"=>"http://artsweb.bham.ac.uk/", "Domain"=>"artsweb.bham.ac.uk"}

{"Title"=>"ABAA booknet", "URL"=>"http://abaa.org/", "Domain"=>"abaa.org"}

3 Answers


Modifying your existing code, this does what you want:

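hash = {}
File.readlines('A.cfg').each do |line|
  if line.strip.empty?
    # A blank line ends the current record: print it and start a new one.
    puts hash unless hash.empty?
    hash = {}
    puts "\n"
  else
    # Add this line's key/value pair to the current record.
    hash.merge!(Hash[*line.strip.split("\t")])
  end
end

# Print the last record, which has no blank line after it.
puts hash unless hash.empty?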

You can likely simplify that depending on what you're actually doing with the data.
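
For instance, if you want to collect the records instead of printing them, a minimal sketch could look like the following. The helper name `parse_records` and the inline sample (standing in for `A.cfg`) are assumptions for illustration, and it splits each line on the first run of whitespace rather than strictly on a tab.

```ruby
# Hypothetical helper: groups consecutive non-blank lines into one hash each.
def parse_records(lines)
  lines
    .chunk { |line| !line.strip.empty? }   # runs of non-blank (true) / blank (false) lines
    .select { |non_blank, _| non_blank }   # keep only the non-blank runs
    .map do |_, group|
      # Split each line into key and value at the first run of whitespace.
      # Assumes every non-blank line has both a key and a value.
      group.map { |line| line.strip.split(/\s+/, 2) }.to_h
    end
end

# Inline sample standing in for the contents of A.cfg (assumed tab-separated).
sample = <<~CFG
  Title\t49th parallel
  URL\thttp://artsweb.bham.ac.uk/
  Domain\tartsweb.bham.ac.uk

  Title\tABAA booknet
  URL\thttp://abaa.org/
  Domain\tabaa.org
CFG

records = parse_records(sample.lines)
records.each { |record| puts record }
```

With a real file, `parse_records(File.readlines('A.cfg'))` should give the same array of hashes.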

answered 2013-04-29T18:12:09.300
File.read('A.cfg')
.strip.split(/\n{2,}/)
.map{|s| Hash[s.scan(/^(\S+)[ \t]+(.+)$/)]}

gives

[
  {
    "Title"  => "49th parallel",
    "URL"    => "http://artsweb.bham.ac.uk/",
    "Domain" => "artsweb.bham.ac.uk"
  },
  {
    "Title"  => "ABAA booknet",
    "URL"    => "http://abaa.org/",
    "Domain" => "abaa.org"
  }
]
answered 2013-04-29T18:41:29.230

Read the whole content of the file using read:

contents = ""
File.open('A.cfg') do |file|
  contents = file.read
end

And then split the contents on two newline characters:

contents.split("\n\n")

And lastly, create a function pretty similar to what you already have to parse those chunks.

Please note that if you are working on Windows, you may need to split on a different sequence, such as "\r\n\r\n", because of the carriage return character.
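
Putting those steps together, a rough sketch might look like this. The helper name `parse_chunk` is made up for illustration, the inline string stands in for the result of reading `A.cfg`, and the split pattern is one assumed way to accept both "\n" and "\r\n" line endings.

```ruby
# Hypothetical helper: turns one blank-line-separated chunk into a hash.
def parse_chunk(chunk)
  # Split each line into key and value at the first run of whitespace.
  pairs = chunk.lines.map { |line| line.strip.split(/\s+/, 2) }
  # Skip lines that do not have both a key and a value.
  pairs.select { |pair| pair.size == 2 }.to_h
end

# Inline sample standing in for File.read('A.cfg'); uses Windows line endings.
contents = "Title\t49th parallel\r\nURL\thttp://artsweb.bham.ac.uk/\r\n\r\n" \
           "Title\tABAA booknet\r\nURL\thttp://abaa.org/\r\n"

# Two or more consecutive line breaks, each with an optional "\r".
chunks = contents.strip.split(/(?:\r?\n){2,}/)
records = chunks.map { |chunk| parse_chunk(chunk) }
records.each { |record| puts record }
```

The non-capturing group `(?:...)` matters here: with a capturing group, `split` would also return the matched separators.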

answered 2013-04-29T18:01:56.990