
I'm fetching data from an API that returns XML, like this:

<?xml version="1.0" encoding="utf-8" ?> <seriess realtime_start="2013-01-28" realtime_end="2013-01-28"> <series id="GDPC1" realtime_start="2013-01-28" realtime_end="2013-01-28" title="Real Gross Domestic Product, 1 Decimal" observation_start="1947-01-01" observation_end="2012-07-01" frequency="Quarterly" frequency_short="Q" units="Billions of Chained 2005 Dollars" units_short="Bil. of Chn. 2005 $" seasonal_adjustment="Seasonally Adjusted Annual Rate" seasonal_adjustment_short="SAAR" last_updated="2012-12-20 08:16:28-06" popularity="93" notes="Real gross domestic product is the inflation adjusted value of the goods and services produced by labor and property located in the United States. For more information see the Guide to the National Income and Product Accounts of the United States (NIPA) - (http://www.bea.gov/national/pdf/nipaguid.pdf)"/> </seriess>

I'm new to deserialization, but I think the appropriate approach is to parse this XML into a Ruby object that I can then reference like objectFoo.seriess.series.frequency, which would return "Quarterly".

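From my searches here and on Google, there doesn't seem to be an obvious solution for this in Ruby (not Rails), which makes me think I'm missing something fairly obvious. Any ideas?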

Edit: I set up a test case based on Winfield's suggestion.

class Exopenstruct

  require 'ostruct'

  def initialize()  

  hash = {"seriess"=>{"realtime_start"=>"2013-02-01", "realtime_end"=>"2013-02-01", "series"=>{"id"=>"GDPC1", "realtime_start"=>"2013-02-01", "realtime_end"=>"2013-02-01", "title"=>"Real Gross Domestic Product, 1 Decimal", "observation_start"=>"1947-01-01", "observation_end"=>"2012-10-01", "frequency"=>"Quarterly", "frequency_short"=>"Q", "units"=>"Billions of Chained 2005 Dollars", "units_short"=>"Bil. of Chn. 2005 $", "seasonal_adjustment"=>"Seasonally Adjusted Annual Rate", "seasonal_adjustment_short"=>"SAAR", "last_updated"=>"2013-01-30 07:46:54-06", "popularity"=>"93", "notes"=>"Real gross domestic product is the inflation adjusted value of the goods and services produced by labor and property located in the United States.\n\nFor more information see the Guide to the National Income and Product Accounts of the United States (NIPA) - (http://www.bea.gov/national/pdf/nipaguid.pdf)"}}}

  object_instance = OpenStruct.new( hash )

  end
end

In irb, I loaded the .rb file and instantiated the class. However, when I try to access an attribute (e.g. instance.seriess), I get: NoMethodError: undefined method `seriess'

Apologies again if I'm missing something obvious.


4 Answers

16

You're probably best off using a standard XML-to-Hash parser, such as the one included in Rails:

object_hash = Hash.from_xml(xml_string)
puts object_hash['seriess']

If you aren't using the Rails stack, you can use a library such as Nokogiri to get the same behavior.
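
As a rough illustration of XML-to-Hash parsing outside Rails, here is a minimal sketch. It swaps in REXML (shipped with Ruby as a bundled gem) for Nokogiri so that it runs without extra gems. The helper names element_to_hash and xml_to_hash, and the shortened SAMPLE_XML, are made up for this example. Unlike Hash.from_xml it does not handle repeated elements or type casting.

```ruby
require 'rexml/document'

# Convert an element into a Hash of its attributes and child elements.
# Simplification: repeated child names overwrite each other, and text is
# only kept for elements that have no attributes and no children.
def element_to_hash(element)
  hash = {}
  element.attributes.each { |name, value| hash[name] = value }
  element.elements.each { |child| hash[child.name] = element_to_hash(child) }
  hash.empty? ? element.text.to_s.strip : hash
end

# Parse an XML string and key the result by the root element's name.
def xml_to_hash(xml_string)
  root = REXML::Document.new(xml_string).root
  { root.name => element_to_hash(root) }
end

# Shortened version of the XML from the question.
SAMPLE_XML = <<~XML
  <?xml version="1.0" encoding="utf-8" ?>
  <seriess realtime_start="2013-01-28" realtime_end="2013-01-28">
    <series id="GDPC1" frequency="Quarterly" popularity="93"/>
  </seriess>
XML

object_hash = xml_to_hash(SAMPLE_XML)
puts object_hash['seriess']['series']['frequency'] # prints "Quarterly"
```

All values come back as strings here; converting "93" to an Integer or the dates to Date objects would be an extra step.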

Edit: If you're looking for object-like behavior, wrapping the hash in an OpenStruct is a good approach:

object_instance = OpenStruct.new( Hash.from_xml(xml_string) )
puts object_instance.seriess

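Note: For deeply nested data, you may also need to recursively convert the embedded hashes into OpenStruct instances. That is, if an attribute above holds a hash of values, it will be a Hash and not an OpenStruct.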
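
The recursive conversion mentioned in the note could look roughly like this. It is a minimal sketch: deep_ostruct is a hypothetical helper name, and SAMPLE_HASH is a trimmed-down copy of the hash from the question.

```ruby
require 'ostruct'

# Recursively wrap nested hashes (and hashes inside arrays) in OpenStruct.
def deep_ostruct(value)
  case value
  when Hash
    OpenStruct.new(value.transform_values { |inner| deep_ostruct(inner) })
  when Array
    value.map { |item| deep_ostruct(item) }
  else
    value
  end
end

# Trimmed-down version of the hash from the question.
SAMPLE_HASH = {
  'seriess' => {
    'realtime_start' => '2013-02-01',
    'series' => { 'id' => 'GDPC1', 'frequency' => 'Quarterly' }
  }
}.freeze

data = deep_ostruct(SAMPLE_HASH)
puts data.seriess.series.frequency # prints "Quarterly"
```

With a plain OpenStruct.new(SAMPLE_HASH), data.seriess would be a Hash, so data.seriess.series would raise NoMethodError.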

answered 2013-01-30T18:46:42.603
4

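I've just started using Damien Le Berrigaud's fork of HappyMapper and I'm really happy with it. You define simple Ruby classes and include HappyMapper. When you call parse, it uses Nokogiri to slurp in the XML, and you get back a complete tree of real Ruby objects.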

I've used it to parse multi-megabyte XML files and found it to be fast and dependable. Check out the README.

One hint: since the encoding declared in an XML file sometimes lies, you may need to sanitize your XML like this:

def sanitize(xml)
  xml.encode('UTF-8', 'binary', invalid: :replace, undef: :replace, replace: '')
end

before passing it to the #parse method, to avoid Nokogiri's "Input is not proper UTF-8, indicate encoding !" error.
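
One thing to be aware of with the sanitizer above: because it treats the input as raw binary, it drops every non-ASCII byte, including valid multi-byte UTF-8 characters. The sketch below compares it with String#scrub (Ruby 2.1+), which removes only malformed byte sequences. scrub_invalid is a hypothetical helper name, and whether this trade-off suits your data is an assumption to check against your own files.

```ruby
# The sanitizer from above: treats the input as raw bytes, so every byte
# outside ASCII is dropped, including valid multi-byte UTF-8 characters.
def sanitize(xml)
  xml.encode('UTF-8', 'binary', invalid: :replace, undef: :replace, replace: '')
end

# Hypothetical alternative: keep valid UTF-8, drop only malformed bytes.
def scrub_invalid(xml)
  xml.dup.force_encoding(Encoding::UTF_8).scrub('')
end

valid  = "<t>caf\u00E9</t>"   # well-formed UTF-8
broken = "<t>caf\xE9</t>".b   # a lone Latin-1 byte, not valid UTF-8

puts sanitize(valid)          # prints "<t>caf</t>"  (the valid character is lost)
puts scrub_invalid(valid)     # prints "<t>café</t>"
puts scrub_invalid(broken)    # prints "<t>caf</t>"
```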

Update

I went ahead and cast the OP's example into HappyMapper:

XML_STRING = '<?xml version="1.0" encoding="utf-8" ?> <seriess realtime_start="2013-01-28" realtime_end="2013-01-28"> <series id="GDPC1" realtime_start="2013-01-28" realtime_end="2013-01-28" title="Real Gross Domestic Product, 1 Decimal" observation_start="1947-01-01" observation_end="2012-07-01" frequency="Quarterly" frequency_short="Q" units="Billions of Chained 2005 Dollars" units_short="Bil. of Chn. 2005 $" seasonal_adjustment="Seasonally Adjusted Annual Rate" seasonal_adjustment_short="SAAR" last_updated="2012-12-20 08:16:28-06" popularity="93" notes="Real gross domestic product is the inflation adjusted value of the goods and services produced by labor and property located in the United States. For more information see the Guide to the National Income and Product Accounts of the United States (NIPA) - (http://www.bea.gov/national/pdf/nipaguid.pdf)"/> </seriess>'

class Series; end;              # fwd reference

class Seriess
  include HappyMapper
  tag 'seriess'

  attribute :realtime_start, Date
  attribute :realtime_end, Date
  has_many :seriess, Series, :tag => 'series'
end
class Series
  include HappyMapper
  tag 'series'

  attribute 'id', String
  attribute 'realtime_start', Date
  attribute 'realtime_end', Date
  attribute 'title', String
  attribute 'observation_start', Date
  attribute 'observation_end', Date
  attribute 'frequency', String
  attribute 'frequency_short', String
  attribute 'units', String
  attribute 'units_short', String
  attribute 'seasonal_adjustment', String
  attribute 'seasonal_adjustment_short', String
  attribute 'last_updated', DateTime
  attribute 'popularity', Integer
  attribute 'notes', String
end

def test
  Seriess.parse(XML_STRING, :single => true)
end

And this is what you can do with it:

>> a = test
>> a.class
Seriess
>> a.seriess.first.frequency
=> "Quarterly"
>> a.seriess.first.observation_start
=> #<Date: 1947-01-01 ((2432187j,0s,0n),+0s,2299161j)>
>> a.seriess.first.popularity
=> 93
answered 2013-10-30T14:14:07.053
1

Nokogiri takes care of the parsing. How you handle the data is up to you; here I use OpenStruct as an example:

require 'nokogiri'
require 'ostruct'
require 'open-uri'

# URI.open comes from open-uri; plain open() no longer accepts URLs as of Ruby 3.0
doc = Nokogiri.parse URI.open('http://www.w3schools.com/xml/note.xml')

note = OpenStruct.new

note.to = doc.at('to').text
note.from = doc.at('from').text
note.heading = doc.at('heading').text
note.body = doc.at('body').text

=> #<OpenStruct to="Tove", from="Jani", heading="Reminder", body="ToveJaniReminderDon't forget me this weekend!\r\n">

This is just a teaser; your problem may be many times larger. It's only meant to give you a head start.


Edit: While stumbling around Google and Stack Overflow, I came across a possible hybrid between my answer and @Winfield's, using Rails' Hash#from_xml:

> require 'active_support/core_ext/hash/conversions'
> xml = Nokogiri::XML.parse(URI.open('http://www.w3schools.com/xml/note.xml'))
> Hash.from_xml(xml.to_s)
=> {"note"=>{"to"=>"Tove", "from"=>"Jani", "heading"=>"Reminder", "body"=>"Don't forget me this weekend!"}}

You can then use this hash to initialize a new ActiveRecord::Base model instance, or do whatever else you decide with it.
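
To illustrate initializing an object from such a hash without pulling in ActiveRecord, here is a minimal sketch that uses a plain Struct as a stand-in for a model class. Note and build_note are hypothetical names, and the hash literal simply repeats the Hash.from_xml output shown above.

```ruby
# Hypothetical stand-in for a model class; keyword_init lets it be built
# from a hash of attributes, much like a model's constructor.
Note = Struct.new(:to, :from, :heading, :body, keyword_init: true)

# Struct keyword arguments must be symbols, so convert the string keys.
def build_note(attributes)
  Note.new(**attributes.transform_keys(&:to_sym))
end

# The hash shown in the Hash.from_xml output above.
parsed = {
  'note' => {
    'to' => 'Tove', 'from' => 'Jani',
    'heading' => 'Reminder', 'body' => "Don't forget me this weekend!"
  }
}

note = build_note(parsed['note'])
puts note.heading # prints "Reminder"
```

An unknown key in the hash raises ArgumentError, which can be a useful early warning when the XML gains a field the class does not know about.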

http://nokogiri.org/
http://ruby-doc.org/stdlib-1.9.3/libdoc/ostruct/rdoc/OpenStruct.html
https://stackoverflow.com/a/7488299/1740079

answered 2013-01-30T19:06:03.553
0

If you want to convert XML to a hash, I've found the nori gem to be the simplest.

Example:

require 'nori'

xml = '<?xml version="1.0" encoding="utf-8" ?> <seriess realtime_start="2013-01-28" realtime_end="2013-01-28"> <series id="GDPC1" realtime_start="2013-01-28" realtime_end="2013-01-28" title="Real Gross Domestic Product, 1 Decimal" observation_start="1947-01-01" observation_end="2012-07-01" frequency="Quarterly" frequency_short="Q" units="Billions of Chained 2005 Dollars" units_short="Bil. of Chn. 2005 $" seasonal_adjustment="Seasonally Adjusted Annual Rate" seasonal_adjustment_short="SAAR" last_updated="2012-12-20 08:16:28-06" popularity="93" notes="Real gross domestic product is the inflation adjusted value of the goods and services produced by labor and property located in the United States. For more information see the Guide to the National Income and Product Accounts of the United States (NIPA) - (http://www.bea.gov/national/pdf/nipaguid.pdf)"/> </seriess>'

hash = Nori.new.parse(xml)    
hash['seriess']
hash['seriess']['series']
puts hash['seriess']['series']['@frequency']

Note the "@" on frequency: it is there because frequency is an attribute of "series" rather than an element.
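
If the "@" prefixes get in the way, they can be stripped after parsing. The sketch below is plain Ruby and does not call Nori itself. NORI_STYLE_HASH is a hand-written stand-in for the shape described above, and strip_attribute_prefix is a hypothetical helper name.

```ruby
# Recursively remove a leading "@" from hash keys.
# Caveat: an attribute and a child element that share a name would collide.
def strip_attribute_prefix(value)
  case value
  when Hash
    value.each_with_object({}) do |(key, inner), result|
      result[key.to_s.delete_prefix('@')] = strip_attribute_prefix(inner)
    end
  when Array
    value.map { |item| strip_attribute_prefix(item) }
  else
    value
  end
end

# Hand-written stand-in for a parsed result with "@"-prefixed attribute keys.
NORI_STYLE_HASH = {
  'seriess' => {
    '@realtime_start' => '2013-01-28',
    'series' => { '@id' => 'GDPC1', '@frequency' => 'Quarterly' }
  }
}.freeze

plain = strip_attribute_prefix(NORI_STYLE_HASH)
puts plain['seriess']['series']['frequency'] # prints "Quarterly"
```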

answered 2018-02-28T19:14:02.260