3

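I have a set of stories in XML format. I want to parse the file and return each story as a hash or Ruby object so that I can manipulate the data further in a Ruby script.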

Does Nokogiri support this, or is there a better tool/library to use?

The XML document has the following structure, as returned by Pivotal Tracker's web API:

<?xml version="1.0" encoding="UTF-8"?>
<stories type="array" count="145" total="145">
  <story>
    <id type="integer">16376</id>
    <story_type>feature</story_type>
    <url>http://www.pivotaltracker.com/story/show/16376</url>
    <estimate type="integer">2</estimate>
    <current_state>accepted</current_state>
    <description>A description</description>
    <name>Receivable index listing will allow selection viewing</name>
    <requested_by>Tony Superman</requested_by>
    <owned_by>Tony Superman</owned_by>
    <created_at type="datetime">2009/11/04 15:49:43 WST</created_at>
    <accepted_at type="datetime">2009/11/10 11:06:16 WST</accepted_at>
    <labels>index ui,receivables</labels>
  </story>
  <story>
    <id type="integer">17427</id>
    <story_type>feature</story_type>
    <url>http://www.pivotaltracker.com/story/show/17427</url>
    <estimate type="integer">3</estimate>
    <current_state>unscheduled</current_state>
    <description></description>
    <name>Validations in wizards based on direction</name>
    <requested_by>Matthew McBoggle</requested_by>
    <created_at type="datetime">2009/11/17 15:52:06 WST</created_at>
  </story>
  <story>
    <id type="integer">17426</id>
    <story_type>feature</story_type>
    <url>http://www.pivotaltracker.com/story/show/17426</url>
    <estimate type="integer">2</estimate>
    <current_state>unscheduled</current_state>
    <description>Manual payment needs a description field.</description>
    <name>Add description to manual payment</name>
    <requested_by>Tony Superman</requested_by>
    <created_at type="datetime">2009/11/17 15:10:41 WST</created_at>
    <labels>payment process</labels>
  </story>
  <story>
    <id type="integer">17636</id>
    <story_type>feature</story_type>
    <url>http://www.pivotaltracker.com/story/show/17636</url>
    <estimate type="integer">3</estimate>
    <current_state>unscheduled</current_state>
    <description>The SMS and email templates needs to be editable by merchants.</description>
    <name>Notifications are editable by the merchant</name>
    <requested_by>Matthew McBoggle</requested_by>
    <created_at type="datetime">2009/11/19 16:44:08 WST</created_at>
  </story>
</stories>
4

5 Answers

6

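You can take advantage of the Hash extensions in ActiveSupport. Then you only need to parse your document with Nokogiri and convert the resulting node set to hashes. This approach preserves the attribute types (e.g. integer, date, array). (Of course, if you are using Rails, you don't need to require/include ActiveSupport or Nokogiri if they are already in your environment. I'm assuming a plain Ruby implementation here.)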

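require 'nokogiri'
require 'active_support'
require 'active_support/core_ext/hash/conversions' # provides Hash.from_xml

# Note: ActiveSupport 2.x used require 'activesupport' plus
# include ActiveSupport::CoreExtensions::Hash; that module no longer exists in later versions.

# Parse the document, then convert each <story> node into a typed hash.
doc = Nokogiri::XML.parse(File.read('yourdoc.xml'))
my_hash = doc.search('//story').map { |e| Hash.from_xml(e.to_xml)['story'] }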

This produces an array of hashes (one per story node) and preserves the types based on the attributes, as shown here:

my_hash.first['name']
=> "Receivable index listing will allow selection viewing"

my_hash.first['id']
=> 16376

my_hash.first['id'].class
=> Integer  # Fixnum on Ruby versions before 2.4

my_hash.first['created_at'].class
=> Time
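
Since the question also mentions Ruby objects, hashes like these could be wrapped in a small value object. This is only a minimal sketch: `Story`, `label_list`, and `story_from_hash` are hypothetical names that are not part of Nokogiri or ActiveSupport, the field list is taken from the sample XML, and the sample hash is typed in by hand rather than parsed.

```ruby
# Story is a hypothetical value object; its fields mirror the sample XML.
Story = Struct.new(:id, :story_type, :url, :estimate, :current_state,
                   :description, :name, :requested_by, :owned_by,
                   :created_at, :accepted_at, :labels, keyword_init: true) do
  # Split the comma-separated labels string into an array of labels.
  def label_list
    labels.to_s.split(',').map(&:strip)
  end
end

# Build a Story from a hash with string keys, dropping keys the Struct does not define.
def story_from_hash(hash)
  known = hash.select { |key, _| Story.members.include?(key.to_sym) }
  Story.new(**known.transform_keys(&:to_sym))
end

# Hand-written sample resembling one element of the array of hashes.
sample = {
  'id' => 16376,
  'name' => 'Receivable index listing will allow selection viewing',
  'labels' => 'index ui,receivables'
}

story = story_from_hash(sample)
puts story.name
p story.label_list
```

Fields missing from a given story (such as `owned_by` on unscheduled stories) simply come back as `nil`. This relies on `keyword_init:` and `transform_keys`, so it assumes Ruby 2.5 or later.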
answered 2009-11-30T03:53:18.253
2

A one-liner solution would look like this:

# str_xml contains your xml
xml = Nokogiri::XML.parse(str_xml)
xml.search('//story').to_a.map{|node| node.children.inject({}){|a,c| a[c.name] = c.text if c.class == Nokogiri::XML::Element; a}}

It returns an array of hashes:

>> xml.search('//story').to_a.map{|node| node.children.inject({}){|a,c| a[c.name] = c.text if c.class == Nokogiri::XML::Element; a}}
=> [{"id"=>"16376", "story_type"=>"feature", "url"=>"http://www.pivotaltracker.com/story/show/16376", "estimate"=>"2", "current_state"=>"accepted", "description"=>"A description", "name"=>"Receivable index listing will allow selection viewing", "requested_by"=>"Tony Superman", "owned_by"=>"Tony Superman", "created_at"=>"2009/11/04 15:49:43 WST", "accepted_at"=>"2009/11/10 11:06:16 WST", "labels"=>"index ui,receivables"}, {"id"=>"17427", "story_type"=>"feature", "url"=>"http://www.pivotaltracker.com/story/show/17427", "estimate"=>"3", "current_state"=>"unscheduled", "description"=>"", "name"=>"Validations in wizards based on direction", "requested_by"=>"Matthew McBoggle", "created_at"=>"2009/11/17 15:52:06 WST"}, {"id"=>"17426", "story_type"=>"feature", "url"=>"http://www.pivotaltracker.com/story/show/17426", "estimate"=>"2", "current_state"=>"unscheduled", "description"=>"Manual payment needs a description field.", "name"=>"Add description to manual payment", "requested_by"=>"Tony Superman", "created_at"=>"2009/11/17 15:10:41 WST", "labels"=>"payment process"}, {"id"=>"17636", "story_type"=>"feature", "url"=>"http://www.pivotaltracker.com/story/show/17636", "estimate"=>"3", "current_state"=>"unscheduled", "description"=>"The SMS and email templates needs to be editable by merchants.", "name"=>"Notifications are editable by the merchant", "requested_by"=>"Matthew McBoggle", "created_at"=>"2009/11/19 16:44:08 WST"}]

However, this ignores all XML attributes, but then you haven't said what you want to do with them... ;)
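
If the `type` attributes do matter, one possible way to honor them is sketched here. To keep it runnable without extra gems it uses REXML from Ruby's standard distribution instead of Nokogiri; the same idea should carry over to Nokogiri nodes. `CONVERTERS` and `stories_from_xml` are hypothetical names, only `type="integer"` is converted, and `datetime` values are deliberately left as strings.

```ruby
require 'rexml/document'

# Converters keyed by the value of the `type` attribute (hypothetical table).
# Only "integer" is handled; every other value stays a String.
CONVERTERS = {
  'integer' => ->(text) { Integer(text, 10) }
}.freeze

# Return one hash per <story> element, converting values according to `type`.
def stories_from_xml(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, '//story').map do |story|
    hash = {}
    story.elements.each do |child|
      text = child.text.to_s                      # empty elements yield nil, so coerce to ""
      converter = CONVERTERS[child.attributes['type']]
      hash[child.name] = converter ? converter.call(text) : text
    end
    hash
  end
end

# Shortened sample based on the XML in the question.
sample_xml = <<~XML
  <?xml version="1.0" encoding="UTF-8"?>
  <stories type="array" count="2" total="2">
    <story>
      <id type="integer">16376</id>
      <estimate type="integer">2</estimate>
      <name>Receivable index listing will allow selection viewing</name>
      <created_at type="datetime">2009/11/04 15:49:43 WST</created_at>
    </story>
    <story>
      <id type="integer">17427</id>
      <description></description>
    </story>
  </stories>
XML

stories = stories_from_xml(sample_xml)
p stories.first
```

Keeping the conversions in a lookup table means a further type can be supported by adding one entry rather than changing the loop.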

answered 2009-11-23T10:29:30.803
1

I think you can stick with this answer.

A simpler one can be found here.

answered 2009-11-23T03:32:41.050
1

This XML was generated by Rails' ActiveRecord#to_xml method. If you are using Rails, you should be able to use Hash.from_xml to parse it.

answered 2009-11-23T03:38:30.553
0

Maybe a Ruby interface to the Pivotal Tracker API would be a better solution for your task; see https://github.com/jsmestad/pivotal-tracker. With it you can get stories as plain Ruby objects, like this (from the docs):

@a_project = PivotalTracker::Project.find(84739)                              
@a_project.stories.all(:label => 'overdue', :story_type => ['bug', 'chore'])
answered 2011-07-18T09:54:55.740