
I have the following XML, in which several child elements have empty text.

doc = <<'XML'
<Book>
    <BookId>BK45647</BookId>
    <BookName>The Client by John Grisham</BookName>
    <BookAuthenticationCode></BookAuthenticationCode>
    <BookCategory>Suspense</BookCategory>
    <BookSequence></BookSequence>
    <BookPublisherInfo>
        <PublisherId>PBBK12345</PublisherId>
        <PublisherName>Mc.GrawHill</PublisherName>
        <PublisherIndex></PublisherIndex>
        <PublisherCategoryQuota></PublisherCategoryQuota>
    </BookPublisherInfo>
    <BookPurchaselist>
       <Customer>
           <FirstName>John</FirstName>
           <LastName>Smith</LastName>
           <MiddleName></MiddleName>
           <NickName></NickName>
       </Customer>
        <Customer>
           <FirstName>Winston</FirstName>
           <LastName>Churchill</LastName>
           <MiddleName></MiddleName>
           <NickName></NickName>
       </Customer>
    </BookPurchaselist>
</Book>
XML

I tried the code below, but it does not work as expected.

cust = doc.at_xpath("//Customer")
cust.each do |cust_obj|
    if cust_obj.has_text? == false
       cust_obj.delete
    end
end

It only partly works and produces the following output:

<Book>
    <BookId>BK45647</BookId>
    <BookName>The Client by John Grisham</BookName>
    <BookAuthenticationCode></BookAuthenticationCode>
    <BookCategory>Suspense</BookCategory>
    <BookSequence></BookSequence>
    <BookPublisherInfo>
        <PublisherId>PBBK12345</PublisherId>
        <PublisherName>Mc.GrawHill</PublisherName>
        <PublisherIndex></PublisherIndex>
        <PublisherCategoryQuota></PublisherCategoryQuota>
    </BookPublisherInfo>
    <BookPurchaselist>
       <Customer>
           <FirstName>John</FirstName>
           <LastName>Smith</LastName>
           <MiddleName></MiddleName>
       </Customer>
        <Customer>
           <FirstName>Winston</FirstName>
           <LastName>Churchill</LastName>
           <NickName></NickName>
       </Customer>
    </BookPurchaselist>
</Book>

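Some of the elements with empty text are removed, while others are left as they are. How can I recursively remove the elements with empty data under a specific XPath and rewrite the XML?

I'm stuck here and would appreciate any suggestions.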


1 Answer

doc.xpath('//Customer/child::*[not(text())]').each do |node|
  node.remove
end

You can also use not(node()) if you want to remove only the nodes that have no child nodes at all.

Edit: a complete working example (using the same code as above)

require 'nokogiri'

xml = <<-XML
<Book>
    <BookId>BK45647</BookId>
    <BookName>The Client by John Grisham</BookName>
    <BookAuthenticationCode></BookAuthenticationCode>
    <BookCategory>Suspense</BookCategory>
    <BookSequence></BookSequence>
    <BookPublisherInfo>
        <PublisherId>PBBK12345</PublisherId>
        <PublisherName>Mc.GrawHill</PublisherName>
        <PublisherIndex></PublisherIndex>
        <PublisherCategoryQuota></PublisherCategoryQuota>
    </BookPublisherInfo>
    <BookPurchaselist>
       <Customer>
           <FirstName>John</FirstName>
           <LastName>Smith</LastName>
           <MiddleName></MiddleName>
       </Customer>
        <Customer>
           <FirstName>Winston</FirstName>
           <LastName>Churchill</LastName>
           <NickName></NickName>
       </Customer>
    </BookPurchaselist>
</Book>
XML

doc = Nokogiri.parse(xml)

doc.xpath('//Customer/child::*[not(text())]').each do |node|
  node.remove
end

puts doc.to_s

The output of this program is:

<?xml version="1.0"?>
<Book>
    <BookId>BK45647</BookId>
    <BookName>The Client by John Grisham</BookName>
    <BookAuthenticationCode/>
    <BookCategory>Suspense</BookCategory>
    <BookSequence/>
    <BookPublisherInfo>
        <PublisherId>PBBK12345</PublisherId>
        <PublisherName>Mc.GrawHill</PublisherName>
        <PublisherIndex/>
        <PublisherCategoryQuota/>
    </BookPublisherInfo>
    <BookPurchaselist>
       <Customer>
           <FirstName>John</FirstName>
           <LastName>Smith</LastName>

       </Customer>
        <Customer>
           <FirstName>Winston</FirstName>
           <LastName>Churchill</LastName>

       </Customer>
    </BookPurchaselist>
</Book>
Answered 2012-07-06T18:41:19.017