xml - How do I process an XML file in Perl?

Question

I need a Perl script that gets the data values of the child nodes of a node with a given name in an XML file. I am using XML::LibXML::Simple.

The code snippet looks like this:

my $booklist = XMLin(path);

  foreach my $book (@{$booklist->{detail}}) {
    print $book->{name} . "\n";
}

The XML file looks like this:

<?xml version='1.0' encoding='iso-8859-1'?>
<booklist>
<book>
<detail label='label1' status='active' type='none'>
<name>book1</name>
</detail >
<detail label='label2' status='active' type='none'>
<name>book2</name>
</detail >
</book>
</booklist>

When I use the code above, I get the following error message: "Not an ARRAY reference"

Can anyone help me?

score 2 · Accepted Answer

Here is a solution using XML::Simple, as in the OP.

use strict;
use warnings;
use XML::Simple;

my $booklist = XMLin($ARGV[0], KeyAttr => [], ForceArray => qr/detail/);

foreach my $book (@{$booklist->{book}->{detail}}) {
    print $book->{name} . "\n";
}

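The important part here is the options passed to XMLin: ForceArray forces the "detail" child nodes to be represented as an array, and KeyAttr => [] stops them from being folded into a hash keyed on name.

A good quick start for XML::Simple is the documentation on CPAN: http://metacpan.org/pod/XML::Simple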

score 1 · Accepted Answer

When you write:

@{ $booklist->{detail} }

...you are saying that $booklist->{detail} returns an array reference, and that you want perl to dereference it into an array, hence the '@'.
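As a minimal sketch of why that dereference can fail, the structure below is written by hand to mimic parsed data in which `detail` is not stored as an array. The contents of `$booklist` and the helper name `detail_kind` are illustrative assumptions, not parser output. Checking `ref` before dereferencing shows what is really there:

```perl
use strict;
use warnings;

# Hand-written stand-in for parsed data: there is no top-level 'detail' key,
# and the nested 'detail' value is a hash reference, not an array reference.
my $booklist = {
    book => {
        detail => { book1 => {}, book2 => {} },
    },
};

# Hypothetical helper: report what kind of reference (if any) a value is.
sub detail_kind {
    my ($value) = @_;
    return 'undef' unless defined $value;
    return ref($value) || 'plain scalar';
}

print detail_kind($booklist->{detail}), "\n";        # nothing at the top level
print detail_kind($booklist->{book}{detail}), "\n";  # a hash reference

# Dereferencing a hash reference as an array dies at run time
# with "Not an ARRAY reference"; eval traps the error so it can be printed.
my $ok = eval { my @list = @{ $booklist->{book}{detail} }; 1 };
print "error: $@" unless $ok;
```

If `detail_kind` does not report ARRAY, `@{ ... }` is the wrong dereference for that value.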

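Don't use <name> as a tag. XML::Simple parses it in a surprising way: with the default KeyAttr setting, name is one of the keys it uses to fold a list of elements into a hash. Here is an example: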

1)

<?xml version='1.0' encoding='iso-8859-1'?>
<booklist>
  <book>
      <bname>book1</bname>
  </book>
  <book>
      <bname>book2</bname>
  </book>
</booklist>

use strict;   
use warnings;   
use 5.016;  

use XML::Simple;
use Data::Dumper;



my $booklist = XMLin('xml.xml');
print Dumper($booklist);


--output:--

$VAR1 = {
          'book' => [
                    {
                      'bname' => 'book1'
                    },
                    {
                      'bname' => 'book2'
                    }
                  ]
        };

2) Now look at what happens when you use a <name> tag:

<?xml version='1.0' encoding='iso-8859-1'?>
<booklist>
  <book>
      <name>book1</name>
  </book>
  <book>
      <name>book2</name>
  </book>
</booklist>

--output:--
$VAR1 = {
          'book' => {
                    'book2' => {},
                    'book1' => {}
                  }
        };

So with your original example, with <name> renamed to <bname>:

<?xml version='1.0' encoding='iso-8859-1'?>
<booklist>
  <book>

    <detail label='label1' status='active' type='none'>
      <bname>book1</bname>
    </detail>

    <detail label='label2' status='active' type='none'>
      <bname>book2</bname>
    </detail>

  </book>
</booklist>


--output:--
$VAR1 = {
          'book' => {
                    'detail' => [
                                {
                                  'bname' => 'book1',
                                  'status' => 'active',
                                  'label' => 'label1',
                                  'type' => 'none'
                                },
                                {
                                  'bname' => 'book2',
                                  'status' => 'active',
                                  'label' => 'label2',
                                  'type' => 'none'
                                }
                              ]
                  }
        };

To get all the bname tags, you can do this:

use strict;   
use warnings;   
use 5.016;  

use XML::Simple;
use Data::Dumper;

my $booklist = XMLin('xml.xml');
my $aref = $booklist->{book}{detail};

for my $href (@$aref) {
    say $href->{bname};
}


--output:--
book1
book2

score 1 · Accepted Answer

I think it goes like this...

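use strict;
use warnings;
use XML::Twig;

my $text = join '', <DATA>;
my $story_file = XML::Twig->new(
    # Call name() for every <name> element.
    twig_handlers => {
        'name' => \&name,
    },
    # Parser options belong here, not inside twig_handlers.
    # keep_atts_order needs Tie::IxHash to be installed.
    keep_atts_order => 1,
    pretty_print    => 'indented',
);
$story_file->parse($text);

sub name {
    my ($story_file, $name) = @_;
    print $name->text, "\n";
}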

__END__
<?xml version='1.0' encoding='iso-8859-1'?>
<booklist>
<book>
<detail label='label1' status='active' type='none'>
<name>book1</name>
</detail >
<detail label='label2' status='active' type='none'>
<name>book2</name>
</detail >
</book>
</booklist>

score 1 · Accepted Answer

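From the XML::Simple documentation:

The use of this module in new code is discouraged. Other modules are available which provide more straightforward and consistent interfaces. In particular, XML::LibXML is highly recommended.

The major problems with this module are the large number of options and the arbitrary ways in which these options interact - often with unexpected results.

Anyway.

Your code overlooks the fact that the booklist contains a book, which in turn contains the details. The booklist has no details directly. Here is a short solution using XML::LibXML: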

use strict; use warnings; use 5.010; use XML::LibXML;

my $dom = XML::LibXML->load_xml(IO => \*DATA) or die "Can't load";

for my $detail ($dom->findnodes('/booklist/book/detail')) {
    say $detail->findvalue('./name');
}

__DATA__
<?xml version='1.0' encoding='iso-8859-1'?>
<booklist>
  <book>
    <detail label='label1' status='active' type='none'>
      <name>book1</name>
    </detail >
    <detail label='label2' status='active' type='none'>
      <name>book2</name>
    </detail >
  </book>
</booklist>

As you can see in the XPath expression /booklist/book/detail, we first have to go through the book before we reach the details. Of course, this could be shortened to //detail.
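A sketch of the shortened form, assuming the same document is held in a string. The helper name `names_from_xml` and the "label: name" output format are illustrative choices. Note that `//detail` matches `detail` elements at any depth, so it is more convenient but less strict than the full path:

```perl
use strict;
use warnings;
use XML::LibXML;

# Hypothetical helper: return "label: name" for every <detail> at any depth.
sub names_from_xml {
    my ($xml) = @_;
    my $dom = XML::LibXML->load_xml(string => $xml);
    my @lines;
    for my $detail ($dom->findnodes('//detail')) {
        # getAttribute reads an attribute; findvalue evaluates an XPath
        # expression relative to this node and returns its string value.
        push @lines, $detail->getAttribute('label') . ': ' . $detail->findvalue('./name');
    }
    return @lines;
}

my $xml = <<'XML';
<?xml version='1.0' encoding='iso-8859-1'?>
<booklist>
  <book>
    <detail label='label1' status='active' type='none'><name>book1</name></detail>
    <detail label='label2' status='active' type='none'><name>book2</name></detail>
  </book>
</booklist>
XML

print "$_\n" for names_from_xml($xml);
```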

In general, if a data structure is not what it seems to be, you should dump it, e.g.

use Data::Dumper;
print Dumper $booklist;

This would output:

$VAR1 = {
  'book' => {
    'detail' => {
      'book2' => {
        'status' => 'active',
        'type' => 'none',
        'label' => 'label2'
      },
      'book1' => {
        'status' => 'active',
        'type' => 'none',
        'label' => 'label1'
      }
    }
  }
};

So for some awful reason, the book1 and book2 strings are now keys in a nested hash. Do yourself a favor and stop using the most complicated XML module on CPAN, "XML::Simple".
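If you are stuck with that folded structure, a minimal sketch is to read the names from the hash keys. The hash here is copied by hand from the dump, and `book_names` is a hypothetical helper name. Hash keys have no inherent order, so the sketch sorts them:

```perl
use strict;
use warnings;

# Copied by hand from the Data::Dumper output; not produced by a parser here.
my $booklist = {
    book => {
        detail => {
            book1 => { status => 'active', type => 'none', label => 'label1' },
            book2 => { status => 'active', type => 'none', label => 'label2' },
        },
    },
};

# Hypothetical helper: the <name> values ended up as keys of the 'detail' hash,
# so the names are its keys (sorted, since hash order is not defined).
sub book_names {
    my ($data) = @_;
    my @names = sort keys %{ $data->{book}{detail} };
    return @names;
}

for my $name (book_names($booklist)) {
    my $label = $booklist->{book}{detail}{$name}{label};
    print "$name ($label)\n";
}
```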

score 0 · Accepted Answer

Another way, using XML::Rules (assuming the point is to get at the contents of "detail", not just to print the contents of "name"):

use XML::Rules;
my @rules = (
  detail => sub {
    print "$_[1]{name}\n";
    return;
  },
  name => 'content',
  _default => undef,
);

my $xr = XML::Rules->new(rules => \@rules);
$xr->parsefile("tmp.xml");
