My question is: how do I pass arguments to an XML::Twig handler, and how do I return the result from the handler?

Here is my code, which hardcodes:

<counter name = "music", report type = "month", stringSet index = 4>.

How can I implement this using the arguments $counter_name, $type, and $id? And how can I return the contents of string_list? Thanks. (Sorry, I did not post the XML file here because I had trouble doing so: anything between < and > is ignored.)

use XML::Twig;

sub parse_a_counter {

     my ($twig, $counter) = @_;
     my @report = $counter->children('report[@type="month"]');

     for my $report (@report){

         my @stringSet = $report->children('stringSet[@index="4"]');
         for my $stringSet (@stringSet){

             my @string_list = $stringSet->children_text('string');
             print @string_list;  #  in fact I want to return this string_list,
                                  #  not just print it.
         }
     }

     $counter->flush; # free the memory of $counter
}

my $roots = { 'counter[@name="music"]' => 1 };

my $handlers = { counter => \&parse_a_counter };

my $twig = new XML::Twig(TwigRoots => $roots,
                         TwigHandlers => $handlers);

$twig->parsefile('counter_test.xml');

3 Answers

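The easiest and most common way to pass arguments to a handler is to use a closure. That's a big word for a simple concept: you register the handler as tag => sub { handler( @_, $my_arg) }, and $my_arg is passed to the handler along with the usual arguments. Achieving Closure explains the concept in more detail.

Below is how I would write the code. I use Getopt::Long to process the options, and qq{} instead of quotes around the strings that contain XPath expressions, so that quotes can be used inside the expressions.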

#!/usr/bin/perl
use strict;
use warnings;

use XML::Twig;

use Getopt::Long;

# set defaults
my $counter_name= 'music';
my $type= 'month';
my $id= 4;

GetOptions ( "name=s" => \$counter_name,
             "type=s" => \$type,
             "id=i"   => \$id,
           ) or die;   

my @results;

my $twig= XML::Twig->new( 
            twig_roots => { qq{counter[\@name="$counter_name"]} 
                             => sub { parse_a_counter( @_, $type, $id, \@results); } } )
                   ->parsefile('counter_test.xml');

print join( "\n", @results), "\n";

sub parse_a_counter {

     my ($twig, $counter, $type, $id, $results) = @_;
     my @report = $counter->children( qq{report[\@type="$type"]});

     for my $report (@report){

         my @stringSet = $report->children( qq{stringSet[\@index="$id"]});
         for my $stringSet (@stringSet){

             my @string_list = $stringSet->children_text('string');
             push @$results, @string_list;
         }
     }

     $counter->purge; # free the memory of $counter
}
answered 2010-07-14T06:25:10.200

The simplest way is to have parse_a_counter return a sub (i.e. a closure) and to store the results in a global variable. For example:

use strict;
use warnings;
use XML::Twig;

our @results;      # <= put results in here

sub parse_a_counter {
    my ($type, $index) = @_;

    # return closure over type & index
    return sub {
        my ($twig, $counter) = @_;
        my @report = $counter->children( qq{report[\@type="$type"]} );

        for my $report (@report) {
            my @stringSet = $report->children( qq{stringSet[\@index="$index"]} );

            for my $stringSet (@stringSet) {
                my @string_list = $stringSet->children_text( 'string' );
                push @results, \@string_list; 
            }
        }
    };
}

my $roots    = { 'counter[@name="music"]' => 1 };
my $handlers = { counter => parse_a_counter( "month", 4 ) };

my $twig = XML::Twig->new(
    TwigRoots    => $roots,                     
    TwigHandlers => $handlers,
)->parsefile('counter_test.xml');

I tested it with the following XML (which is what I could work out from your example XML and code):

<?xml version="1.0" encoding="UTF-8"?>
<root>
    <counter name="music">
        <report type="week">
            <stringSet index="4">
                <string>music week 4</string>
            </stringSet>
        </report> 
    </counter>
    <counter name="xmusic">
        <report type="month">
            <stringSet index="4">
                <string>xmusic month 4</string>
            </stringSet>
        </report> 
    </counter>
    <counter name="music">
        <report type="month"> 
            <stringSet index="4">
                <string>music month 4 zz</string>
                <string>music month 4 xx</string>
            </stringSet>
        </report>
    </counter>
</root>

And I got this:

[
    [
        'music month 4 zz',
        'music month 4 xx'
    ]
];

Which is what I expected!

answered 2010-07-12T10:29:00.950

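Disclaimer: I have not used Twig myself, so this answer may not be idiomatic. It is a generic answer to "how do I keep state in a callback handler".

There are three ways to pass information into and out of a handler:

One. State held in a static location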

package TwigState;

my %state = ();
# Pass in a state attribute to get
sub getState { $state{$_[0]} }
 # Pass in a state attribute to set and a value 
sub setState { $state{$_[0]} = $_[1]; }

package main;

sub parse_a_counter { # Better yet, declare all handlers in TwigState
     my ($twig, $element) = @_;
     my $counter = TwigState::getState('counter');
     $counter++;
     TwigState::setState('counter', $counter);
}

Two. State held in $t (the XML::Twig object) itself, in some "state" member

# Ideally, XML::Twig or XML::Parser would have a "context" member
# to store context, plus methods to get/set that context
# (getContext/setContext below are hypothetical).
# Barring that, you can simply make one, using the VERY VERY bad design
# decision of treating the object as a hash and adding a key to that hash.
# I'd STRONGLY recommend against doing that, and would choose #1 or #3
# instead, unless there's a ready-made context data area in the class.
sub parse_a_counter {
     my ($twig, $element) = @_;
     my $counter = $twig->getContext('counter');
     # BAD: my $counter = $twig->{'_my_context'}->{'counter'};
     $counter++;
     $twig->setContext('counter', $counter);
     # BAD: $twig->{'_my_context'}->{'counter'} = $counter;
}

# for using DIY context, better pass it in with constructor:
my $twig = new XML::Twig(TwigRoots    => $roots,
                         TwigHandlers => $handlers,
                         _my_context  => {});

Three. Make the handler a closure and let it keep that state
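
A minimal sketch of this third option, in plain Perl with no XML::Twig dependency. The helper name make_counting_handler and the accessor $get_count are made up for illustration, and the handler is called by hand to stand in for the parser invoking it once per matching element:

```perl
use strict;
use warnings;

# Hypothetical helper: builds a handler that closes over its own private state.
sub make_counting_handler {
    my $count = 0;                    # state lives in the closure, not in a global

    my $handler = sub {
        my ($twig, $element) = @_;    # the two arguments a Twig handler receives
        $count++;                     # update the captured state
        return 1;
    };

    # A second closure over the same variable lets the caller read the state back.
    my $get_count = sub { return $count };

    return ($handler, $get_count);
}

my ($handler, $get_count) = make_counting_handler();

# Stand-in for the parser calling the handler for three matching elements.
$handler->(undef, undef) for 1 .. 3;

print $get_count->(), "\n";           # prints 3
```

In real use, $handler would be the value registered for a tag in the handlers hash, and $get_count would be called after parsing finishes.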

answered 2010-07-12T10:00:48.283