I am using Data::Dumper to write a remotely hosted XML file out to a local text file, and I get the following:

$VAR1 = {
    'resource' => {
        '005cd410-41d6-4e3a-a55f-c38732b73a24.xml' => {
            'standard' => 'DITA',
            'area' => 'holding',
            'id' => 'Comp_UKCLRONLINE_UKCLR_2000UKCLR0278',
        },
        '003c2a5e-4af3-4e70-bf8b-382d0b4edda1.xml' => {
            'standard' => 'DITA',
            'area' => 'holding',
            'id' => 'Comp_UKCLRONLINE_UKCLR_2000UKCLR0278',
        },  

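and so on. What I want to do is use just one key and value from each resource, i.e. pick out the id and then build a URL from it.

I would normally run a regex over the file and pull out the information I need, but I think there must be a simpler, more correct way. I can't think of the right terms to search for, so I haven't found it.

This is the code I use to write this output to the file: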

#-----------------------------------------------
sub request_url {
#-----------------------------------------------
my $useragent = LWP::UserAgent->new;
my $request = HTTP::Request->new( GET => "http://digitalessence.net/resource.xml" );
$resource = $useragent->request( $request );                                            
}


#-----------------------------------------------
sub file_write {
#-----------------------------------------------
open OUT, ">$OUT" or Log_message ("\n$DATE - $TIME - Could not create filelist.doc \t");
Log_message ("\n$DATE - $TIME - Opened the output file");
print OUT Dumper (XML::Simple->new()->XMLin( $resource->content ));
Log_message ("\n$DATE - $TIME - Written the output file");
}

Thanks

2 Answers

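I don't quite understand your question, but I guess you want to access some data from the hash.

You don't need a regex or any other trick; just `do` your data file and take the value from the hashref you get back:

As a simple one-liner, for example (assuming your file is named `dumper.out`):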

perl -Mstrict -wE 'my $hashref = do{ do "./dumper.out" }; say $hashref->{resource}{"005cd410-41d6-4e3a-a55f-c38732b73a24.xml"}{id}'
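The same idea can be written as a small script with error checks. This is only a sketch under a few assumptions: `load_dump` is a hypothetical helper name, and the sample dump is written to a temporary file so the example runs on its own. Keep in mind that `do` executes the file as Perl code, so it should only be pointed at dump files you trust.

```perl
use strict;
use warnings;
use Data::Dumper;
use File::Temp qw(tempfile);

# Hypothetical helper: load a Data::Dumper file and return its structure.
sub load_dump {
    my ($path) = @_;
    # Use an absolute or ./-prefixed path so do() does not search @INC.
    my $data = do $path;
    die "Could not parse $path: $@" if $@;
    die "Could not read $path: $!"  unless defined $data;
    return $data;
}

# Write a tiny sample dump (made-up data) so the sketch is self-contained.
my ($fh, $file) = tempfile();
print {$fh} Dumper({ resource => { 'a.xml' => { id => 'Comp_EXAMPLE_1' } } });
close $fh or die "close failed: $!";

my $data = load_dump($file);
print "$_ => $data->{resource}{$_}{id}\n" for sort keys %{ $data->{resource} };
```

With a real dump you would pass its path to `load_dump` instead of the temporary file.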

HTH, Paul

Answered 2011-09-19T15:58:39.450

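Maybe you want to walk the data structure that XML::Simple returns. Each resource sits inside the hash reference you reach through the resource key of the $doc data structure.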

use XML::Simple;
use LWP;
use Data::Dumper;

my $ua = LWP::UserAgent->new;
my $req = HTTP::Request->new( GET => "http://digitalessence.net/resource.xml" );
my $res = $ua->request( $req );

my $xs         = XML::Simple->new();
my $doc        = $xs->XMLin( $res->content );

printf "resources: %s\n", scalar keys %{ $doc->{ resource } };

foreach ( keys %{ $doc->{ resource } } ) {
    printf "resource => %s, id => %s\n", $_, $doc->{ resource }->{ $_ }->{ id };
}

The output looks like this:

resources: 7
resource => 005cd410-41d6-4e3a-a55f-c38732b73a24.xml, id => Comp_UKCLRONLINE_UKCLR_2000UKCLR0278
resource => 003c2a5e-4af3-4e70-bf8b-382d0b4edda1.xml, id => Comp_UKCLRONLINE_UKCLR_2002UKCLR0059
resource => 0033d4d3-c397-471f-8cf5-16fb588b0951.xml, id => Comp_UKCLRONLINE_UKCLR_navParentTopic_67
resource => 002a770a-db47-41ef-a8bb-0c8aa45a8de5.xml, id => Comp_UKCLRONLINE_UKCLR_navParentTopic_308
resource => 000fff79-45b8-4ac3-8a57-def971790f16.xml, id => Comp_UKCLRONLINE_UKCLR_2002UKCLR0502
resource => 00493372-c090-4734-9a50-8f5a06489591.xml, id => Comp_UKCLRONLINE_COMPCS_2010_10_0002
resource => 004377bf-8e24-4a69-9411-7c6baca80b87.xml, id => Comp_CLJONLINE_CLJ_2002_01_11
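To get from each id to a URL, as the question asks, the same loop can return strings instead of printing. The sketch below makes some assumptions: the base URL `http://example.com/docs/` and the helper name `urls_for` are hypothetical, and an inline hash with two entries copied from the output above stands in for the structure returned by `XMLin`. The ids shown contain only letters, digits and underscores, so no escaping is done here; ids with other characters would need it.

```perl
use strict;
use warnings;

# Hypothetical base URL; the question does not say what the real one is.
my $BASE_URL = 'http://example.com/docs/';

# Stand-in for the structure XMLin returns (two entries from the output above).
my $doc = {
    resource => {
        '005cd410-41d6-4e3a-a55f-c38732b73a24.xml' => {
            standard => 'DITA',
            area     => 'holding',
            id       => 'Comp_UKCLRONLINE_UKCLR_2000UKCLR0278',
        },
        '003c2a5e-4af3-4e70-bf8b-382d0b4edda1.xml' => {
            standard => 'DITA',
            area     => 'holding',
            id       => 'Comp_UKCLRONLINE_UKCLR_2002UKCLR0059',
        },
    },
};

# Return one URL per resource, ordered by resource file name.
sub urls_for {
    my ($data, $base) = @_;
    my $resources = $data->{resource};
    return map { $base . $resources->{$_}{id} } sort keys %$resources;
}

print "$_\n" for urls_for($doc, $BASE_URL);
```

Sorting the keys is only there to make the order repeatable, since Perl does not keep hash keys in insertion order.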
Answered 2011-09-19T16:39:51.063