sub parse_xml{
    my $xml_link = $_[0];
    my $xml_content = get($xml_link) or warn "Cant get XML page of " . $xml_link . "\n";
    if(!$xml_content){
        return;
    }
    my $xml =  XML::Simple->new(KeepRoot => 1);
    my $xml_data = $xml->XMLin($xml_content);
    my @items = $xml_data->{rss}{channel}->{item};
   # print Dumper($xml_data);
    foreach my $item (@items) {
        if($item){
             print Dumper($item);             # This is the dump output
             print $item->{author};
             #print $item . "\n";
        }
    }
}

When I try to print an item's author, I only get HASH(Memory Address) or "Not a HASH reference at ... line ...".

Am I doing something wrong? Why does this error occur?

Here is the Dumper output.

$VAR1 = [
          {
            'link' => 'http://***.com/article/news/betty-white-credits-snickers-golden-opportunities/144290/#comments-67229',
            'author' => {},
            'title' => 'By: ',
            'pubDate' => 'Tue, 08 Jun 2010 12:47 EDT',
            'description' => 'Interesting. At least SHE remembered the product that propelled her to recent recognition. When many people I know have commented on how they loved that Betty White Super Bowl spot, they can't recall the product. Ah, advertising.'
          },
          {
            'link' => 'http://***.com/article/news/betty-white-credits-snickers-golden-opportunities/144290/#comments-67167',
            'author' => {},
            'title' => 'By: ',
            'pubDate' => 'Mon, 07 Jun 2010 13:26 EDT',
            'description' => 'Fun, fun, fun. A great attitude for all of us to take into our careers.'
          },
          {
            'link' => 'http://****.com/article/news/betty-white-credits-snickers-golden-opportunities/144290/#comments-67164',
            'author' => 'username',
            'title' => 'By: username',
            'pubDate' => 'Mon, 07 Jun 2010 12:23 EDT',
            'description' => 'Her appearance of the Comedy Central roast of William Shattner a couple of years ago was great... it seems like her willingness to be irreverent makes her more appealing to us all!  

www.adverspew.com'
          },
          {
            'link' => 'http://****.com/article/news/betty-white-credits-snickers-golden-opportunities/144290/#comments-67142',
            'author' => {},
            'title' => 'By: ',
            'pubDate' => 'Mon, 07 Jun 2010 09:50 EDT',
            'description' => 'Solid interview. I will definitely be tuning into "Hot in Cleveland" next week. We ought to enjoy Ms. White's talents for as long as we have her. She's great!'
          }
        ];

2 Answers


You're on the right track. I ran your code against the feed linked from this StackOverflow page and tweaked it slightly.

use strict;
use warnings;
use LWP::Simple;
use XML::Simple;
use Data::Dumper;

sub parse_xml {
    my ($xml_link) = @_;
    my $xml_content = get($xml_link);
    if (!$xml_content) {
        warn "Can't get XML page of $xml_link\n";
        return;
    }
    my $xml = XML::Simple->new(KeepRoot => 1);
    # ForceArray => 1 turns every element into an array reference,
    # which is why each level below is indexed with [0].
    my $xml_data = $xml->XMLin($xml_content, ForceArray => 1);
    foreach my $entry (@{ $xml_data->{'feed'}[0]->{'entry'} || [] }) {
        next unless $entry;
        print $entry->{'author'}[0]->{'name'}[0] . "\n";
        print $entry->{'author'}[0]->{'uri'}[0] . "\n";
    }
}

parse_xml('http://stackoverflow.com/feeds/question/10906521');

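That works fine on that example. I suspect you were trying to print something that is not a plain value. In the StackOverflow feed you can see that 'author' actually contains child nodes, so if you print $item->{'author'} inside the foreach loop, you get the 'HASH' result you describe.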

Looking at your dump and Borodin's sensible comment, this should work for you:

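    # ForceArray => 1 turns every element into an array reference, hence the [0] indexes.
    my $xml_data = $xml->XMLin($xml_content, ForceArray => 1);
    my $item = $xml_data->{'rss'}[0]->{'channel'}[0]->{'item'};
    foreach my $entry (@{$item}) {
        if ($entry) {
            # An empty element such as <author></author> is parsed as a hash
            # reference, so print only plain (non-reference) values.
            if (!ref $entry->{'author'}[0]) {
                print $entry->{'author'}[0] . "\n";
            }
            if (!ref $entry->{'description'}[0]) {
                print $entry->{'description'}[0] . "\n";
            }
            if (!ref $entry->{'pubDate'}[0]) {
                print $entry->{'pubDate'}[0] . "\n";
            } # etc.
        }
    }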
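
The Dumper output in the question starts with `$VAR1 = [`, which suggests that each `$item` in the original loop was itself an array reference rather than a single item hash. That would explain the "Not a HASH reference" message. The sketch below uses a hypothetical, hand-built `$items_ref` in place of the parsed feed (no XML is parsed here) to show the dereferencing step on its own:

```perl
use strict;
use warnings;

# Hypothetical stand-in for $xml_data->{rss}{channel}{item}:
# an array reference holding one hash per <item>.
my $items_ref = [
    { author => {},         title => 'By: ' },
    { author => 'username', title => 'By: username' },
];

# "my @items = $items_ref;" would create a one-element array holding the
# reference itself, so each loop variable would be an ARRAY reference and
# $item->{author} would die. Dereference with @{...} instead.
my @items = @{$items_ref};

my @authors;
foreach my $item (@items) {
    my $author = $item->{author};
    # An empty <author> shows up as a hash reference; substitute a label.
    push @authors, ref $author ? '(no author)' : $author;
}
print "$_\n" for @authors;
```

The `'(no author)'` label is only an illustration; an empty string or skipping the item would work equally well.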
answered 2012-06-06T00:18:34.193

This RSS feed may or may not have <author> information for each item.

If there is no author, the element still appears in the XML, but it has no content. It shows up as <author></author>.

XML::Simple represents this as an empty anonymous hash.

So if an item has author information, $item->{author} will be a plain text string. Otherwise it will be a hash reference.

You can write code to handle this:

foreach my $item (@items) {
  my $author = $item->{author};
  # An empty <author> element is a hash reference; treat it as an empty string.
  $author = '' if ref $author;
  print "$author\n";
}
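
As an alternative to checking `ref` by hand, XML::Simple's `SuppressEmpty` option can be set to `''` so that empty elements are returned as empty strings. The following is a minimal sketch under that assumption. The inline `$rss` document is invented for illustration and only loosely mirrors the feed in the question:

```perl
use strict;
use warnings;
use XML::Simple;

# Invented sample feed: one item with an empty <author>, one with a name.
my $rss = <<'XML';
<rss><channel>
  <item><title>By: </title><author></author></item>
  <item><title>By: username</title><author>username</author></item>
</channel></rss>
XML

my $data = XML::Simple->new->XMLin(
    $rss,
    KeepRoot      => 1,
    ForceArray    => ['item'],   # keep item an array ref even for a single item
    SuppressEmpty => '',         # empty elements become '' instead of {}
);

# Every author is now a plain string, so no ref check is needed.
my @authors = map { $_->{author} } @{ $data->{rss}{channel}{item} };
print "[$_]\n" for @authors;
```

Listing `item` in `ForceArray` also avoids a separate pitfall: without it, a feed containing exactly one item would yield a hash reference rather than an array reference.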
answered 2012-06-06T02:01:27.417