perl - How do I get the content of a followed link in WWW::Mechanize?

Question

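This is hopefully my last question. I am using $mech->follow_link to try to download a file. For some reason, though, the file that gets saved is just the page I opened first, not the link I wanted to follow. Is this the right way to download a file from a link? I do not want to use wget.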

    #!/usr/bin/perl -w
    use strict;
    use LWP;
    use WWW::Mechanize;
    my $now_string = localtime;
    my $mech = WWW::Mechanize->new();
    my $filename = join(' ', split(/\W++/, $now_string, -1));
    $mech->credentials( '***********' , '************'); # if you do need to supply server and realms, use credentials as in the LWP doc
    $mech->get('http://datawww2.wxc.com/kml/echo/MESH_Max_180min/') or die "Error: failed to load the web page";
    $mech->follow_link( url_regex => qr/MESH/i ) or die "Error: failed to download content";
    $mech->save_content("$filename.kmz");

score 3 · Accepted Answer

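Steps to try

First, print the content from your get to make sure you are reaching a valid HTML page
Make sure the link you want to follow is the third link named "MESH" (case sensitive?)
Print the content from your second get
Print the filename to make sure it is well-formed
Check that the file was created successfully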
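
The checks above could be sketched roughly as follows. This is a minimal sketch, not the asker's actual script: the helper name fetch_linked_file and its three arguments are hypothetical, and the caller supplies the link pattern. It prints what each step sees, so it becomes obvious whether the second request really returned the file or just another HTML page.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use WWW::Mechanize;

# Hypothetical helper (not part of WWW::Mechanize): fetch $page_url,
# follow the first link whose URL matches $link_re, and save the linked
# content to $dest, printing diagnostics at each step.
sub fetch_linked_file {
    my ($page_url, $link_re, $dest) = @_;

    my $mech = WWW::Mechanize->new( autocheck => 1 );

    # Step 1: confirm the first page is really an HTML page.
    $mech->get($page_url);
    print "Page: ", $mech->uri, " (", $mech->ct, ")\n";

    # Step 2: confirm which link the pattern actually selects.
    my $link = $mech->find_link( url_regex => $link_re )
        or die "No link matching $link_re on $page_url\n";
    print "Link: ", $link->url_abs, "\n";

    # Step 3: fetch the link and look at what came back.
    $mech->get( $link->url_abs );
    print "Got:  ", $mech->uri, " (", $mech->ct, ")\n";

    # Steps 4 and 5: show the file name, save, and check the file exists.
    print "Saving to: $dest\n";
    $mech->save_content($dest);
    -s $dest or die "File $dest is missing or empty\n";

    return $dest;
}
```

A call might look like fetch_linked_file($url, qr/\.kmz$/i, "$filename.kmz"). If the content type printed in step 3 is still text/html, the pattern matched a link to another page rather than the file itself.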

Extra

You don't need the unless in either case: the call either works or it dies
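
To illustrate that point, here is a minimal sketch, assuming autocheck is enabled (it is passed explicitly here rather than relying on the installed version's default). The helper name try_get is hypothetical. The eval is there only to capture the error for demonstration.

```perl
use strict;
use warnings;
use WWW::Mechanize;

# Hypothetical helper: returns 'ok' if the page loaded, otherwise the
# error message raised by autocheck.
sub try_get {
    my ($url) = @_;
    my $mech = WWW::Mechanize->new( autocheck => 1 );

    # With autocheck on, get() dies on a failed request, so no
    # "or die" / "unless" check is needed after it. The eval here
    # only traps that die so the message can be returned.
    my $ok = eval { $mech->get($url); 1 };
    return $ok ? 'ok' : "failed: $@";
}
```

In a normal script you would drop the eval and let the failed request terminate the program with Mechanize's own error message.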

Example

    #!/usr/bin/perl -w

    use strict;
    use WWW::Mechanize;

    sub main {

        my $url  = qq(http://www.kmzlinks.com);
        my $dest = qq($ENV{HOME}/Desktop/destfile.kmz);

        my $mech = WWW::Mechanize->new(autocheck => 1);

        # if needed, pass your credentials before this call
        $mech->get($url);
        die "Couldn't fetch page" unless $mech->success;

        # find all the links that have urls to kmz files
        my @links = $mech->find_all_links( url_regex => qr/(?:\.|%2E)kmz$/i );

        foreach my $link (@links) {              # (loop example)

            # use absolute URL path of the link to download file to destination
            $mech->get($link->url_abs, ':content_file' => $dest);

            last;                                # only need one (for testing)
        }
    }

    main();

score 1 · Accepted Answer


Are you sure you want the third link named "MESH"?
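
One way to check is to list every link the pattern matches, in page order. This is a minimal sketch; the helper name matching_links is hypothetical.

```perl
use strict;
use warnings;
use WWW::Mechanize;

# Hypothetical helper: return the absolute URLs of all links on
# $page_url whose URL matches $link_re, in the order they appear.
sub matching_links {
    my ($page_url, $link_re) = @_;

    my $mech = WWW::Mechanize->new( autocheck => 1 );
    $mech->get($page_url);

    return map { $_->url_abs->as_string }
           $mech->find_all_links( url_regex => $link_re );
}
```

Printing the returned list with its positions shows which match is the one you want. follow_link and find_link accept an n parameter, counted from 1, to pick that match, for example $mech->follow_link( url_regex => qr/MESH/i, n => 3 ).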

Answered 2010-07-07T19:26:35.927

score -1 · Accepted Answer


Change the if to unless.

Answered 2010-07-07T18:58:51.683
