I have a problem that I hope you can help with.

foreach my $url ( keys %{$newURLs} ) {
  # first get the base URL and save its content length
  $mech->get($url);
  my $content_length = $mech->response->header('Content-Length');

  # now iterate all the 'child' URLs
  foreach my $child_url ( @{ $newURLs->{$url} } ) {
    # get the content
    $mech->get($child_url);

    # compare
    if ( $mech->response->header('Content-Length') != $content_length ) {
         print "$child_url: different content length: $content_length vs "
         . $mech->response->header('Content-Length') . "!\n";
         #HERE I want to store the urls that are found to have different content 
         #lengths to the base url
         #only if the same url has not already been stored
    } elsif ( $mech->response->header('Content-Length') == $content_length ) {
         print "Content lengths are the same\n";
         #HERE I want to store the urls that are found to have the same content 
         #length as the base url
         #only if the same url has not already been stored
    }
  }
}

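The problem I'm having:

As you can see in the code above, I want to store the URLs according to whether their content length is the same as or different from that of their base URL. I should end up with one group of URLs whose content length differs from their base URL, and another group whose content length matches it.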

I know how to do this easily with arrays:

push (@differentContentLength, $url);
push (@sameContentLength, $url);

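But how would I go about this using a hash (or another preferred method)?

I'm still getting to grips with hashes, so your help would be much appreciated.

Many thanks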

2 Answers

Please check this solution:

my %content_length;

foreach my $url ( keys %{$newURLs} ) {
  # first get the base URL and save its content length
  $mech->get($url);
  my $content_length = $mech->response->header('Content-Length');

  # now iterate all the 'child' URLs
  foreach my $child_url ( @{ $newURLs->{$url} } ) {
    # get the content
    $mech->get($child_url); 
    my $new_content_length =  $mech->response->header('Content-Length');
    # store in hash
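    # report URLs seen for the first time, or whose length changed since
    # they were last stored (the defined check avoids comparing against undef)
    if ( !defined $content_length{$child_url} ) {
      print "New URL! url: $child_url\n";
    }
    elsif ( $new_content_length != $content_length{$child_url} ) {
      print "Different content_length! url: $child_url, old_content_length: $content_length{$child_url}, new_content_length: $new_content_length\n";
    }
    $content_length{$child_url} = $new_content_length;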
  }
}
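
As a rough sketch of the "store only once" part of the question: hash keys are unique, so two plain hashes can act as sets. The URLs and lengths below are made up and stand in for the values that `$mech->response->header('Content-Length')` would return; `%length_of`, `%same` and `%different` are hypothetical names, not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical data standing in for the fetched Content-Length values.
my %length_of = (
  'http://example.com/'  => 100,
  'http://example.com/a' => 100,
  'http://example.com/b' => 250,
);
my $newURLs = {
  'http://example.com/' => [
    'http://example.com/a',
    'http://example.com/b',
    'http://example.com/a',    # duplicate on purpose
  ],
};

my ( %same, %different );

foreach my $url ( keys %{$newURLs} ) {
  my $base_length = $length_of{$url};

  foreach my $child_url ( @{ $newURLs->{$url} } ) {
    # Pick the bucket, then record the URL as a key. A repeated URL just
    # overwrites its own entry, so no "already stored" check is needed.
    my $bucket = $length_of{$child_url} == $base_length ? \%same : \%different;
    $bucket->{$child_url} = $length_of{$child_url};
  }
}

print "same: $_\n"      for sort keys %same;
print "different: $_\n" for sort keys %different;
```

Each hash ends up holding every child URL at most once, with its content length as the value; `keys %same` and `keys %different` give the two groups.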
answered 2013-02-13T11:11:24.690

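You can create a hashref outside the loops to store all the URLs for you. Let's call it $content_lengths. It is a scalar because it is a reference to a hash. Inside your $child_url loop, add the content lengths to that data structure. We first use the base URL as a key, which gives us another hashref inside $content_lengths->{$url}. There we decide whether we want equal or different. Inside each of those two keys there is another hashref holding the $child_urls, which in turn have their content length as the value. Because hash keys are unique, storing the same URL a second time simply overwrites the earlier entry, so each URL is kept only once. Of course, we could just say ++ here if you don't want to store the length.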

my $content_lengths; # this is at the top
foreach my $url ( # ... more stuff

# compare
if ( $mech->response->header('Content-Length') != $content_length ) {
  print "$child_url: different content length: $content_length vs "
    . $mech->response->header('Content-Length') . "!\n";

  # store the urls that are found to have different content
  # lengths to the base url only if the same url has not already been stored
  $content_lengths->{$url}->{'different'}->{$child_url} = $mech->response->header('Content-Length');

} elsif ( $mech->response->header('Content-Length') == $content_length ) {
  print "Content lengths are the same\n";

  # store the urls that are found to have the same content length as the base
  # url only if the same url has not already been stored
  $content_lengths->{$url}->{'equal'}->{$child_url} = $mech->response->header('Content-Length');
}
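
To see the resulting structure without hitting the network, here is a minimal sketch that fills $content_lengths from made-up data and then walks it. It uses the ++ variant (a count instead of the length); `%fetched` and `$base_length` are hypothetical stand-ins for what WWW::Mechanize would return.

```perl
use strict;
use warnings;

# Hypothetical results: base URL => [ [child URL, content length], ... ]
my $base_length = 100;
my %fetched = (
  'http://example.com/' => [
    [ 'http://example.com/a', 100 ],
    [ 'http://example.com/b', 250 ],
    [ 'http://example.com/b', 250 ],    # seen twice
  ],
);

my $content_lengths;    # becomes a hashref on first use (autovivification)

foreach my $url ( keys %fetched ) {
  foreach my $pair ( @{ $fetched{$url} } ) {
    my ( $child_url, $length ) = @{$pair};
    my $kind = $length == $base_length ? 'equal' : 'different';

    # count how often each child URL was seen instead of storing its length
    $content_lengths->{$url}->{$kind}->{$child_url}++;
  }
}

# walk the three levels: base URL, equal/different, child URL
foreach my $url ( sort keys %{$content_lengths} ) {
  foreach my $kind ( sort keys %{ $content_lengths->{$url} } ) {
    foreach my $child_url ( sort keys %{ $content_lengths->{$url}->{$kind} } ) {
      my $count = $content_lengths->{$url}->{$kind}->{$child_url};
      print "$url $kind $child_url (seen $count times)\n";
    }
  }
}
```

The intermediate hashrefs do not need to be created by hand; Perl creates them as soon as a nested key is assigned.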
answered 2013-02-13T12:25:34.177