I'm having trouble extracting data from an HTML table. This is what I have.

use strict; 
use warnings;
use HTML::TreeBuilder;
use HTML::TableExtract qw(tree); #
use WWW::Mechanize;

my $d = 3; 
my $c = 4; 

$te = HTML::TableExtract->new( depth => $d, count => $c ); # , decode => 1, gridmap => 1
$te->parse($mech->content);
print "\nDepth = $d, Count = $c \n\n";
my $table = $te->first_table_found;
my $table_tree = $table->tree();
my @rows = $table->rows();
print "The row count is   : ".$rowcount,"\n";
print "The column count is: ".$colcount,"\n";
foreach my $row (@rows)
{
   my @read_row = $table->tree->row($row);
   foreach my $read (@read_row)
   {
      print $read, "\n";
   }
}

I get this error message.

"Rows(ARRAY(0x2987ef8)) out of range at test4.pl line 91."

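Is there a better way to walk through the table and get the values? I have no headers to look for. I looked at HTML::Query but could not find it, or the Badger::Base it requires, through PPM, and HTML::Element looks better suited to table construction. I am also using WWW::Mechanize earlier in the script.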

Any help with my code above would be greatly appreciated.

1 Answer

For most purposes you don't actually need the tree extraction mode.

Always put use strict and use warnings at the top of every Perl program you write, and declare your variables as close as possible to their first point of use.

Your call to $table->rows() returns a list of array references, one per row, which you can access like this

my $te = HTML::TableExtract->new(depth => $d, count => $c); # , decode => 1, gridmap => 1
$te->parse($mech->content);
printf "\nDepth = %d, Count = %d\n\n", $d, $c;

my $table = $te->first_table_found;
my @rows = $table->rows;

for my $row (@rows) {
  print join(', ', @$row), "\n";
}
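
Since the question mentions having no headers to search for, here is a minimal self-contained sketch of the same approach that lists every table together with its depth and count coordinates, plus row and column counts. The inline HTML is a hypothetical stand-in for $mech->content, and the variable names are only illustrative. It relies on HTML::TableExtract matching all tables when no depth, count, or headers constraint is given.

```perl
use strict;
use warnings;
use List::Util qw(max);
use HTML::TableExtract;

# Hypothetical inline HTML standing in for $mech->content.
my $html = <<'HTML';
<html><body>
<table>
  <tr><td>a</td><td>b</td></tr>
  <tr><td>1</td><td>2</td></tr>
</table>
</body></html>
HTML

# No depth, count, or headers constraint: every table in the document matches.
my $te = HTML::TableExtract->new;
$te->parse($html);

for my $table ($te->tables) {
    # coords returns the (depth, count) pair that identifies this table.
    my ($depth, $count) = $table->coords;

    # Each element of @rows is a reference to an array of cell values.
    my @rows      = $table->rows;
    my $row_count = @rows;
    my $col_count = max(0, map { scalar @$_ } @rows);

    print "Table at depth $depth, count $count: ",
          "$row_count rows, $col_count columns\n";

    for my $row (@rows) {
        # Empty cells may come back undefined, so print them as empty strings.
        print join(', ', map { defined $_ ? $_ : '' } @$row), "\n";
    }
}
```

Once the coordinates of the table you want are known, they can be passed back to the constructor as depth and count, as in the code above. It is probably also worth checking that first_table_found returned a defined value before calling rows on it, in case nothing matches those coordinates.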
answered 2014-04-01T02:06:39.777