perl - Adding hash keys

Question

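I am adding data to a hash using incrementing numeric keys starting at 0. The first key/value pair is fine. When I add the second, the first key/value pair points to the second. Every addition after that replaces the value of the second key and then points back to it. The Dumper output looks like this.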

$VAR1 = { '0' => { ... } };

That is after adding the first key/value. After adding the second, I get

$VAR1 = { '1' => { ... }, '0' => $VAR1->{'1'} };

After adding the third key/value, it looks like this.

$VAR1 = { '1' => { ... }, '0' => $VAR1->{'1'}, '2' => $VAR1->{'1'} };

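My question is: why does it do this? I want every key/value pair to show up in the hash. When I iterate over the hash, I get the same data for every key/value. How do I get rid of the reference pointers to the second added key?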

score 4 · Accepted Answer

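You are setting the value of every element to a reference to the same hash. Data::Dumper is merely reflecting that.

If you are using Data::Dumper as a serialization tool (yuck!), then you should set $Data::Dumper::Purity to 1 to get something eval can process.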

use Data::Dumper qw( Dumper );

my %h2 = (a=>5,b=>6,c=>7);

my %h;
$h{0} = \%h2;
$h{1} = \%h2;
$h{2} = \%h2;

print("$h{0}{c} $h{2}{c}\n");
$h{0}{c} = 9;
print("$h{0}{c} $h{2}{c}\n");

{
   local $Data::Dumper::Purity = 1;
   print(Dumper(\%h));
}

Output:

7 7
9 9
$VAR1 = {
          '1' => {
                   'c' => 9,
                   'a' => 5,
                   'b' => 6
                 },
          '0' => {},
          '2' => {}
        };
$VAR1->{'0'} = $VAR1->{'1'};
$VAR1->{'2'} = $VAR1->{'1'};

On the other hand, if you meant to store references to different hashes, you could use

# Shallow copies
$h{0} = { %h2 };  # { ... }   means   do { my %anon = ( ... ); \%anon }
$h{1} = { %h2 };
$h{2} = { %h2 };

or

# Deep copies
use Storable qw( dclone );
$h{0} = dclone(\%h2);
$h{1} = dclone(\%h2);
$h{2} = dclone(\%h2);
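
The difference between the two kinds of copy only shows up when the hash contains references of its own. The following is a minimal sketch, assuming a hypothetical %h2 with a nested array under a made-up key named tags (not part of the original example): the shallow copy gets its own top-level hash but still shares the inner array, while dclone duplicates everything.

```perl
use strict;
use warnings;
use Storable qw( dclone );

# Hypothetical data: the 'tags' value is itself a reference.
my %h2 = ( a => 5, tags => [ 'x', 'y' ] );

my $shallow = { %h2 };         # new top-level hash, same inner array
my $deep    = dclone(\%h2);    # everything duplicated recursively

# Change the original after both copies were made.
$h2{a} = 9;
push @{ $h2{tags} }, 'z';

# Top-level values are independent in both copies;
# only the shallow copy sees the new array element.
print "shallow: a=$shallow->{a} tags=@{ $shallow->{tags} }\n";
print "deep:    a=$deep->{a} tags=@{ $deep->{tags} }\n";
```

For a flat hash of plain scalars, like the %h2 in the answer, the shallow copy is enough.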


score 2 · Accepted Answer

You haven't posted the actual code you're using to build the hash, but I assume it looks something like this:

foreach my $i (1 .. 3) {
    %hash2 = (number => $i, foo => "bar", baz => "whatever");
    $hash1{$i} = \%hash2;
}

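(Actually, I'm guessing that in your real code you're probably reading data from a file in a while (<>) loop and assigning values to %hash2 based on it, but the foreach loop will do for demonstration purposes.)

If you run the code above and dump the resulting %hash1 using Data::Dumper, you'll get this output: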

$VAR1 = {
          '1' => {
                   'baz' => 'whatever',
                   'number' => 3,
                   'foo' => 'bar'
                 },
          '3' => $VAR1->{'1'},
          '2' => $VAR1->{'1'}
        };

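Why does it do that? Well, it's because the values in %hash1 are all references pointing to the same hash, namely %hash2. When you assign new values to %hash2 in your loop, those values overwrite the old values in %hash2, but it is still the same hash. Data::Dumper is just highlighting this fact.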
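
One way to check this directly, without reading Dumper output, is to compare the addresses of the stored references. This is a small sketch using refaddr from the core Scalar::Util module; the %seen and $distinct names are just for illustration.

```perl
use strict;
use warnings;
use Scalar::Util qw( refaddr );

my ( %hash1, %hash2 );    # %hash2 is declared once, outside the loop
foreach my $i ( 1 .. 3 ) {
    %hash2 = ( number => $i );
    $hash1{$i} = \%hash2;
}

# Record the address behind every stored reference.
my %seen;
$seen{ refaddr($_) }++ for values %hash1;
my $distinct = keys %seen;

# Three keys, but only one underlying hash, holding the last values assigned.
print "distinct inner hashes: $distinct\n";
print "number under key 1: $hash1{1}{number}\n";
```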

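So, how do you fix it? Well, there are (at least) two ways. One way is to replace \%hash2, which gives a reference to %hash2, with { %hash2 }, which copies the contents of %hash2 into a new anonymous hash and returns a reference to that: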

foreach my $i (1 .. 3) {
    %hash2 = (number => $i, foo => "bar", baz => "whatever");
    $hash1{$i} = { %hash2 };
}

The other (and IMO preferable) way is to declare %hash2 as a (lexically scoped) local variable inside the loop, using my:

foreach my $i (1 .. 3) {
    my %hash2 = (number => $i, foo => "bar", baz => "whatever");
    $hash1{$i} = \%hash2;
}

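This way, each iteration of the loop creates a new, distinct hash named %hash2, while the hashes created in previous iterations continue to exist independently (because they are referenced from %hash1).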
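
The same idea carries over to a loop that reads input line by line. The sketch below is only an assumption about what such code might look like: the input text, the name/age fields and the %people and %record names are all made up, and an in-memory filehandle stands in for a real file or <>.

```perl
use strict;
use warnings;

# Hypothetical input; a real script might read from a file or <> instead.
my $text = "alice 30\nbob 25\ncarol 41\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my %people;
my $n = 0;
while ( my $line = <$fh> ) {
    chomp $line;
    my ( $name, $age ) = split ' ', $line;
    my %record = ( name => $name, age => $age );    # fresh hash on each pass
    $people{ $n++ } = \%record;
}
close $fh;

# Each key now refers to its own record.
print "$_: $people{$_}{name} ($people{$_}{age})\n" for sort keys %people;
```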

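By the way, you wouldn't have had this problem in the first place if you had followed standard Perl best practices, in particular:

Always use strict; (and use warnings;). This would have forced you to declare %hash2 with my (although it would not have forced you to do so inside the loop).
Always declare local variables in the smallest possible scope. In this case, since %hash2 is only used inside the loop, you should declare it inside the loop, as shown above.

Following these best practices, the example code above would look like this: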
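
To see what strict does with an undeclared %hash2, the snippet can be compiled at run time with a string eval so the error can be captured instead of aborting the script. This is only a demonstration device; the $ok and $error names are illustrative.

```perl
use strict;
use warnings;

# Compile a small snippet at run time so the strict error can be captured.
my $ok = eval q{
    use strict;
    my %hash1;
    %hash2 = ( number => 1 );    # %hash2 was never declared
    $hash1{1} = \%hash2;
    1;
};
my $error = $@;

# Under strict the snippet does not compile, so $ok is false
# and $error holds the compiler's message.
print $ok ? "compiled\n" : "strict refused it: $error";
```

Adding my in front of %hash2 inside the snippet makes it compile, which is the declaration strict is asking for.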

use strict;
use warnings;
use Data::Dumper qw(Dumper);

my %hash1;
foreach my $i (1 .. 3) {
    my %hash2 = (number => $i, foo => "bar", baz => "whatever");
    $hash1{$i} = \%hash2;
}

print Dumper(\%hash1);

As expected, this will print:

$VAR1 = {
          '1' => {
                   'baz' => 'whatever',
                   'number' => 1,
                   'foo' => 'bar'
                 },
          '3' => {
                   'baz' => 'whatever',
                   'number' => 3,
                   'foo' => 'bar'
                 },
          '2' => {
                   'baz' => 'whatever',
                   'number' => 2,
                   'foo' => 'bar'
                 }
        };

score 0 · Accepted Answer

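It's hard to see what the problem is when you don't post your code or the actual output from Data::Dumper.

There is one thing you should know about Data::Dumper: when you dump an array or (especially) a hash, you should dump a reference to it. Otherwise, Data::Dumper treats it as a list of separate values and dumps each one as its own $VARn. Also note that hashes do not keep the order in which you created them. I've included an example below. Make sure your problem isn't just a matter of confusing Data::Dumper output.

Another question: if you are keying your hash with sequential keys, wouldn't you be better off using an array?

If you can, please edit your question to post your code and the actual results.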

use strict;
use warnings;
use autodie;
use feature qw(say);
use Data::Dumper;

my @array = qw(one two three four five);

my %hash = (one => 1, two => 2, three => 3, four => 4);

say "Dumped Array: " . Dumper @array;
say "Dumped Hash: " . Dumper %hash;
say "Dumped Array Reference: " . Dumper \@array;
say "Dumped Hash Reference: " . Dumper \%hash;

Output:

Dumped Array: $VAR1 = 'one';
$VAR2 = 'two';
$VAR3 = 'three';
$VAR4 = 'four';
$VAR5 = 'five';

Dumped Hash: $VAR1 = 'three';
$VAR2 = 3;
$VAR3 = 'one';
$VAR4 = 1;
$VAR5 = 'two';
$VAR6 = 2;
$VAR7 = 'four';
$VAR8 = 4;

Dumped Array Reference: $VAR1 = [
          'one',
          'two',
          'three',
          'four',
          'five'
        ];

Dumped Hash Reference: $VAR1 = {
          'three' => 3,
          'one' => 1,
          'two' => 2,
          'four' => 4
        };
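
If the unpredictable key order makes dumps hard to compare, Data::Dumper can sort the keys itself. This is a small sketch using the module's $Data::Dumper::Sortkeys and $Data::Dumper::Indent settings, localized so they only affect this one dump; the $dump variable is just for illustration.

```perl
use strict;
use warnings;
use Data::Dumper qw( Dumper );

my %hash = ( one => 1, two => 2, three => 3, four => 4 );

my $dump;
{
    local $Data::Dumper::Sortkeys = 1;    # sort hash keys for this dump only
    local $Data::Dumper::Indent   = 1;    # simpler fixed indentation
    $dump = Dumper( \%hash );
}

# Keys appear in sorted (string) order: four, one, three, two.
print $dump;
```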

score 0 · Accepted Answer

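It does this because you are giving it the same reference to the same hash each time,
presumably inside a loop construct.

Here is a simple program that has this behavior.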

use strict;
use warnings;
# always use the above two lines until you
# understand completely why they are recommended

use Data::Printer;

my %hash;
my %inner; # <-- wrong place to put it

for my $index (0..5){
  $inner{int rand} = $index; # <- doesn't matter

  $hash{$index} = \%inner;
}

p %hash;

To fix it, just make sure a new hash reference is created each time through the loop.

use strict;
use warnings;
use Data::Printer;

my %hash;

for my $index (0..5){
  my %inner; # <-- place the declaration here instead

  $inner{int rand} = $index; # <- doesn't matter

  $hash{$index} = \%inner;
}

p %hash;

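If you only want to use numbers as indexes, and they increase monotonically from 0, then I recommend using an array.
An array will be faster and more memory efficient.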

use strict;
use warnings;
use Data::Printer;

my @array; # <--

for my $index (0..5){
  my %inner;
  $inner{int rand} = $index;

  $array[$index] = \%inner; # <--
}

p @array;
