Perl hash in scalar context

When working with hashes in Perl, you sometimes run into strange-looking fractions.

Here is a sample program:

#!/usr/bin/perl

my %h = (
    a => 1,
    b => 2,
);

print "hash: " . %h;

If you run this program on Perl 5.22, it prints the text hash: 2/8.

What is this fraction?

The dot (string concatenation) operator puts the hash in scalar context, so print "hash: " . %h; is the same as print "hash: " . scalar(%h);
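
The equivalence is easy to check directly. The following sketch is not part of the original program; it only compares the two forms, and it should print same on any Perl version, whatever value scalar(%h) happens to return there.

```perl
#!/usr/bin/perl

use strict;
use warnings;

my %h = (
    a => 1,
    b => 2,
);

# Concatenation imposes scalar context on its operands,
# so both expressions evaluate %h the same way.
my $implicit = "" . %h;
my $explicit = scalar(%h);

print $implicit eq $explicit ? "same\n" : "different\n";
```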

The string 2/8 is the result of the expression scalar(%h).

The string 2/8 describes the internal structure of the hash: it is a bucket usage statistic. This is purely an implementation detail that you will very rarely need. Internally, a hash allocates special slots called buckets, in which the data is stored. The number to the right of the slash is the number of allocated buckets; the number to the left is the number of buckets actually in use. As you add key/value pairs to a hash, at some point the number of allocated buckets is increased so that the hash keeps working efficiently.

Here is a sample program that shows how these figures change when adding values to a hash:

#!/usr/bin/perl

use feature qw(say);

my %h;

foreach my $i (1 .. 17) {
    $h{$i} = $i;
    say "$i - " . scalar(%h);
}

Here is the output of this program (note that adding a value to the hash does not always increase the number of used buckets):

1 - 1/8
2 - 2/8
3 - 3/8
4 - 3/8
5 - 4/8
6 - 4/8
7 - 4/8
8 - 5/16
9 - 5/16
10 - 6/16
11 - 6/16
12 - 6/16
13 - 6/16
14 - 7/16
15 - 8/16
16 - 12/32
17 - 12/32

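Interestingly, these figures are not stable. If you run this Perl script repeatedly, the output will differ slightly from run to run (since Perl 5.18 the hash function is seeded randomly for each process, so keys can land in different buckets on each run).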

Changes starting with Perl 5.26

The internal hash statistics are rarely needed. Therefore, starting with Perl 5.26, the behavior of a hash in scalar context was changed: it now returns a number, the count of key/value pairs in the hash. Frankly, that is what you would expect to get from a hash in scalar context.

Before Perl 5.26, to get the number of key/value pairs in a hash you had to write scalar(keys(%h));. In Perl 5.26 and later, the same number can also be obtained with scalar(%h);.
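
As a small illustration of the difference, here is a sketch that counts the pairs in a way that works on every Perl version and, only when running on 5.26 or later, also prints scalar(%h) for comparison. The version check through $] is an addition made for this example.

```perl
#!/usr/bin/perl

use strict;
use warnings;

my %h = (
    a => 1,
    b => 2,
);

# Works on every Perl version: keys() in scalar context returns the key count.
my $count = scalar(keys(%h));
print "pairs: $count\n";    # prints "pairs: 2"

# Only on Perl 5.26 and later does scalar(%h) return the same number.
if ($] >= 5.026) {
    print 'scalar(%h): ' . scalar(%h) . "\n";
}
```

If the code has to run on both old and new versions of Perl, scalar(keys(%h)) is the safer choice.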

If you still need the fraction with the hash statistics, you can get it with the bucket_ratio function from the Hash::Util library. Here is a sample program that prints the text hash: 2/8 on Perl 5.26:

#!/usr/bin/perl

use Hash::Util qw(bucket_ratio);

my %h = (
    a => 1,
    b => 2,
);

print "hash: " . bucket_ratio(%h);
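
If the two numbers are needed separately, the string can be split on the slash. The sketch below assumes Perl 5.26 or later, where Hash::Util exports bucket_ratio; the variable names and the fill percentage are only illustrative. For an empty hash bucket_ratio returns 0 rather than a fraction, so the format is checked first.

```perl
#!/usr/bin/perl

use strict;
use warnings;

# Assumption: Perl 5.26+, where Hash::Util provides bucket_ratio.
use Hash::Util qw(bucket_ratio);

my %h = (
    a => 1,
    b => 2,
);

my $ratio = bucket_ratio(%h);

# An empty hash yields 0 rather than a fraction, so check the format first.
if ($ratio =~ m{^(\d+)/(\d+)$}) {
    my ($used, $allocated) = ($1, $2);
    printf "used: %d, allocated: %d, fill: %.0f%%\n",
        $used, $allocated, 100 * $used / $allocated;
}
else {
    print "no bucket information: $ratio\n";
}
```

The exact numbers may differ between runs, because two keys can occasionally end up in the same bucket.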

Additional statistics on the hash

If you really need to understand what is going on inside a hash, there are other tools. Starting with Perl 5.22, the Hash::Util library provides the bucket_stats_formatted function. Here is an example of its use:

#!/usr/bin/perl

use Hash::Util qw(bucket_stats_formatted);

my %h = (
    a => 1,
    b => 2,
);

print bucket_stats_formatted(\%h);

The output of this program:

Keys: 2 Buckets: 2/8 Quality-Score: 0.94 (Good)
Utilized Buckets: 25.00% Optimal: 25.00% Keys In Collision: 0.00%
Chain Length - mean: 1.00 stddev: 0.00
Buckets              8 [00000011]
Len   0  75.00%      6 [######]
Len   1  25.00%      2 [##]
Keys                 2 [11]
Pos   1 100.00%      2 [##]
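
When the numbers are needed in a program rather than in a printed report, Hash::Util also exports bucket_stats, which returns the statistics as a list. The sketch below is an addition to the article and uses only the first three values of that list (key count, allocated buckets, used buckets), as described in the Hash::Util documentation; check the documentation for your Perl version before relying on the remaining fields.

```perl
#!/usr/bin/perl

use strict;
use warnings;

use Hash::Util qw(bucket_stats);

my %h = (
    a => 1,
    b => 2,
);

# The first three values are the key count, the number of allocated
# buckets and the number of used buckets; the rest is ignored here.
my ($keys, $buckets, $used) = bucket_stats(\%h);

print "keys: $keys, used buckets: $used, allocated buckets: $buckets\n";
```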
