How to get unique elements of a Perl list

Task. A Perl program has a list that contains strings and numbers. You need to get the unique values from this list.

Solution using the uniq function from the List::Util module

The easiest and most convenient way to solve this problem is to use the uniq function from the List::Util module. Here is an example:

#!/usr/bin/perl

use List::Util qw(uniq);
use Data::Dumper;

my @arr = uniq('one', 'one', 'b', 'one', 'b', 4);

print Dumper \@arr;

If you save this text to a file script.pl and then run perl script.pl in the console, the following appears on the screen:

$VAR1 = [
          'one',
          'b',
          4
        ];

The uniq function preserves the order of the elements, and it works correctly if the list contains undef.
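To see how undef is handled, here is a small sketch. It assumes that the installed List::Util is version 1.45 or newer, and the variable name @with_undef is chosen only for illustration:

```perl
#!/usr/bin/perl

use strict;
use warnings;
use List::Util qw(uniq);
use Data::Dumper;

# undef and the empty string are kept as two distinct values;
# the repeated undef, the repeated '' and the repeated 'a' are dropped
my @with_undef = uniq(undef, '', 'a', undef, '', 'a');

# @with_undef now holds three elements: undef, '' and 'a', in that order
print Dumper \@with_undef;
```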

This code works starting with Perl 5.26. Earlier versions of Perl ship with a version of List::Util that has no uniq function. So to use this solution you need to either upgrade Perl or install a more recent version of List::Util.
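One way to find out whether the installed List::Util already provides uniq is to ask the module itself. This is only a sketch: the variable $has_uniq is a name made up for this example, and the check relies on the standard can method that every Perl package inherits:

```perl
#!/usr/bin/perl

use strict;
use warnings;
use List::Util ();

# can() returns a code reference if List::Util defines uniq,
# and a false value otherwise
my $has_uniq = List::Util->can('uniq') ? 1 : 0;

print "List::Util version: $List::Util::VERSION\n";
print $has_uniq
    ? "uniq is available\n"
    : "uniq is missing, a newer List::Util is needed\n";
```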

Official documentation

Here is the part of the perldoc List::Util output that describes the uniq function:

  uniq
        my @subset = uniq @values

    *Since version 1.45.*

    Filters a list of values to remove subsequent duplicates, as judged by a
    DWIM-ish string equality or "undef" test. Preserves the order of unique
    elements, and retains the first value of any duplicate set.

        my $count = uniq @values

    In scalar context, returns the number of elements that would have been
    returned as a list.

    The "undef" value is treated by this function as distinct from the empty
    string, and no warning will be produced. It is left as-is in the
    returned list. Subsequent "undef" values are still considered identical
    to the first, and will be removed.
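The scalar-context behaviour described in the documentation can be tried with a short sketch. It again assumes List::Util 1.45 or newer, and the names @values and $count are taken from the documentation snippet:

```perl
#!/usr/bin/perl

use strict;
use warnings;
use List::Util qw(uniq);

my @values = ('one', 'one', 'b', 'one', 'b', 4);

# Assigning to a scalar puts uniq in scalar context,
# so it returns the number of unique elements
my $count = uniq @values;

print "$count\n";    # prints 3
```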

Custom solution

It is fairly easy to write a simplified version of the uniq function from List::Util:

#!/usr/bin/perl

use Data::Dumper;

sub uniq {
     my (@values) = @_;

     my %h = map {$_ => 1} @values;

     return keys %h;
}

my @arr = uniq('one', 'one', 'b', 'one', 'b', 4);

print Dumper \@arr;

This code works out of the box on any version of Perl. Here is what the three lines of the uniq sub do:

  • The line my (@values) = @_; copies the arguments passed to the sub into the variable @values.
  • The line my %h = map {$_ => 1} @values; turns each element of @values into a list of two elements, the original element and the number 1, joins all the resulting lists into one flat list, and assigns that list to a hash. A hash is an unordered set of key-value pairs, and each key can have only one value. This property of the hash is what removes all the duplicates.
  • The line return keys %h; returns a list that consists only of the hash keys.
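The map step is easier to follow when the intermediate list is stored separately. In this sketch the array @pairs is introduced only for illustration, it does not appear in the sub above:

```perl
#!/usr/bin/perl

use strict;
use warnings;

my @values = ('one', 'one', 'b');

# map produces one flat list: ('one', 1, 'one', 1, 'b', 1)
my @pairs = map { $_ => 1 } @values;
print scalar(@pairs), "\n";       # prints 6

# Assigning the flat list to a hash keeps one entry per key,
# so the second 'one' overwrites the first
my %h = @pairs;
print scalar(keys %h), "\n";      # prints 2
```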

Compared with the uniq function from List::Util, however, our simplified uniq has shortcomings:

  • Our uniq function returns the elements in arbitrary order. Different runs of the same code can give different results: the values returned by uniq are the same, but their order may differ.
  • Our function does not work correctly if the list contains undef. All values from the list are used as hash keys, and undef cannot be a hash key: when you try to use undef as a key, the empty string is used instead.
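Both shortcomings can be avoided in pure Perl with a common idiom: a hash of already seen values combined with grep. The following is a minimal sketch, not the implementation used by List::Util. The name uniq_ordered and the separate $seen_undef counter are assumptions made for this example:

```perl
#!/usr/bin/perl

use strict;
use warnings;
use Data::Dumper;

# Hypothetical name, chosen to avoid confusion with uniq from List::Util
sub uniq_ordered {
    my %seen;              # defined values that were already returned
    my $seen_undef = 0;    # undef is counted separately from ''

    # grep keeps an element only the first time it is encountered,
    # so the original order is preserved
    return grep {
        defined $_ ? !$seen{$_}++ : !$seen_undef++
    } @_;
}

my @arr = uniq_ordered('one', 'one', 'b', undef, '', 'b', undef, 4);

# @arr holds five elements in this order: 'one', 'b', undef, '', 4
print Dumper \@arr;
```

Like the simplified version above, this sketch compares values as strings, so 1 and '1.0' are treated as different values.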
