trim in Perl

Some programming languages have a built-in function trim() that removes spaces from the beginning and the end of a string. For example, given the string ' asdf ', the function trim() returns 'asdf'.

The Perl programming language has traditionally had no built-in function trim() (an experimental builtin::trim was added only in Perl 5.36), but it is possible to remove spaces from the beginning and the end of a string without it. Here is an example of a Perl program:

#!/usr/bin/perl

use utf8;
use strict;
use warnings;
use Data::Dumper;

my $str = ' asdf   ';

$str =~ s/^\s+|\s+$//g;

print Dumper $str;

The output of this program:

$VAR1 = 'asdf';

How it works

This program uses a regular expression.

Here is a simple example of a statement that uses a regular expression to change a string: $str =~ s/BEFORE/AFTER/; It consists of the following parts:

  • $str — the variable that the regular expression is applied to
  • =~ — the binding operator; it applies the expression on its right to the variable on its left
  • s/BEFORE/AFTER/ — the substitution operator (s///); it replaces the text matched by BEFORE with AFTER
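
The three parts above can be seen together in a minimal sketch. The sample text and the $count variable are assumptions made only for this illustration.

```perl
use strict;
use warnings;

# Hypothetical sample text, chosen only for this illustration.
my $str = 'BEFORE is replaced';

# s/// changes $str in place and returns the number of replacements made.
my $count = ($str =~ s/BEFORE/AFTER/);

print "$str\n";     # prints: AFTER is replaced
print "$count\n";   # prints: 1
```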

In our example, the regular expression is slightly more complicated: s/^\s+|\s+$//g; It says that everything that matches ^\s+|\s+$ should be replaced by an empty string. The trailing /g is a modifier that tells Perl to perform the substitution globally, that is, on every match in the string. Without it, only the first match (the leading whitespace) would be removed.
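
The effect of the /g modifier can be checked with a small sketch that runs the same substitution with and without it. The variable names are chosen for illustration only.

```perl
use strict;
use warnings;

my $with_g    = '  asdf  ';
my $without_g = '  asdf  ';

$with_g    =~ s/^\s+|\s+$//g;   # every match: both ends are trimmed
$without_g =~ s/^\s+|\s+$//;    # first match only: the leading whitespace

print "[$with_g]\n";      # prints: [asdf]
print "[$without_g]\n";   # prints: [asdf  ]
```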

^\s+|\s+$ consists of the following fragments:

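  • ^ — matches at the beginning of the string
  • \s+ — matches one or more whitespace characters (\s means a single whitespace character, and the sign + means one or more)
  • | — means alternation (logical or): either the pattern on its left or the pattern on its right may match
  • $ — matches at the end of the string (or just before a newline that ends the string)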

So ^\s+|\s+$ matches all whitespace characters at the beginning of the string or all whitespace characters at the end of the string. Every such match is replaced by an empty string.
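
Each side of the alternation also works on its own, which gives a way to trim only one end of a string. The following is a sketch under that assumption. The names ltrim and rtrim are hypothetical helpers, not functions provided by Perl.

```perl
use strict;
use warnings;

# Hypothetical helpers: each one uses a single branch of ^\s+|\s+$
# and returns a modified copy of its argument.
sub ltrim { my ($s) = @_; $s =~ s/^\s+//; return $s; }
sub rtrim { my ($s) = @_; $s =~ s/\s+$//; return $s; }

print '[', ltrim('  asdf  '), "]\n";   # prints: [asdf  ]
print '[', rtrim('  asdf  '), "]\n";   # prints: [  asdf]
```

No /g modifier is needed here, because each pattern can match at most once in a string.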

Characters

The substitution $str =~ s/^\s+|\s+$//g; removes not only space characters from the beginning and the end of the string, but every kind of whitespace found there. The tab character (\t) and the newline character (\n), for example, are removed as well. Here's an example:

#!/usr/bin/perl

use utf8;
use strict;
use warnings;
use Data::Dumper;

my $str = " \t \n asdf \n\n \t ";

$str =~ s/^\s+|\s+$//g;

print Dumper $str;

The output of this program is the same as before:

$VAR1 = 'asdf';

Unicode defines many characters that represent whitespace. When the string is handled as Unicode text, \s matches any of these characters.
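
As a rough illustration, the sketch below trims a string that is surrounded by two non-ASCII whitespace characters. The choice of U+2003 (EM SPACE) and U+3000 (IDEOGRAPHIC SPACE) is an assumption made for this example; other Unicode whitespace characters should behave the same way.

```perl
use strict;
use warnings;

# U+2003 EM SPACE and U+3000 IDEOGRAPHIC SPACE, written as escapes
# so that the source file itself stays plain ASCII.
my $str = "\x{2003}\x{3000}asdf\x{3000}\x{2003}";

$str =~ s/^\s+|\s+$//g;

print length($str), "\n";   # prints: 4
```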

Libraries

There are several Perl libraries that remove leading and trailing spaces from a string.

However, these libraries are not shipped with Perl, so you need to install them. In some situations it is more convenient not to use any of these libraries and instead to write your own code that removes the spaces from the beginning and the end of a string.
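
One possible way to package the substitution for reuse is a small subroutine. This is only a sketch. The name trim and its handling of undef are assumptions, not an established interface.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a trimmed copy and leaves the argument unchanged.
sub trim {
    my ($s) = @_;
    return $s unless defined $s;   # pass undef through without warnings
    $s =~ s/^\s+|\s+$//g;
    return $s;
}

my $raw   = " \t asdf \n";
my $clean = trim($raw);

print "[$clean]\n";   # prints: [asdf]
```

Working on a copy keeps the caller's variable intact, which is usually less surprising than modifying the argument in place.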
