Function chomp() in Perl

The Perl programming language has a built-in function chomp(). You can use this function when working with strings. In the simplest case, chomp() removes the trailing \n from the variable passed to it as an argument.

Here's an example:

#!/usr/bin/perl

my $string = "ASDF\n";

chomp($string);

print "'$string'";

This program displays the text 'ASDF'.

  • If the value of the variable does not end with \n, the value does not change
  • If the value ends with more than one \n, only the last \n is removed
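Both cases can be checked with a short sketch. The variable names here are illustrative only:

```perl
#!/usr/bin/perl

use strict;
use warnings;

my $no_newline   = "ASDF";        # does not end with \n
my $two_newlines = "ASDF\n\n";    # ends with two \n

chomp($no_newline);      # nothing to remove, the value stays the same
chomp($two_newlines);    # only the last \n is removed

print "'$no_newline'\n";
print "'$two_newlines'\n";
```

After the calls, $no_newline is still "ASDF" and $two_newlines still ends with one \n.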

Arguments

If chomp() is called without arguments, it works on the default variable $_:

#!/usr/bin/perl

$_ = "123\n";

chomp;

print "'$_'";

If $_ is undef in that case and use warnings; is in effect, Perl issues a warning:

#!/usr/bin/perl

use strict;
use warnings;

chomp;
Use of uninitialized value $_ in scalar chomp at script.pl line 6.

You can pass scalars, arrays, and hashes to chomp(). For a hash, chomp() works only on the values; the keys themselves are not changed.
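As a minimal sketch of the array and hash cases (the names @lines and %config are made up for this illustration):

```perl
#!/usr/bin/perl

use strict;
use warnings;

my @lines  = ("one\n", "two\n");
my %config = ("name\n" => "value\n");

chomp(@lines);     # every element of the array is chomped
chomp(%config);    # only the value is chomped, the key keeps its \n

my ($key) = keys %config;

print "'$lines[0]' '$lines[1]'\n";
print "key still ends with a newline\n" if $key =~ /\n\z/;
print "'$config{$key}'\n";
```

If trailing newlines in hash keys also need to go, the keys have to be cleaned separately, for example before they are stored in the hash.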

The argument of chomp() must always be a variable. If you try to pass a string literal, Perl reports a compilation error and the program does not run:

#!/usr/bin/perl

chomp("ASDF\n"); # Error!
Can't modify constant item in chomp at script.pl line 3, near ""ASDF\n")"
Execution of script.pl aborted due to compilation errors.

Return value

chomp() always returns an integer greater than or equal to 0. A return value of 0 means that nothing was removed. A value greater than 0 is the total number of characters removed from all arguments.

Here is an example of a situation when the function chomp() returns the number 2:

#!/usr/bin/perl

my @arr = ("ASDF\n", "QWERTY\n");
print chomp(@arr);

Here is an example that uses the return value to run different code depending on whether anything was removed:

#!/usr/bin/perl

use feature qw(say);

my $string = "ASDF\n";

if (chomp($string)) {
    say 'Removed \n';
} else {
    say 'String is unchanged';
}

print "'$string'";

Variable $/

chomp() removes the end-of-line string stored in the global variable $/. By default, this variable contains \n, but you can assign it some other value, and chomp() will then remove that instead. Here's an example:

#!/usr/bin/perl

$/ = "F";

my $string = "ASDF";

chomp($string);

print "'$string'";

The program will display the text 'ASD'.
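The value of $/ does not have to be a single character. The following sketch assumes a two-character separator, "\r\n", and also prints the return value to show that chomp() counts removed characters:

```perl
#!/usr/bin/perl

use strict;
use warnings;

$/ = "\r\n";    # two-character separator (an assumption for this example)

my $string = "ASDF\r\n";

my $removed = chomp($string);    # number of characters removed

print "'$string' $removed\n";
```

Here chomp() removes both characters at once, so $string becomes "ASDF" and $removed is 2.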

If $/ is set to an empty string ($/ = '';), chomp() removes not just a single trailing \n but all \n characters at the end of the string.

Standard use

chomp() is very often used when reading a file line by line. When processing lines from a file, it is usually more convenient if they do not have \n at the end:

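#!/usr/bin/perl

use strict;
use warnings;
use utf8;
use open qw(:std :utf8);

my $file_name = 'a.csv';

# Open the file for reading with a lexical filehandle
open my $fh, '<', $file_name or die "Can't open $file_name: $!";

# Read one line per iteration
while (my $line = <$fh>) {
    chomp($line);    # drop the trailing \n before processing
    print "Parsing line $line\n";
}

close $fh;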

Removing all \n characters at the end of a string

By default, chomp() removes only one \n at the end of the string:

#!/usr/bin/perl

my $string = "Line1\nLine2\n\n\n";

chomp($string);

print "'$string'";

The result:

'Line1
Line2

'

There are several ways to remove all \n characters at the end of a string.

Here is an example using regular expressions:

#!/usr/bin/perl

my $string = "Line1\nLine2\n\n\n";

$string =~ s/\n*$//;

print "'$string'";

Another way to remove all trailing \n characters is to set the variable $/ to an empty string:

#!/usr/bin/perl

$/ = '';

my $string = "Line1\nLine2\n\n\n";

chomp($string);

print "'$string'";

Using a regular expression is the better option: it is clearer and does not change a global variable, which could affect code in other parts of the program.
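If the $/ approach is still preferred, one possible way to limit the side effect is to change the variable with local inside a block. This is only a sketch of that idea, not something the examples above rely on:

```perl
#!/usr/bin/perl

use strict;
use warnings;

my $string = "Line1\nLine2\n\n\n";

{
    local $/ = '';     # empty $/ applies only inside this block
    chomp($string);    # removes all trailing \n
}

# Outside the block $/ has its previous value again ("\n" by default)
my $other = "ASDF\n\n";
chomp($other);         # removes a single \n

print "'$string'\n";
print "'$other'\n";
```

After the block, $string is "Line1\nLine2", while $other loses only one of its two trailing \n characters.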

Official documentation

Here is the output of the command perldoc -f chomp:

       chomp VARIABLE
       chomp( LIST )
       chomp   This safer version of "chop" removes any trailing string that
               corresponds to the current value of $/ (also known as
               $INPUT_RECORD_SEPARATOR in the "English" module).  It returns
               the total number of characters removed from all its arguments.
               It's often used to remove the newline from the end of an input
               record when you're worried that the final record may be missing
               its newline.  When in paragraph mode ("$/ = """), it removes
               all trailing newlines from the string.  When in slurp mode ("$/
               = undef") or fixed-length record mode ($/ is a reference to an
               integer or the like; see perlvar) chomp() won't remove
               anything.  If VARIABLE is omitted, it chomps $_.  Example:

                   while (<>) {
                       chomp;  # avoid \n on last field
                       @array = split(/:/);
                       # ...
                   }

               If VARIABLE is a hash, it chomps the hash's values, but not its
               keys.

               You can actually chomp anything that's an lvalue, including an
               assignment:

                   chomp($cwd = `pwd`);
                   chomp($answer = <STDIN>);

               If you chomp a list, each element is chomped, and the total
               number of characters removed is returned.

               Note that parentheses are necessary when you're chomping
               anything that is not a simple variable.  This is because "chomp
               $cwd = `pwd`;" is interpreted as "(chomp $cwd) = `pwd`;",
               rather than as "chomp( $cwd = `pwd` )" which you might expect.
               Similarly, "chomp $a, $b" is interpreted as "chomp($a), $b"
               rather than as "chomp($a, $b)".
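The note about chomping an assignment can be tried out with a short sketch. The variables $original and $copy are hypothetical names used only for this illustration:

```perl
#!/usr/bin/perl

use strict;
use warnings;

my $original = "ASDF\n";

# The assignment is the lvalue: $copy is chomped, $original is untouched
chomp(my $copy = $original);

print "'$copy'\n";
print "original still ends with a newline\n" if $original =~ /\n\z/;
```

The parentheses are required here, for the reason given in the documentation above.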
