Perl's versatile split function
I love Perl’s split function. Far more powerful than its feeble cousin join, split has some wonderful features that should make it a regular feature of any Perl programmer’s toolbox. Let’s look at some examples.
Split a sentence into words
To split a sentence into words, you might think about using a whitespace regex pattern like /\s+/
which splits on contiguous whitespace. Split will ignore trailing whitespace, but what if the input string has leading whitespace? A better option is to use a single space string: ' '
. This is a special case where Perl emulates awk and will split on all contiguous whitespace, trimming any leading or trailing whitespace as well.
my @words = split ' ', $sentence;
Or loop through each word and do something:
use 5.010;
say for (split ' ', ' 12 Angry Men ');
# 12
# Angry
# Men
The single-space pattern is also the default pattern for split
, which by default operates on $_
. This can lead to some seriously minimalist code. For example if I needed to split every name in a list of full names and do something with them:
for (@full_names)
{
for (split)
{
# do something
}
}
And who says Perl looks like line noise?
Create a char array
To split a word into separate letters, just pass an empty regex //
to split:
my @letters = split //, $word;
Parse a URL or filepath
It’s tempting to reach for a regex when parsing strings, but for URLs or filepaths split
usually works better. For example if you wanted to get the parent directory from a filepath:
my @directories = split '/', '/home/user/documents/business_plan.ods';
my $parent_directory = $directories[-2];
Here I split the filepath on slash and use the negative index -2
to get the parent directory. The challenge with filepaths is that they can have n depth, but the parent directory of a file will always be the last but one element of a filepath, so split
works well.
Extract only the first few columns from a separated file
How many times have you parsed a comma separated file, but didn’t want all of the columns in the file? Let’s say you wanted the first 3 columns from a file, you might do it like this:
while <$read_file>
{
my @columns = split /,/;
my $name = $columns[0];
my $email = $columns[1];
my $account = $columns[2];
...
}
This is all well and good, but split
can return a limited number of results if you want:
while <$read_file>
{
my ($name, $email, $account) = split /,/;
...
}
Or to revisit an earlier example, splitting on whitespace:
for (@full_names)
{
my ($firstname, $lastname) = split;
...
}
Conclusion
These are just a few examples of Perl’s versatile split
function. Check out the official documentation online or via the terminal with $ perldoc -f split
.
This article was originally posted on PerlTricks.com.
Tags
David Farrell
David is the editor of Perl.com. An organizer of the New York Perl Meetup, he works for ZipRecruiter as a software developer, and sometimes tweets about Perl and Open Source.
Browse their articles
Feedback
Something wrong with this article? Help us out by opening an issue or pull request on GitHub