A few niceties of Perl
I've been programming in Perl for a while, but I've never understood a couple of subtleties about Perl:
Using and setting / unsetting the $_ variable confuses me. For example, why does
# ...
shift @queue;
($item1, @rest) = split /,/;
work, but (at least for me)
# ...
shift @queue;
/some_pattern.*/ or die();
doesn't seem to work?
Also, I don't understand the difference between iterating through a file using foreach versus while. For example, I seem to be getting different results for
while(<SOME_FILE>){
# Do something involving $_
}
and
foreach (<SOME_FILE>){
# Do something involving $_
}
Can anyone explain these subtle differences?
shift @queue;
($item1, @rest) = split /,/;
If I read this correctly, you seem to think that this shifts an element off @queue into $_. It does not.
The value shifted off @queue is simply discarded. The split that follows operates on whatever is already in $_ (which is independent of the call to shift).
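A minimal sketch of the point above, using a made-up @queue and a made-up initial value for $_ (both are assumptions chosen for illustration). It shows that a bare shift leaves $_ alone, and that the shifted value has to be captured explicitly before split can see it:

```perl
use strict;
use warnings;

# Hypothetical data, not from the original question
my @queue = ("a,b,c", "d,e,f");
$_ = "x,y,z";                       # whatever happened to be in $_ beforehand

shift @queue;                       # "a,b,c" is removed and thrown away; $_ is untouched
my ($item1, @rest1) = split /,/;    # no second argument, so split works on $_ ("x,y,z")
print "implicit: $item1 | @rest1\n";

my $element = shift @queue;         # capture the shifted value explicitly
my ($item2, @rest2) = split /,/, $element;
print "explicit: $item2 | @rest2\n";
```

The first split returns fields of the old $_, not of the element removed from @queue, which is why the pattern-match version in the question appears not to work.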
while(<SOME_FILE>){
# Do something involving $_
}
Reading from a filehandle in the condition of a while statement is special: it is equivalent to
while ( defined( $_ = readline *SOME_FILE ) ) {
This way, you can process even colossal files one line at a time.
On the other hand,
for(<SOME_FILE>){
# Do something involving $_
}
will first load the entire file into memory as a list of lines. Try it on a 1 GB file and see the difference.
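The difference can be observed without a huge file. The sketch below reads from an in-memory filehandle (an assumption made so that the example is self-contained) and checks eof inside the first iteration of each loop. With while, only one line has been read so far; with foreach, the whole input has already been consumed before the body runs:

```perl
use strict;
use warnings;

my $data = "line 1\nline 2\nline 3\n";   # stand-in for a real file

# while: one line is read per iteration
open my $fh, '<', \$data or die "cannot open in-memory file: $!";
my $eof_in_while;
while (<$fh>) {
    $eof_in_while = eof($fh) ? 1 : 0;    # lines remain, so this is 0
    last;
}
close $fh;

# foreach: <$fh> is evaluated in list context before the first iteration
open $fh, '<', \$data or die "cannot open in-memory file: $!";
my $eof_in_foreach;
foreach (<$fh>) {
    $eof_in_foreach = eof($fh) ? 1 : 0;  # everything was already read, so this is 1
    last;
}
close $fh;

print "while: eof=$eof_in_while, foreach: eof=$eof_in_foreach\n";
```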
Another, albeit subtle, difference between:
while (<FILE>) {
}
and
foreach (<FILE>) {
}
is that while (<FILE>) changes the value of $_ outside of its scope, whereas foreach localizes $_ for the duration of the loop. For example, the following will die:
$_ = "test";
while (<FILE1>) {
print "$_";
}
die if $_ ne "test";
whereas this won't:
$_ = "test";
foreach (<FILE1>) {
print "$_";
}
die if $_ ne "test";
This becomes more important with more complex scenarios. Imagine something like:
sub func1 {
    while (<$fh2>) { # clobbers $_ set from <$fh1> below
        # ...
    }
}
while (<$fh1>) {
    func1();
    # ... ($_ no longer holds the line read from $fh1)
}
Personally, I avoid $_ for this reason; besides, it makes the code less readable.
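One possible way to sidestep the clobbering, sketched here with in-memory filehandles and hypothetical names (count_inner_lines, $outer_data and $inner_data are not from the original post), is to give each loop its own lexical variable so that neither loop touches the global $_:

```perl
use strict;
use warnings;

my $outer_data = "a\nb\n";   # stand-in for the file behind $fh1
my $inner_data = "x\ny\n";   # stand-in for the file behind $fh2

# Hypothetical helper: reads the inner input with its own lexical variable
sub count_inner_lines {
    open my $fh2, '<', \$inner_data or die "cannot open: $!";
    my $count = 0;
    while (my $inner_line = <$fh2>) {
        $count++;
    }
    close $fh2;
    return $count;
}

open my $fh1, '<', \$outer_data or die "cannot open: $!";
my @seen;
while (my $outer_line = <$fh1>) {
    chomp $outer_line;
    my $count = count_inner_lines();     # cannot disturb $outer_line
    push @seen, "$outer_line:$count";
}
close $fh1;

print "@seen\n";
```

Writing local $_; at the top of the subroutine is another option if the implicit variable has to stay.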
Regarding the second question:
while (<FILE>) {
}
and
foreach (<FILE>) {
}
have the same functional behavior, including setting $_. The difference is that while() evaluates <FILE> in scalar context and foreach() evaluates <FILE> in list context. Consider the difference between:
$x = <FILE>;
and
@x = <FILE>;
In the first case, $x gets the first line of FILE, and in the second case, @x gets the entire file. Each entry in @x is a different line of FILE.
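As a small illustration of the two contexts, here is a sketch that reads from an in-memory filehandle (an assumption, used only to keep the example self-contained). Because both reads use the same handle one after the other, the list-context read returns the lines that remain after the first one:

```perl
use strict;
use warnings;

my $data = "first\nsecond\nthird\n";   # stand-in for FILE
open my $fh, '<', \$data or die "cannot open in-memory file: $!";

my $x = <$fh>;   # scalar context: reads exactly one line
my @x = <$fh>;   # list context: reads every remaining line at once

close $fh;

print "scalar read: $x";
print "list read returned ", scalar(@x), " lines\n";
```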
So, if FILE is very large, you will waste memory slurping it all up at once with foreach (<FILE>), as opposed to while (<FILE>). This may or may not be a problem for you.
The place where it really matters is when FILE is a pipe handle, as in:
open FILE, "some_shell_program|";
Now foreach(<FILE>) must wait for some_shell_program to finish before it can enter the loop, whereas while(<FILE>) can read the output of some_shell_program one line at a time and run in parallel with some_shell_program.
However, the behavior with respect to $_ remains unchanged between the two forms.
while only checks whether the value is true; for also places the value in $_. There are exceptions: for example, <> will set $_ if it is used as the condition of a while loop.
To get behavior similar to:
foreach(qw'a b c'){
# Do something involving $_
}
with a while loop, you must set $_ explicitly:
my @items = qw'a b c';
while( defined( $_ = shift @items ) ){
# Do something involving $_
}
It is better to name your variables explicitly:
for my $line(<SOME_FILE>){
}
or better yet
while( my $line = <SOME_FILE> ){
}
which will only read in the file one line at a time.
Also, shift does not set $_ unless you specifically ask it to:
$_ = shift @_;
And split works on $_ by default. In older versions of Perl, when used in scalar or void context, it also fills @_.
It is precisely to avoid this kind of confusion that it is considered best not to use the implicit $_ constructs.
my $element = shift @queue;
($item,@rest) = split /,/ , $element;
or
($item,@rest) = split /,/, shift @queue;
Also:
while(my $foo = <SOMEFILE>){
# do something
}
or
foreach my $thing(<FILEHANDLE>){
# do something
}