Why does Encode raise "Use of uninitialized value within @_"?
With Perl v5.14.2 (provided by Debian Wheezy) this code:
use Encode qw(encode);
no warnings "all";
sub test_encode {
    return Encode::encode("utf8", $_[0]);
}
my $a=undef;
my $r=test_encode(substr($a,0,1));
creates an empty string in $r. I'm fine with that.
With Perl 5.18.2 (Ubuntu 14.04) it produces this output:
Use of uninitialized value within @_ in list assignment at /usr/lib/perl/5.18/Encode.pm line 147.
(Warnings are disabled in the main scope, so at first I assumed this was not a warning. EDIT: per the answers, it definitely is a warning.)
This list assignment would be in Encode.pm:
146 sub encode($$;$) {
147 my ( $name, $string, $check ) = @_;
148 return undef unless defined $string;
149 $string .= ''; # stringify;
Tweaking the code: if undef is passed to encode instead of $_[0], it no longer complains. If a copy of $_[0] held in a temporary variable is passed instead of $_[0] itself, it doesn't complain either.
My question is: what changed in Perl between these versions that would explain the new behavior? What exactly does Perl see internally in @_ at Encode.pm line 147?
ADDITION: Adding Dump($_[0]); from Devel::Peek at the beginning of test_encode produces:
Perl 5.14.2:
SV = PVLV(0x23a2c10) at 0x2340998
  REFCNT = 1
  FLAGS = (GMG,SMG)
  IV = 0
  NV = 0
  PV = 0
  MAGIC = 0x235f950
    MG_VIRTUAL = &PL_vtbl_substr
    MG_TYPE = PERL_MAGIC_substr(x)
  TYPE = x
  TARGOFF = 0
  TARGLEN = 0
  TARG = 0x235e370
  SV = PV(0x233ec20) at 0x235e370
    REFCNT = 2
    FLAGS = (PADMY,POK,pPOK)
    PV = 0x23576b0 ""\0
    CUR = 0
    LEN = 16
Perl 5.18.2:
SV = PVLV(0x25c07c0) at 0x2546cb8
  REFCNT = 1
  FLAGS = (GMG,SMG)
  IV = 0
  NV = 0
  PV = 0
  MAGIC = 0x2567dd0
    MG_VIRTUAL = &PL_vtbl_substr
    MG_TYPE = PERL_MAGIC_substr(x)
  TYPE = x
  TARGOFF = 0
  TARGLEN = 1
  TARG = 0x256f328
  FLAGS = 0
  SV = NULL(0x0) at 0x256f328
    REFCNT = 2
    FLAGS = (PADMY)
I'm not sure what to make of it, but the SV at the end is significantly different: it looks like an empty string in 5.14.2 versus NULL(0x0) in 5.18.2.
It is substr that is warning. substr warns when its first argument is undefined.
$ perl -we'
my $x;
my $y = substr($x, 0, 1); # Line 3
'
Use of uninitialized value $x in substr at -e line 3.
As of 5.16.0, the warning occurs when the substring operation is actually performed, rather than when substr is called. When substr is used as an lvalue, the actual substring operation is performed when the returned scalar is read from or assigned to.
$ perl -we'
my $x;
my $r = \substr($x, 0, 1);
my $y = $$r; # Line 4
'
Use of uninitialized value in scalar assignment at -e line 4.
Deferring the substring operation is what allows the following to work:
$ perl -wE'$_ = "abc"; substr($_, 0, 1) = "!!!"; say'
!!!bc
Since the warning is now issued when the substring operation is actually performed, it is the lexical scope of the op in Encode.pm that determines whether the warning is emitted.
$ 5.14.2t/bin/perl -e'use warnings; my $r = \substr(my $x, 0, 1); no warnings; my $y = $$r;'
Use of uninitialized value in scalar assignment at -e line 1.
$ 5.14.2t/bin/perl -e'no warnings; my $r = \substr(my $x, 0, 1); use warnings; my $y = $$r;'
$ 5.22.0t/bin/perl -e'use warnings; my $r = \substr(my $x, 0, 1); no warnings; my $y = $$r;'
$ 5.22.0t/bin/perl -e'no warnings; my $r = \substr(my $x, 0, 1); use warnings; my $y = $$r;'
Use of uninitialized value in scalar assignment at -e line 1.
Why was the warning moved to the point where the substring operation is actually performed, rather than where substr is called? I'm only guessing, but it may have been to fix the following and similar problems:
$ perl -wE'
my $x = "def";
my $r = \substr($x, 0, 1);
$x = "abc";
say "<$$r>";
'
<a>
$ 5.14.2t/bin/perl -wE'
my $x;
my $r = \substr($x, 0, 1);
$x = "abc";
say "<$$r>";
'
Use of uninitialized value $x in substr at -e line 4.
<>
$ 5.22.0t/bin/perl -wE'
my $x;
my $r = \substr($x, 0, 1);
$x = "abc";
say "<$$r>";
'
<a>
Prefixing substr with scalar causes it to be called as an rvalue, although this is not documented.
$ perl -MO=Concise,-exec -e'1 for substr($_, 0, 1)' 2>&1 | grep substr
7 <@> substr[t4] sKM/3
                   ^
The M flag causes the special lvalue behaviour.
$ perl -MO=Concise,-exec -e'1 for scalar substr($_, 0, 1)' 2>&1 | grep substr
7 <@> substr[t2] sK/3
You can also force an rvalue by stringifying the result, for example by concatenating it with an empty string.
$ perl -MO=Concise,-exec -e'1 for "".substr($_, 0, 1)' 2>&1 | grep substr
8 <@> substr[t2] sK/3
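Applied to the code in the question, a minimal sketch of both workarounds might look like the following. It assumes the goal is simply to get an empty string back without the warning surfacing inside Encode.pm; the `// ''` default and the "UTF-8" encoding name are choices made for this illustration, not something the question requires.

```perl
use strict;
use warnings;
use Encode ();

sub test_encode {
    # Copy the argument into a lexical first, so that any substr magic
    # is resolved here and Encode::encode receives a plain scalar.
    my ($string) = @_;
    return Encode::encode("UTF-8", $string // '');
}

my $text;    # deliberately left undefined
my $r;
{
    # substr on an undefined value warns, so silence it where it happens.
    no warnings 'uninitialized';
    # scalar forces substr to be called as an rvalue.
    $r = test_encode(scalar substr($text, 0, 1));
}

print defined $r ? "defined, length " . length($r) . "\n" : "undef\n";
```

Either measure should be enough on its own; combining them just makes the intent explicit at both the call site and inside the sub.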
Interesting. If you do pretty much the same thing but:
my $a=undef;
my $b = substr($a,0,1);
my $r=test_encode($b);
it works fine.
Or:
my $r=test_encode(scalar substr($a,0,1));
So I'd say it has to be related to the value returned by substr and its context, e.g. @_[0] is not undefined, but @_ is.
The Encode module has:
#
# $Id: Encode.pm,v 2.75 2015/06/30 09:57:15 dankogai Exp $
#
package Encode;
use strict;
use warnings;
which is what applies to Encode's own code regardless of your no warnings directive, because warnings pragmas are lexically scoped. Hiding warnings like this is not very desirable anyway. It has been that way for a while:
2.18 2006/06/03 20:28:48
! bin/enc2xs
  overhauled the -C option
  - added ascii-ctrl', 'null', 'utf-8-strict' to core
  - auto-generated Encode::ConfigLocal no longer use v-string for version
  - now searches modules via File::Find so Encode/JP/Mobile is happy
! Byte/Byte.pm CN/CN.pm EBCDIC/EBCDIC.pm JP/JP.pm KR/KR.pm Symbol/Symbol.pm
  use strict added; though all they do is load XS, it's
  still better a practice
! *.pm
  use warnings added to all of them for better practices' sake.
So I would suggest using an older version of Encode with which this works.
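To illustrate the scoping point, here is a small sketch showing that a caller's no warnings does not silence code that was compiled under use warnings elsewhere. Noisy is a hypothetical stand-in package invented for this example, not part of Encode, and the $SIG{__WARN__} handler is only there to make the warning countable.

```perl
use strict;
use warnings;

my @seen;
# Collect warnings instead of printing them to STDERR.
local $SIG{__WARN__} = sub { push @seen, $_[0] };

{
    package Noisy;    # hypothetical stand-in for a module such as Encode
    use warnings;     # this pragma governs the code compiled in this block
    sub stringify {
        my ($v) = @_;
        return "<$v>";    # interpolating undef warns under "use warnings"
    }
}

{
    no warnings;      # affects only code compiled inside this block
    my $out = Noisy::stringify(undef);
}

print scalar(@seen), " warning(s) captured\n";
print $seen[0] if @seen;
```

The warning is raised by the op inside Noisy::stringify, so it is that block's pragma, not the caller's, that decides whether it is emitted.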