Tuesday, November 18, 2008

perl: default input and pattern-searching space

Perl has plenty of special variables.
The most commonly used is probably $_.
$_ is the default input and pattern-searching space.
$_ is implicitly assigned when a line is read from an input stream in a while condition, and it is the default operand for many built-in functions and for pattern matching (when used without the =~ operator).
$_ is also the default iterator variable in a foreach loop if no other variable is supplied.
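As a small illustration of the foreach and pattern-matching defaults, here is a minimal sketch. The subroutine name shout_all is made up for this example; it relies only on the documented behaviour that a bare regex match and uc without arguments operate on $_.

```perl
use strict;
use warnings;

# shout_all is a hypothetical helper, not part of any library.
# No loop variable is named, so foreach aliases each element to $_.
sub shout_all {
    my @words = @_;
    my @out;
    foreach (@words) {
        next unless /\S/;   # bare match, same as $_ =~ /\S/
        push @out, uc;      # uc with no argument works on $_
    }
    return @out;
}

print join(",", shout_all("a", "", "b")), "\n";   # prints A,B
```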
The following block

while (<STDIN>)
{
    s/[A-Z]*//g;
    print;
}
is equivalent to
while ($_ = <STDIN>)
{
    $_ =~ s/[A-Z]*//g;
    print $_;
}
$_ is a global variable, so this can produce unwanted side effects. The output of the following code
while (<STDIN>)
{
    print;
    last;
}
print;
{
    print;
    while (<STDIN>)
    {  
        s/[A-Z]*//g;
        print;
        last;
    }  
    print;
}
print;
should be
abcABC<<-- my input string
abcABC
abcABC
abcABC
abcABC<<-- my input string
abc
abc
abc
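The same side effect is easier to miss when the read loop is hidden inside a subroutine. The sketch below is only an illustration: count_lines is a hypothetical helper, and it reads from an in-memory filehandle instead of STDIN so that it can be run without typing any input.

```perl
use strict;
use warnings;

# count_lines is a hypothetical helper that counts the lines in a string.
# Its while (<$fh>) loop assigns every line to the global $_.
sub count_lines {
    my ($text) = @_;
    open my $fh, '<', \$text or die "cannot open in-memory handle: $!";
    my $n = 0;
    while (<$fh>) { $n++ }   # at end of file $_ is left undefined
    close $fh;
    return $n;
}

$_ = "important value";
my $lines = count_lines("one\ntwo\n");
print "lines: $lines\n";
print defined($_) ? "\$_ is still set\n" : "\$_ was clobbered\n";
```

The caller never mentions $_ in the call, yet its value is gone afterwards, because the subroutine and the caller share the same global variable.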
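It is possible to declare $_ with my so that it is lexically scoped to the enclosing block (in perl 5.9.1 and later), and a subsequent our $_ restores access to the global $_ within that scope. Note that lexical $_ was later marked experimental in perl 5.18 and removed in perl 5.24, so the my $_ examples below only run on the versions in between.
The output of this code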
while (<STDIN>)
{
    print;
    last;
}
print;
{
    print;
    my $_;
    while (<STDIN>)
    {  
        s/[A-Z]*//g;
        print;
        last;
    }  
    print;
}
print;
should be
abcABC<<-- my input string
abcABC
abcABC
abcABC
abcABC<<-- my input string
abc
abc
abcABC
and with our
while (<STDIN>)
{
    print;
    last;
}
print;
{
    print;
    my $_;
    while (<STDIN>)
    {  
        s/[A-Z]*//g;
        print;
        last;
    }  
    our $_;
    print;
}
print;
should be
abcABC<<-- my input string
abcABC
abcABC
abcABC
abcABC<<-- my input string
abc
abcABC
abcABC
Unfortunately, perl 5.10 is not yet the default in most Linux distributions, so some workarounds are needed to get the functionality of my and our with $_ on older versions.
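Two workarounds that should behave the same on older perls are sketched below. They are offered as assumptions about what such a workaround could look like, not as the only options: local $_ saves the global value and restores it when the enclosing scope exits, and a named lexical variable avoids $_ altogether. The subroutine names are made up for the example, and input comes from an in-memory filehandle (perl 5.8 or later) instead of STDIN.

```perl
use strict;
use warnings;

# Hypothetical helper: strip upper-case letters, protecting the caller's $_.
sub strip_upper_local {
    my ($text) = @_;
    open my $fh, '<', \$text or die "cannot open in-memory handle: $!";
    local $_;                # global $_ is restored when the sub returns
    my $out = '';
    while (<$fh>) {
        s/[A-Z]+//g;
        $out .= $_;
    }
    close $fh;
    return $out;
}

# Hypothetical helper: same job with a named lexical, never touching $_.
sub strip_upper_named {
    my ($text) = @_;
    open my $fh, '<', \$text or die "cannot open in-memory handle: $!";
    my $out = '';
    while (my $line = <$fh>) {
        $line =~ s/[A-Z]+//g;
        $out .= $line;
    }
    close $fh;
    return $out;
}

$_ = "abcABC\n";
print strip_upper_local("xyzXYZ\n");   # prints xyz
print strip_upper_named("xyzXYZ\n");   # prints xyz
print;                                 # prints abcABC, $_ is untouched
```

The named-variable form is usually the easier one to reason about, since nothing in the loop depends on global state.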
