Sample solutions and discussion
Perl Quiz of The Week #11 (20030206)


        Question #1:

        Why does Perl have the 'defined' function?  If you want to see
        if a variable contains an undefined value, why not just use
        something like this?

                if ($var == undef) { ... }

'==' is for comparing numbers.  If its operands aren't numbers to
begin with, they are converted to numbers before being compared.  The
'undef' on the right is always converted to 0, so this test is the
same as comparing for numeric equality with 0.  In particular, the
test returns true when $var is 0, even though 0 is not an undefined
value.

The test also gives the wrong answer for many strings, because a
string that does not start with a number is also converted to 0:

        $var = "oops";
        if ($var == undef) { die }

This dies even though $var is certainly not undefined.
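
The reliable test is 'defined', which looks at the value itself and
does no numeric conversion.  Here is a minimal sketch contrasting the
two tests; the helper names 'is_undef_wrong' and 'is_undef_right' and
the sample values are made up for this illustration:

```perl
use strict;
use warnings;

# Broken test: '==' converts both operands to numbers first.
sub is_undef_wrong {
    my ($var) = @_;
    no warnings;                  # silence the warnings '== undef' provokes
    my $equal = ($var == undef);
    return $equal ? 1 : 0;
}

# Correct test: 'defined' asks whether the value is undef, nothing more.
sub is_undef_right {
    my ($var) = @_;
    return defined($var) ? 0 : 1;
}

my @samples = (0, "oops", "", 1, undef);
my @wrong   = map { is_undef_wrong($_) } @samples;
my @right   = map { is_undef_right($_) } @samples;

print "wrong: @wrong\n";   # wrong: 1 1 1 0 1
print "right: @right\n";   # right: 0 0 0 0 1
```

Only the last sample is really undefined, and only 'defined' says so.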

----------------------------------------------------------------

        Question #2:

        What's wrong with this code?

                %hash = ...;

                while (<STDIN>) {
                  chomp;
                  for my $key (keys %hash) {
                    if ($key eq $_) {
                      print "$key: $hash{$key}\n";
                    }
                  }
                }


The 'for' loop scans the hash looking for a particular key.  But the
whole point of a hash is that you *don't* have to scan it to find out
if it contains a certain key or not.  Hashes are organized so that
Perl can look up any given key instantly, without having to examine
each one.

The code here is analogous to searching the telephone book one name at
a time, starting from the first page, even though the telephone book
is carefully organized (in alphabetical order) so that you don't have
to do that.

A better way to write the code would be:


                %hash = ...;

                while (<STDIN>) {
                  chomp;
                  print "$_: $hash{$_}\n" if exists $hash{$_};
                }


This error is common in code written by beginning Perl programmers.
Here's some code that one of my interns once wrote:


        foreach $k (keys %in) {

        if ($k eq q1) {
                if ($in{$k} eq agree) {
                        $count{q10} = $count{q10} + 1;
                }
                if ($in{$k} eq disaagree) {
                        $count{q11} = $count{q11} + 1;
                }
        }
        if ($k eq q2) {
                @q2split = split(/\0/, $in{$k});
                foreach (@q2split) {
                        $count{$_} = $count{$_} + 1;
                }
        }
        if ($k eq q3) {
                $count{$in{$k}} = $count{$in{$k}} + 1;
        }
        if ($k eq q4a) {
                if ($in{$k} eq care) {
                        $count{q4a0} = $count{q4a0} + 1;
                }
                if ($in{$k} eq dontcare) {
                        $count{q4a1} = $count{q4a1} + 1;
                }
        }
        if ($k eq q4b) {
                if ($in{$k} eq wish) {
                        $count{q4b0} = $count{q4b0} + 1;
                }
                if ($in{$k} eq dontwish) {
                        $count{q4b1} = $count{q4b1} + 1;
                }
        }
        if ($k eq q5) {
                if ($in{$k} eq yes) {
                        $count{q50} = $count{q50} + 1;
                }
                if ($in{$k} eq "no") {
                        $count{q51} = $count{q51} + 1;
                }
        }
        if ($k eq q6) {
                if ($in{$k} eq yes) {
                        $count{q60} = $count{q60} + 1;
                }
                if ($in{$k} eq "no") {
                        $count{q61} = $count{q61} + 1;
                }
        }
        if ($k eq q7) {
                if ($in{$k} eq "accept") {
                        $count{q70} = $count{q70} + 1;
                }
                if ($in{$k} eq understand) {
                        $count{q71} = $count{q71} + 1;
                }
                if ($in{$k} eq other) {
                        $count{q72} = $count{q72} + 1;
                        $htmlout = comments;
                        open(COMMENTS, ">> /tmp/comments") || die "cant open comments";
                        print COMMENTS "$in{q7a}\n\n";
                        close (COMMENTS);
                }
        }
        if ($k eq q8) {
                if ($in{$k} eq yes) {
                        $count{q80} = $count{q80} + 1;
                }
                if ($in{$k} eq "no") {
                        $count{q81} = $count{q81} + 1;
                }
        }

        }  #end of foreach loop
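
For comparison, here is a rough sketch of how the same counting might
be done with direct lookups.  It is not the intern's code and it
covers only a few of the questions.  The sample contents of %in and
the name %counter_for are invented for the illustration, and it
assumes that 'disaagree' above is a typo for 'disagree':

```perl
use strict;
use warnings;

# Hypothetical form data standing in for the intern's %in.
my %in = (q1 => 'agree', q3 => 'blue', q5 => 'no');
my %count;

# Which counter each answer to each question increments.
my %counter_for = (
    q1  => { agree => 'q10',  disagree => 'q11'  },
    q4a => { care  => 'q4a0', dontcare => 'q4a1' },
    q5  => { yes   => 'q50',  no       => 'q51'  },
);

# Visit each question in the table once and look its answer up in %in
# directly; nothing is ever found by comparing keys one at a time.
for my $question (sort keys %counter_for) {
    next unless exists $in{$question};
    my $counter = $counter_for{$question}{ $in{$question} };
    $count{$counter}++ if defined $counter;
}

# q3 is counted under the answer itself, again by direct lookup.
$count{ $in{q3} }++ if exists $in{q3};

print "$_: $count{$_}\n" for sort keys %count;
```

Adding another question then means adding a line to the table rather
than another 'if' block.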


Larry Wall, the inventor of Perl, has said:

        Doing linear scans over an associative array is like trying to
        club someone to death with a loaded Uzi.


----------------------------------------------------------------

        Question #3:

        What's wrong with this code?

                @matching_words = grep search_for($_, $text_file), @words;

                sub search_for {
                  my ($target, $file) = @_;
                  return unless open F, "<", $file;
                  while (<F>) {
                    return 1 if index($_, $target) >= 1;
                  }
                  close F;
                  return;
                }


There are several things wrong with the code.  Probably the biggest
problem is that the search_for function inadvertently destroys the
contents of @words.

Inside a 'grep' loop or a 'foreach' loop with no control variable, the
$_ variable is 'aliased' to the elements of the array.  This means
that you can look at $_ to see the current array element, and also
that you can modify $_ to modify the current array element.  A simpler
example is:

        @n = (1,2,3);
        for (@n) {
          $_ = 'blah';
        }

        print "@n\n";

This prints "blah blah blah".

Since $_ is a global variable, the 'while (<F>)' loop inside the
'search_for' function, which implicitly assigns each line it reads to
$_, overwrites the aliased elements of @words.
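
The damage, and one possible repair, can be seen in a small sketch.
It makes some assumptions: the file is replaced by an in-memory
string so the example is self-contained, the names
'clobbering_search' and 'careful_search' are made up, and the match
test is written as '>= 0' because 'index' returns 0 for a match at
the very start of a line:

```perl
use strict;
use warnings;

# Hypothetical stand-in for the text file: an in-memory string.
my $text = "a carrot cake\nan apple pie\n";

# Reads into the global $_, as the quiz code does.
sub clobbering_search {
    my ($target) = @_;
    open my $fh, '<', \$text or return;
    while (<$fh>) {
        return 1 if index($_, $target) >= 0;
    }
    return;
}

# Reads into a lexical variable instead, leaving $_ alone.
sub careful_search {
    my ($target) = @_;
    open my $fh, '<', \$text or return;
    while (my $line = <$fh>) {
        return 1 if index($line, $target) >= 0;
    }
    return;
}

my @damaged = qw(carrot banana apple);
my @found_1 = grep { clobbering_search($_) } @damaged;

my @intact  = qw(carrot banana apple);
my @found_2 = grep { careful_search($_) } @intact;

print "intact: @intact\n";   # intact: carrot banana apple
print "found:  @found_2\n";  # found:  carrot apple
```

After the first 'grep', @damaged no longer holds the original words:
each element has been replaced by whatever was last read into $_.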

Other possible criticisms include:  (a) search_for performs a repeated
search that is probably wasteful; it would be better to convert it
into a hash lookup of some sort, if possible.  (b) Bareword
filehandles such as 'F' are global, so if the rest of the program
happens to have a filehandle named 'F' open, calling search_for will
close it.  For example, this doesn't work:

        open F, "myfile" or die ...;
        if (search_for("carrot", "otherfile")) { ... }
        my $next = <F>;

because F has been closed by 'search_for'.  This is a violation of
function encapsulation rules.  If the programmer who had F open before
is not the same as the one who wrote search_for, this is going to
create a bug that will be very difficult to track down.
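
One way to avoid the collision is a lexical filehandle, which is
private to the call that opens it.  The following is only a sketch:
'search_for_lexical' and 'make_file' are made-up names, the sample
files are temporary files created for the demonstration, and the
match test is written as '>= 0' so that a match at the start of a
line counts:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: write some lines to a temporary file.
sub make_file {
    my (@lines) = @_;
    my ($fh, $name) = tempfile(UNLINK => 1);
    print {$fh} @lines;
    close $fh or die "Cannot close $name: $!";
    return $name;
}

my $myfile    = make_file("first line\n", "second line\n");
my $otherfile = make_file("a carrot cake\n");

# The handle $fh exists only inside this call, so it cannot collide
# with any filehandle the caller has open.
sub search_for_lexical {
    my ($target, $file) = @_;
    open my $fh, '<', $file or return;
    while (my $line = <$fh>) {
        return 1 if index($line, $target) >= 0;
    }
    return;   # $fh is closed automatically when it goes out of scope
}

# The caller still uses the global handle F, as in the example above.
open F, '<', $myfile or die "Cannot open $myfile: $!";
my $first = <F>;
my $found = search_for_lexical("carrot", $otherfile);
my $next  = <F>;             # F is still open on $myfile
close F;

print $next;                 # second line
```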





