
Sample solutions and discussion
Perl Quiz of The Week #17 (20040526)

Marco Baringer said:

        When I was in elementary school I wasted many an hour playing
        hangman with my friends.

        The Game of Hangman
        --------------------

        The goal of the game is to guess a word with a certain
        (limited) number of guesses. If we fail the "man" gets
        "hanged," if we succeed he is set free. (We're not going to
        discuss the lesson's of life or justice this game teaches to
        the 8 year olds who play it regularly).

        The game starts out with one person (not the player) choosing
        a "mystery" word at random and telling the player how many
        letters the mystery word contains. The player then guesses
        letters, one at a time, and the mystery word's letters are
        filled in until a) the entire word is filled in, or b) the
        maximum number of guesses are reached and the the player loses
        (man is hanged).

        Write a perl program which lets the user play hangman. The
        program should take the following arguments:

                1) the dictionary file to use 
                2) the maximum number of guesses to give the player. 

        The program must then chose a mystery word from the dictionary
        file and print out as many underscores ("_") as there are
        letters in the mystery word. The program will then read
        letters from the user one at a time. After each guess the
        program must print the word with properly guessed letters
        filled in. If the word has been guessed (all the letters
        making up the word have been guessed) then the program must
        print "LIFE!" and exit. If the word is not guessed before the
        maximum number of guesses is reached then the program must
        print "DEATH!" and exit.

        Example interaction:

                % ./hangman /usr/share/dict 5
                        ___
                c
                        ___
                m
                        m__
                d
                        m__
                a
                        ma_
                n
                        LIFE!
                $ ./hangman /usr/share/dict 3
                        ___
                c
                        ___
                m
                        m__
                d
                        DEATH!
                %


        NOTES
        -----

        1) The dictionary file will contain one word per line and use
           only 7-bit ASCII characters. It may contain randomly
           generated words. The dictionary will contain only words
           longer than 1 character. The size of the dictionary may be
           very large.  See

                http://perl.plover.com/qotw/words/

           for sample word lists.

        2) The dictionary file used for the test (or the program for
           generating it) will be made available along with the
           write-up.

        3) If a letter appears more than once in the mystery word, all
           occurrences of that letter must be filled in. So, if the
           word is 'bokonon' and the player guesses 'o' the output
           must be '_o_o_o_'.

        -Marco


----------------------------------------------------------------


Marco seems to have disappeared (I hope you are not ill, Marco!) and I
have a little time before work this morning, so I thought I'd write up
a report.

It was pointed out that the original quiz specification contained an
error: it said "The program should take the following arguments
... the maximum number of guesses to give the player. "  It should
have said "... the maximum number of *incorect* guesses to allow the
player.".  This is how hangman is usually played.  Otherwise, if the
player is allowed only six guesses, and the word contains seven
letters, it is impossible for the player to win.  I think all the
solutions that appeared on the -discuss list corrected this.

(People also pointed out that, as posed, the program will not reveal
the mystery word to a losing player at the end of the game; this is
frustrating and unsportsmanlike.  Most (all?) of the solutions posted
on the -discuss list repaired this defect.)

At least ten hangman games were sent to the -discuss list, from the
following programmers:

        Roger Burton West
        Mark Jason Dominus
        Christian Duehl
        Shlomi Fish
        Tor Fuglerud
        James Edward Gray II
        David Jones
        Fred P.
        Kevin Pfeiffer
        Mike South

Each of these programs seemed at least a little bit odd to me, in
different ways.  I decided to use Pr. Jones's program as this week's
sample solution, partly because it was quite short, and mostly because
it was quite straightforward.  (Pr. Gray's was about the same length,
and was beautifully straightforward, except for one line in the middle
that gave me the willies.  But it's worth looking at.)

        use strict;
        use warnings;

        die "Usage: $0 dictionary_name number_of_guesses\n"
           if @ARGV < 2 or $ARGV[1] !~ /^\d+$/;

        my ( $dictionary, $countdown ) = @ARGV;
        my $word = get_word ( $dictionary );
        my $tried = ' ';

        while ( $countdown ) {
           ( my $grid = $word ) =~ s/[^$tried]/_/g;
           print "LIFE!\n" and exit if $grid eq $word;
           print "$grid\n";
           # print "Used so far:", sort split //, $tried;
           # print "\nGuesses left: $countdown\n";
           chomp ( my $guess = lc <STDIN> );
           next if $guess !~ /^[a-z]$/ or $tried =~ /$guess/;
           $tried .= $guess;
           $countdown-- unless $word =~/$guess/;
        }
        print "DEATH!\n";
        # print "($word)\n";

        sub get_word {
           open my $fh, '<', $_[0] or die "Can't open dictionary file: $!";
           my $choice;
           rand $. < 1 and chomp ( $choice = $_ ) while <$fh>;
           return $choice;
        }

0. I put most of the programs at

        http://perl.plover.com/qotw/misc/e017/

1. The 'get_word' function is worth study if you haven't seen
   something like it before.  It chooses each word from the dictionary
   with equal probability, without knowing in advance how big the
   dictionary is and without ever having more than two words in memory
   at once.  

   Note that something like this will not work properly:

        sub get_word {
           open my $fh, '<', $_[0] or die "Can't open dictionary file: $!";
           my $choice;
           while (<$fh>) {
             chomp;
             $choice = $_ if rand() < 0.5;
           }
           return $choice;
        }

   This has a serious problem: it very strongly favors the words at
   the end of the dictionary file. More than 90% of the time, it emits
   one of the last four words from the file.  

   It's easy to select each word with equal probability if you are
   allowed to read the entire dictionary into memory first:

        chomp(@words = <$fh>);
        return $words[int rand @words];

   but the quiz said "The size of the dictionary may be very large,",
   which most people took to mean that it might not be feasible to
   read the entire dictionary into memory at once.  

   It's also easy to select each word with equal probability if you
   are allowed to make two passes over the dictionary:

        my $n = 0;
        $n++ while <$fh>;
        my $i = int(rand $n);
        seek $fh, 0, 0;
        <$fh> while $i--;
        return scalar(<$fh>);

   but Pr. Jones's code only reads the dictionary once.

   If you have not seen this technique before, you might like to try
   to come up with it yourself before looking at the solution.  It's
   one of those things that can seem impossible at first until someone
   tells you that it is possible, and then it is not so hard to find
   the answer.

   The technique is explained on page 281 of "Perl Cookbook", if you
   are stumped; I am pretty sure it also appears in Volume II of "The
   Art of Computer Programming".

2. The example program tracks a string, '$tried', which contains all
   the letters that the player has guessed so far; it is initialized
   to contain a space, to avoid an abberant edge case.  The program's
   main loop generates '$grid', which is the display shown to the
   player, by copying it from '$word', the mystery word, and then
   replacing all the letters that are not in '$tried' with
   underscores.  When the player makes a guess, the guess is appended
   to $tried, and the number of guesses remaining is decremented if
   the guess is not present in the secret word.

3. Pr. Burton West's program has an interesting innovation that I
   don't remember having seen before. %mask initially maps every
   letter of the alphabet to '_', and $mask{$letter} is set to $letter
   if the player guesses $letter.  

   @letters is the letters of the mystery word, one letter per
   element.  The program generates the displayed partially-guessed
   word with

        join('',@mask{@letters});

   The hash slice is what interested me here.  The mapping expressed
   by @mask is actually a function of the letter itself, not of the
   letter's position in the word; otherwise this wouldn't work.

4. Pr. Pfeiffer's program is in a rather different, more verbose style
   than the ones I mentioned above, but it is transparently clear.

5. The real fun in hangman, at least for me, is actually drawing in
   the little guy on the scaffold.  I wanted to supply a version of
   the program that would do this, but I did something else instead.
   Fortunately for me, Mike South came to the rescue with a
   "Hangman::Victim" package and a subclass with an alternative
   graphic, inappropriately named "Hangwoman::Victim".  (Inappropriate
   only because "hangman" refers to the person performing the
   execution, not to the victim.  Plover Systems Co. is an
   equal-opportunity employer of both executioners and condemned
   persons.)

6. Pr. Duehl's and Pr. Fugelrud's programs appeared to have been
   designed to be difficult to read, so I didn't bother to read them.

7. A few people decided that the hangman program itself was not
   interesting enough, and instead wrote programs that would play the
   other side, supplying guesses and trying to guess someone else's
   secret word.  I mentioned that I had done something similar many a
   while back, but found that the problem of adopting an optimal
   strategy is much more complex than it appears at first.  There is
   an interesting tradeoff that may occur between making a guess that
   is likely to yield the most information and a guess that is likely
   to be correct.

   Pr. Isaacson did not post his code, but did post some tantalizing
   results:

        http://perl.plover.com/~alias/list.cgi?1:mss:1653

   Both of the posted hangman players use the strategy of eliminating
   all the impossible words from the dictionary on each turn, and then
   guessing the letter that appears most often in the remaining
   words.  It occurs to me that a slightly better strategy might be to
   guess the letter that appears in the greatest number of remaining
   words, instead.  (That is, if the pattern is "ki__", and the
   remaining legal words are

         kill kell kind kine king kiss kite kiva kivu

   then you should prefer 'n' to 'l', even though there are four 'l's
   and only three 'n's.

8. To assist these people, Randy Sims posted a program to analyze
   patterns in a dictionary and generate a database:

        http://perl.plover.com/~alias/list.cgi?1:mss:1682

9. My own contribution was a little different.  It is within the spec
   that was posted, but may be rather more difficult to beat than some
   of the other programs that were supplied.  That is because it
   cheats.   (Undetectably, of course.  It's easy to cheat by making
   your secret word "kwyjibo" or "pyrzqxgl" or some such. but such
   cheating is beneath even me.)

   Determining a good cheating strategy turns out to be quite
   difficult, for the same reasons that a good hangman-playing
   strategy is difficult.  There is an interesting tradeoff between
   letting the hangman player guess right, and keeping the pool of
   possible words as large as possible.  I don't think I found a good
   quilibrium for this tradeoff, and I'd welcome discussion about this.

Thanks to everyone who posted to the discussion list, and particularly
to the folks who wrote programs and *didn't* post to the discussion
list.   

A new quiz will be along tomorrow, and I am particularly excited about it.

[ADMIN]: As you know, quizzes are now being supplied by volunteers.
         The time-consuming part of this is writing up the report
         afterwards.  I could easily supply the quizzes myself; what I
         can't do is write up the reports.

         If you wanted to volunteer, but couldn't think of a quiz
         question, let me know, and we can talk it over.  Conversely,
         if you want to write up a report about a quiz that has
         already come out, even if you didn't supply the question,
         please go ahead, and send the report to perl-qotw-submit.

         Thanks.

-----------------------------------------------------------------


Sample solutions and discussion
Perl Quiz of The Week #17 (20040526)

[ This arrived from Marco just a few minutes after I sent out my
  replacement report.  My apologies to Marco for pre-empting him, and
  to anyone on the list who didn't want to receive two reports. -- MJD. 
]




[sorry for the delay]

Quiz Question:

        http://perl.plover.com/~alias/list.cgi?mss:72

Posted Solutions:

- Tor Fuglerud wins the "we don't like no stinckin' whitespace!" prize
  with http://perl.plover.com/~alias/list.cgi?1:mss:1656

- Christian Duhl wins the "pretty source code" prize with
  http://perl.plover.com/~alias/list.cgi?1:mss:1660

- Shlomi Fish wins the "yes, redo is actually usefull for something"
  prize with
  http://perl.plover.com/~alias/list.cgi?1:mss:1663

- Mike South wins the "politcally correct" prize with his
  hang(man|woman)
  solution. http://perl.plover.com/~alias/list.cgi?1:mss:1670 

- Mark Jason Dominus wins the "Bad MotherFucker" prize with his less
  than honest solution
  http://perl.plover.com/~alias/list.cgi?1:mss:1675

- Fred P. wins the "most polished program" prize with
  http://perl.plover.com/~alias/list.cgi?1:mss:1679

- Et al.: do not take offence if you're not in this list, your
  solution, while valid, simply didn't make me smile or pause.

Comments:

The core algorithm was, obviously, very similar among the various
solutions. A set of guessed letters was kept (either in a string or an
array or hash) along with a set of letters-to-guess.

During the discussions regarding this qotw a few people talked about
writing hangman playerssome pretty interesting discussions came out of
this. See the archives for all the gory details.

David Jones posted a player 
http://perl.plover.com/~alias/list.cgi?1:mss:1666

Randy W. Sims wrote a text pattern analyzer, see
http://perl.plover.com/~alias/list.cgi?1:mss:1682

Sample Solution:

This, rather simple, solution uses two data structures, an array of
the letters in the mystery word and a set (implemented as a hash
table) of the letters guessed so far. It simply loops until we either
guess all the letters in the mystery word or we run out of guesses,
updating the set of letters guessed each time.


--=-=-=
Content-Type: application/octet-stream
Content-Disposition: attachment; filename=simple

#!/usr/bin/perl
use strict;
use warnings;

my ($dict, $num_guesses) = @ARGV;

# Slurp the entire dictionary into memory. I know, I know, I
# explicitly said in the spec that this would be a bad idea. It's just
# that I don't often have access to a machine with 2 GB of RAM and I
# wanted to take advantage of it.
open DICT, "<$dict" or die "Can't open $dict: $!";
my @words = <DICT>;
close DICT;

my $word_string = $words[rand($#words)];

chomp $word_string;
# The array of all the letters in the mystery word.
my @word = split //, $word_string;

# The set of characters guessed so far
my %guesses;

# Register a new guess by the player.
sub collect_guess {
    my $guess = shift;
    if (grep { $_ eq $guess } @word) {
        # If they chose a letter in the word get another guess.
        $num_guesses++
    }
    $guesses{$guess} = 1;
}

# Returns a sorted list af all letters guessed so far
sub guesses_so_far {
    sort { $a cmp $b } keys %guesses;
}

# Returns, as a string, the mystery word with unguessed chars
# substituted with '_'
sub word_so_far {
    join '', map { $guesses{$_} ? $_ : "_" } @word;
}

print word_so_far(), "\n";

for (my $i = 0; $i < $num_guesses; $i++) {
    my $guess = <STDIN>;
    chomp $guess;
    collect_guess $guess;

    if (word_so_far() !~ /_/) {
        # no more letters to guess
        print "LIFE!\n";
        exit;
    }

    print word_so_far(), " ", guesses_so_far(), "\n";
}

# If we ever get here the player ran out of guesses.
print "DEATH!\n";

--=-=-=


Hope you had fun, I did.
-- 
-Marco
Ring the bells that still can ring.
Forget your perfect offering.
There is a crack in everything.
That's how the light gets in.
     -Leonard Cohen

--=-=-=--
