Sample solutions and discussion Perl Expert Quiz of The Week #2 (20021023) The local high school baseball team, the Randal Schwartz High School Phoenixes, will be playing a series of games against their rivals, the Richard M. Nixon Memorial High School Growlin' Fungus. The series lasts at most five games, and ends when one team has won three games. You want to bet $80 on the Phoenixes to win the series, but your bookie will only take bets on individual games. (The bookie pays even money on all bets.) A mathematically-inclined friend solves the problem for you, giving you the following instructions: Bet $30 on each of the first two games. Bet $20 on the third game if either team has won both of the first two games, $40 otherwise. Bet $40 on the fourth game, if there is one. Bet $80 on the fifth game, if there is one. At the end of the series, you will be ahead by exactly $80 if the Phoenixes have won the series, and behind by exactly $80 if the Growlin' Fungus have won. We could summarize the instructions in a table like this: If the game score is: 0 to 0, bet $30 1 to 0, bet 30 1 to 1, bet 40 2 to 0, bet 20 2 to 1, bet 40 2 to 2, bet 80 Write a function which calculates the appropriate bet for any such contest, given the total amount you want to risk on the series, the length of the series, and the current game score. For example, bet(80, 5, 2, 1) should return 40, because if you want to risk $80 on a 5-game series, and one team is presently ahead 2 games to 1, you should bet $40. Similarly bet(1000, 7, 2, 1) should return 375. (That is, if you're trying to bet $1000 on the current World Series baseball match, you need to bet $375 on the outcome of tonight's game. If you started with $1000, and followed this function's advice, you'd have $625 left if you had been betting on the Giants and $1375 if you had been betting on the Angels. For people living in places where the World Series is irrelevant: the match is a best-four-of-seven series of games; at present, the Anaheim Angels are beating the San Francisco Giants two games to one, with the fourth game scheduled for tonight.) There are two challenges here: first, how can you figure out the right bet at all, and second, what's a good way to get the computer to calculate it? It seems that some folks made a table of the bets for the Phoenixes - Fungus series: Phoenix Wins: 0 1 2 Fungus 0 30 30 20 Wins 1 30 40 40 2 20 40 80 And these clever folks noted that each value was the average of values just below it and just to the right of it. In fact, the pattern extends further, since you stop betting once one team has 3 wins: Phoenix Wins: 0 1 2 3 Fungus 0 30 30 20 0 Wins 1 30 40 40 0 2 20 40 80 0 3 0 0 0 In fact, this observation is correct in all cases. But suppose you didn't have this fortunate inspiration? How might you solve the problem? Here's how I solved it. First, I considered the case where each team had won two games; it is the final game of the series. If the Phoenixes win this game, I should have a total of $160 in my pocket; if the Fungus wins, I should have $0. How much must I have before the game starts? Clearly the only possible answer is that I must have $80, and I must bet it all. If I have any amount other than $80, then no bet can leave me with the right amounts in both cases. For example, if I started with $60, I'd have to bet $100 in order to have the right amount if the Phoenixes won; but then I'd have -$40 if they lost, which is wrong. Now suppose the Fungus have won 1 game and the Phoenixes have won 2. How much money must I have? If the Phoenixes win another game, they win the series and I want to have $160. If the Fungus wins another game, the series will be tied 2-2 and I should have exactly $80 left, by the previous paragraph. Clearly, the only way this can happen is if I start with $120 and bet $40 of it. Similarly I can work out the amount of money I must have at each stage. Let's let [a, b] represent the situation in which the Phoenixes have won a games and the Fungus have won b games, and let's say that the amount of money I must have in situation [a, b] is the 'value' of [a, b]. Thus the value of [3, 0] is $160; the value of [0, 3] is $0, and the value of [2, 2] is $80. To find out the value [a, b], I just average the values of [a+1, b] and [a, b+1]. The value of [a, b] must be exactly halfway between the value of [a+1, b] and the value of [a, b+1], because if I bet $X, then value([a+1, b ]) = value([a, b]) + $X value([a , b+1]) = value([a, b]) - $X so therefore, by simple algebra: value([a+1, b ]) + value([a , b+1]) = 2 * value([a, b]) and the bet, $X, is value([a+1, b]) - value([a, b]). Here's the first code I wrote to apply these ideas: sub bet { my ($amount, $length, $a, $b) = @_; return value($amount, $length, $a+1, $b) - value($amount, $length, $a, $b); } sub value { my ($amount, $length, $a, $b) = @_; return $amount*2 if $a > $length / 2; # Phoenixes win return 0 if $b > $length / 2; # Fungus win return (value($amount, $length, $a+1, $b) + value($amount, $length, $a , $b+1))/2; } The 'value' function computes the value of a particular situation; 'bet' computes the amount I should bet. 1. With the algebraic observations above, one can find a simpler solution: bet([a, b]) = value([a+1, b]) - value([a, b]) = (value([a+2, b ]) + value([a+1, b+1]))/2 - (value([a+1, b ]) + value([a , b+1]))/2 = ((value([a+2, b ]) + value([a+1, b ])) - (value([a+1, b+1]) + value([a , b+1])))/2 = (bet([a+1, b]) + bet([a, b+1]))/2 which is what the clever folks I mentioned earlier were able to obtain through inspiration. The resulting simpler code is: sub bet { my ($amount, $length, $a, $b) = @_; return 0 if $a > $length/2 || $b > $length/2; # series over return $amount if $a + $b == $length - 1; # series tied at final game return (bet($amount, $length, $a+1, $b) + bet($amount, $length, $a, $b+1))/2; } 2. Although the code here is simple and straightforward, and works well for the examples, it fails badly for longer tournaments. As Piers Cawley pointed out, if you try to use it to compute bets on the outcome of the final round of the World Snooker Tournament, in which Peter Ebdon beat Stephen Hendry in a best-18-of-35-frames struggle, it takes a long, long time. I estimate about 111 hours on my computer. The problem is too much redundant recursion, and as usual, an easy solution is to memoize it: use Memoize; memoize 'bet'; sub bet { # ... as before ... } It now quickly computes that your bet in the first frame should be 13 pounds 58 pence if you want to risk 100 pounds on the outcome of the entire round. 3. The memoization approach happily computes the entire bet table and stores it internally, but a number of people preferred to compute it manually. Here's Michael Carman's solution: { my %tbl; sub bet { my ($amount, $games, $won, $lost) = @_; my $s = int($games / 2); unless (exists($tbl{$s})) { my @t; foreach my $i (reverse (0 .. $s)) { foreach my $j (reverse (0 .. $s)) { if ($i == $s && $j == $s) { $t[$i][$j] = 1; } elsif ($i == $s) { $t[$i][$j] = $t[$i][$j+1] / 2; } elsif ($j == $s) { $t[$i][$j] = $t[$i+1][$j] / 2; } else { $t[$i][$j] = ($t[$i+1][$j] + $t[$i][$j+1]) / 2; } } } $tbl{$s} = \@t; } return $amount * $tbl{$s}->[$won][$lost]; } } Michael realized that you don't have to compute a different table for every possible bet amount; it's enough to compute the betting tables for a $1 total risk, and then if you want to risk $X instead, just multiply all the bets by X. Applying the same improvement to the 'Memoize' solution yields: use Memoize; memoize 'bet1'; sub bet { my $total_risk = shift; $total_risk * bet1(@_); } sub bet1 { my ($length, $a, $b) = @_; return 0 if $a > $length/2 || $b > $length/2; # series over return $amount if $a + $b == $length - 1; # series tied at final game return (bet1($length, $a+1, $b) + bet1($length, $a, $b+1))/2; } 4. John Macdonald then observed that you don't need to compute a different betting table for each different length series, as Michael was doing, since: If you index the table with the number of games each team needs to win, rather than the number they have won so far, one table works for all situations. (For example, a one-game "series" has the same characteristics as the final game of any longer series that goes to the limit. Similarly, if the score is 2 games to 1 in a 7 game series, it is at exactly the same position as a 11 game series that has reached 4 games to 3.) John provided a 'needwinner' function that computed the table out to a specified number of games; then his 'bet' function was simple: sub bet { my( $amt, $games, $schwartz, $nixon ) = @_; my $winner = int( ($games+1) / 2 ); die "invalid args" unless $schwartz < $winner && $nixon < $winner; needwinner( $winner ); return $amt * $bet[$winner-$schwartz][$winner-$nixon]; } 5. While this was going on, John also noticed that the bet amounts obey a very simple mathematical formula. Supposing that the total risk is $1, the bet table is computed by putting $1 in the upper left corner, and then each entry is half the sum of the two entries just above it and jut to the left of it. Except for the 'half' part, this is exactly a recipe for computing "Pascal's Triangle", a well-known mathematical object. The 'half' part just as the effect of multiplying the Pascal's Triangle entries by 2**(-n-1), where n is the number of games left to play in the series. Pascal's Triangle looks like this: 1 1 1 1 1 ... 1 2 3 4 ... 1 3 6 ... 1 4 ... 1 ... ... The bets for the Phoenixes-Fungus series look like this: 1 1/2 1/4 1/2 1/2 3/8 1/4 3/8 3/8 The resemblance isn't obvious unless you rewrite the bets a little bit: 1/1 1/2 1/4 1/2 2/4 3/8 1/4 3/8 6/16 The numerators are the corresponding numbers from Pascal's Triangle; the denominators are successive powers of 2. The powers of 2 are of course trivial, and the entries in Pascal's Triangle are also easy to compute: the value with n+1 games left to play where the Phoenixes need to win k+1 more games is called choose(n, k), and it's equal to n! / (k! * (n-k)!) where ! represents the factorial function. John's code based on this observation is simple and efficient, even without memoization: sub bet { my ($amount, $games, $won, $lost) = @_; my $s = int($games / 2); my $n = 2 * $s - $won - $lost; my $k = $s - $won; return $amount * n_choose_k($n, $k) / 2 ** $n; } sub n_choose_k { my ($n, $k) = @_; return factorial($n) / (factorial($k) * factorial($n - $k)); } sub factorial { return 1 if $_[0] <= 1; return $_[0] * factorial($_[0] - 1); } 6. One problem with John's implementation is that the n! / (k! * (n-k)!) formula is not a very good way to compute entries from Pascal's Triangle. The reason is that factorials are very large, while the Triangle entries are not so large; when you compute n! / (k! * (n-k)!) you're computing a very large number divided by another very large number, and the computer tends to lose precision. Even though choose(52, 26) fits in a 32-bit integer, the intermediate values of 52! and 26! are way too big. Should you ever need to compute entries from Pascal's Triangle, here's the way to do it: sub choose { my ($n, $k) = @_; my $t = 1; for my $i (1 .. $k) { $t *= $n - $k + $i; $t /= $i; } $t; } $t here is always an integer, and never gets bigger than necessary. John later posted a solution that fixed this problem; without this fix, the program produces slightly inaccurate results for series of 33 games and longer. 7. While I was working on this earlier this month, I realized that you can use the same techniques to manufacture unusual bets. For example, suppose you'd like to bet $1000 that the World Series lasts a full seven games. But your bookie won't take such a bet. No problem! VALUE Angels |Giants wins Wins |0 1 2 3 | 4 +--------------------------------+---- 0 |1000 1000 800 400 | 0 | | 1 |1000 1200 1200 800 | 0 | | 2 | 800 1200 1600 1600 | 0 | | 3 | 400 800 1600 3200 |3200 --+--------------------------------+---- 4 | 0 0 0 3200 | As before, the bet in situation [a, b] is just value([a+1, b]) - value([a, b]). So in the first game, you bet nothing; in the second game you must bet $200 on the team that lost the first game; in the third game you bet $400 on the team that lost the first two games, or, if each team lost one game, you bet nothing; and so on. Note that you must start with $1000, and you risk losing it all against the chance of winning $3200; this means that the odds of the World Series lasting seven games are 3200:1000, which is a probability of 23.8%. (Assuming that the teams are equally matched.) Similarly, if you'd like to manufacture a betting schedule that pays off $37 if your team wins in exactly six games, pays of $19 if your team wins in 4, 5, or 7 games, and loses $23 if your team loses, you can do that. 8. The problem was originally posed to me by someone who expects to go work for Goldman-Sachs, and investment bank. There is a very serious application of this problem. Suppose you want to risk a total of $1,000,000 in the stock market over the next year; you want to bet that IBM stock goes up. Unfortunately, you can't make such a bet. What you can do is buy and sell IBM stock each day for the next year. If you buy more stock, you're increasing your bet on IBM to win; if you sell stock, you are reducing your bet. When the stock goes up, that's like your team winning, because your stock is now more valuable; when the stock goes down, your team lost and you lose the bet. By adjusting your holdings appropriately, you can arrange whatever balance of payoffs you like. You can construct a buy-or-sell schedule that pays off if, and only if, IBM stock goes down between 5 and 10 percent. Such a schedule will have at least a few hundred elements, and perhaps thousands, since you can fine-tune it to tell you what to do every hour. So Piers's comment about the 35-frame World Snooker Championship finals isn't so far-fetched! Thanks again to all the subscribers, and to those who participated in the discussion. I will send another quiz on Wednesday.